This is an archive of the discontinued LLVM Phabricator instance.

CodeGen: Remove AliasAnalysis from regalloc
ClosedPublic

Authored by arsenm on Jun 25 2022, 6:22 AM.

Details

Reviewers

jdoerfert
qcolombet
MatzeB
paquette
aemerson
efriedma
asbirlea
craig.topper

Summary

This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.
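
The scheme described above can be sketched in a small standalone model. This is not the LLVM code: `MemOperand`, `makeLoadOperand`, and the free function `isDereferenceableInvariantLoad` are hypothetical names that only loosely mirror `MachineMemOperand` and the `MachineInstr` query. The point it shows is that the alias-analysis result is folded into flags once, when the memory operand is built, and the later rematerialization check reads only those flags.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the memory operand flags.
enum MemFlags : uint8_t {
  MONone = 0,
  MOLoad = 1 << 0,
  MOStore = 1 << 1,
  MODereferenceable = 1 << 2,
  MOInvariant = 1 << 3,
};

// Hypothetical stand-in for a memory operand attached to an instruction.
struct MemOperand {
  uint8_t Flags = MONone;
};

// Built once during instruction selection. PointsToConstantMemory stands in
// for the result of the alias-analysis query that is now made up front.
MemOperand makeLoadOperand(bool PointsToConstantMemory,
                           bool KnownDereferenceable, bool KnownInvariant) {
  uint8_t Flags = MOLoad;
  if (KnownDereferenceable)
    Flags |= MODereferenceable;
  if (KnownInvariant)
    Flags |= MOInvariant;
  // Mirrors the behavior the patch preserves for now: constant memory is
  // treated as both invariant and dereferenceable.
  if (PointsToConstantMemory)
    Flags |= MODereferenceable | MOInvariant;
  return MemOperand{Flags};
}

// Flag-only query: no alias analysis is consulted here. Every memory operand
// must be a non-store access that is both dereferenceable and invariant.
bool isDereferenceableInvariantLoad(const std::vector<MemOperand> &MMOs) {
  if (MMOs.empty())
    return false; // Nothing is known without memory operands.
  for (const MemOperand &MMO : MMOs) {
    if (MMO.Flags & MOStore)
      return false;
    if (!(MMO.Flags & MODereferenceable) || !(MMO.Flags & MOInvariant))
      return false;
  }
  return true;
}

// Small usage demonstration.
inline void demoFlagQuery() {
  std::vector<MemOperand> ConstantLoad{makeLoadOperand(true, false, false)};
  std::vector<MemOperand> PlainLoad{makeLoadOperand(false, true, false)};
  assert(isDereferenceableInvariantLoad(ConstantLoad));
  assert(!isDereferenceableInvariantLoad(PlainLoad));
}
```

In this model the cost noted in the summary is visible: the constant-memory query runs for every load operand that is built, whether or not anything later asks about rematerialization.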

Diff Detail

Event Timeline

arsenm created this revision. Jun 25 2022, 6:22 AM

Herald added a project: Restricted Project. Jun 25 2022, 6:22 AM

Herald added subscribers: nlopes, mtrofin, jsji and 18 others.

arsenm requested review of this revision. Jun 25 2022, 6:22 AM

Herald added a project: Restricted Project. Jun 25 2022, 6:22 AM

Herald added subscribers: aheejin, wdng.

Harbormaster completed remote builds in B172021: Diff 439983. Jun 25 2022, 6:23 AM

mtrofin added inline comments. Jun 25 2022, 7:13 AM

llvm/lib/CodeGen/MLRegallocEvictAdvisor.cpp
888	I think this isn't needed here.
892	this can be removed, since calculateRegAllocScore now doesn't need the AA

Remove another leftover use

Harbormaster completed remote builds in B172178: Diff 440198. Jun 27 2022, 6:29 AM

lkail added a subscriber: lkail. Jun 27 2022, 7:07 AM

I always found it very unfortunate that we keep IR references around in MIR to do alias analysis queries. This appears to remove a lot of those users (all but the ones in the schedule graph construction?), so I highly welcome the change!

Do you think that there is a risk of target-specific ISel or optimization not setting the flags properly in this new scheme? Though even if there is a risk, I'd rather push this forward; this is too nice a cleanup to block ;-)

Thanks, LGTM

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
1616–1633	Nit: I like initializing things in every branch without a default, so you get warnings if you forget to set the value in one of the branches...
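
A minimal sketch of the style this nit suggests, using hypothetical names (`AccessKind`, `flagsForAccess`, and the flag constants are illustrative, not taken from the patch). The variable is declared without a default and assigned in every branch, so a branch that forgets the assignment can be reported by Clang's `-Wsometimes-uninitialized` or GCC's `-Wmaybe-uninitialized`.

```cpp
#include <cassert>

enum class AccessKind { Load, Store, Other };

// Hypothetical flag values, for illustration only.
constexpr unsigned FlagLoad = 1u << 0;
constexpr unsigned FlagStore = 1u << 1;

unsigned flagsForAccess(AccessKind Kind) {
  unsigned Flags; // Intentionally no default: each branch must assign it.
  if (Kind == AccessKind::Load)
    Flags = FlagLoad;
  else if (Kind == AccessKind::Store)
    Flags = FlagStore;
  else
    Flags = 0;
  return Flags;
}

// Small usage demonstration.
inline void demoFlagsForAccess() {
  assert(flagsForAccess(AccessKind::Load) == FlagLoad);
  assert(flagsForAccess(AccessKind::Other) == 0u);
}
```

Had `Flags` been initialized to `0` at its declaration, dropping the assignment in one branch would compile silently and fall back to the default.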

This revision is now accepted and ready to land. Jun 27 2022, 10:38 AM

Maybe we should leave this diff open for a day or two to wait for extra feedback?

In D128583#3612915, @MatzeB wrote:

Do you think that there is a risk of target-specific ISel or optimization not setting the flags properly in this new scheme? Though even if there is a risk, I'd rather push this forward; this is too nice a cleanup to block ;-)

Yes, this could potentially regress rematerializing instructions from target load intrinsics. We would have to pass AA through getTgtMemIntrinsic to allow targets to set the flag themselves.

Yes, this could potentially regress rematerializing instructions from target load intrinsics. We would have to pass AA through getTgtMemIntrinsic to allow targets to set the flag themselves.

Ah good point. I think we should at least have an issue filed about that API then.

The general approach here seems fine. I suspect we end up calling pointsToConstantMemory for most memory operands at some point anyway.

In general pointsToConstantMemory doesn't imply the pointer is dereferenceable. But in practice, the cases where that happens are probably pretty obscure (like out-of-bounds references to constant arrays).

I always found it very unfortunate that we keep IR references around in MIR to do alias analysis queries.

I don't really see a good alternative. Assuming we don't just throw away all alias analysis information, I'm not sure how we'd represent it. You could try to explicitly construct alias sets during isel, I guess.

In D128583#3613323, @efriedma wrote:

The general approach here seems fine. I suspect we end up calling pointsToConstantMemory for most memory operands at some point anyway.

In general pointsToConstantMemory doesn't imply the pointer is dereferenceable. But in practice, the cases where that happens are probably pretty obscure (like out-of-bounds references to constant arrays).

Do you think this is worth fixing in a follow-on patch? It seems the API requires two separate queries that mostly do the same thing.

In D128583#3616215, @arsenm wrote:

In general pointsToConstantMemory doesn't imply the pointer is dereferenceable. But in practice, the cases where that happens are probably pretty obscure (like out-of-bounds references to constant arrays).

Do you think this is worth fixing in a follow-on patch? It seems the API requires two separate queries that mostly do the same thing.

We probably want the existing semantics of pointsToConstantMemory in most places. In particular, consider a variable array index into a global constant: pointsToConstantMemory holds, but the pointer is not necessarily dereferenceable.

I think we should just add the call to isDereferenceableAndAlignedPointer. (We could come up with a combined API, but I'm not sure it's worth the extra maintenance work...)
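
The distinction drawn here can be illustrated with a toy model. It is only a sketch: `basedOnConstantTable` and `isDereferenceableIndex` are hypothetical stand-ins for the two separate queries (`pointsToConstantMemory` and `isDereferenceableAndAlignedPointer`), and a fixed-size table stands in for the global constant.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t TableSize = 4;
static const int Table[TableSize] = {10, 20, 30, 40};

// Toy analogue of the constant-memory query: an address formed from Table
// is based on a constant object regardless of the index value.
bool basedOnConstantTable(std::size_t /*Index*/) { return true; }

// Toy analogue of the dereferenceability query: only in-bounds indices
// are safe to load from.
bool isDereferenceableIndex(std::size_t Index) { return Index < TableSize; }

// A load may be moved or rematerialized only when both facts hold.
bool canSpeculateLoad(std::size_t Index) {
  return basedOnConstantTable(Index) && isDereferenceableIndex(Index);
}

// Loads the element only when the combined check passes.
int loadIfSafe(std::size_t Index, int Fallback) {
  return canSpeculateLoad(Index) ? Table[Index] : Fallback;
}

// Small usage demonstration.
inline void demoSpeculation() {
  assert(canSpeculateLoad(3));  // Constant and in bounds.
  assert(!canSpeculateLoad(7)); // Constant base, but out of bounds.
}
```

An index of 7 still "points to constant memory" in the first sense, yet loading through it is not safe, which is why the two properties need separate checks.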

arsenm added a child revision: D130042: CodeGen: Set MODereferenceable from isDereferenceableAndAlignedPointer. Jul 18 2022, 2:07 PM

8d0383eb694e13a999c9c95adc4b56771429e551

Revision Contents

Path

Size

llvm/

include/

llvm/

CodeGen/

GlobalISel/

1 line

7 lines

14 lines

2 lines

6 lines

13 lines

lib/

CodeGen/

CalcSpillWeights.cpp

2 lines

EarlyIfConversion.cpp

2 lines

GlobalISel/

4 lines

77 lines

12 lines

9 lines

23 lines

MLRegallocEvictAdvisor.cpp

8 lines

2 lines

10 lines

17 lines

6 lines

1 line

2 lines

5 lines

4 lines

5 lines

RegisterCoalescer.cpp

2 lines

ScheduleDAGInstrs.cpp

8 lines

SelectionDAG/

SelectionDAG.cpp

22 lines

SelectionDAGBuilder.cpp

31 lines

SplitKit.h

7 lines

SplitKit.cpp

14 lines

TargetInstrInfo.cpp

4 lines

Target/

AMDGPU/

2 lines

7 lines

3 lines

4 lines

ARM/

ARMBaseInstrInfo.h

3 lines

ARMBaseInstrInfo.cpp

4 lines

PowerPC/

PPCInstrInfo.h

3 lines

PPCInstrInfo.cpp

4 lines

WebAssembly/

WebAssemblyInstrInfo.h

3 lines

WebAssemblyInstrInfo.cpp

2 lines

WebAssemblyRegStackify.cpp

23 lines

X86/

X86InstrInfo.h

3 lines

X86InstrInfo.cpp

6 lines

test/

CodeGen/

AArch64/

GlobalISel/

gisel-commandline-option.ll

6 lines

arm64-memcpy-inline.ll

4 lines

AMDGPU/

GlobalISel/

function-returns.ll

32 lines

irtranslator-amdgpu_vs.ll

4 lines

irtranslator-call-non-fixed.ll

4 lines

irtranslator-call.ll

28 lines

irtranslator-invariant.ll

115 lines

amdgcn-load-offset-from-reg.ll

2 lines

llc-pipeline.ll

2 lines

splitkit-getsubrangeformask.ll

70 lines

twoaddr-constrain.ll

4 lines

vgpr-liverange-ir.ll

6 lines

X86/

unfoldMemoryOperand.mir

4 lines

Diff 440198

llvm/include/llvm/CodeGen/GlobalISel/IRTranslator.h

[...]	private:
/// Current target configuration. Controls how the pass handles errors.		/// Current target configuration. Controls how the pass handles errors.
const TargetPassConfig *TPC;		const TargetPassConfig *TPC;

CodeGenOpt::Level OptLevel;		CodeGenOpt::Level OptLevel;

/// Current optimization remark emitter. Used to report failures.		/// Current optimization remark emitter. Used to report failures.
std::unique_ptr<OptimizationRemarkEmitter> ORE;		std::unique_ptr<OptimizationRemarkEmitter> ORE;

		AAResults *AA;
FunctionLoweringInfo FuncInfo;		FunctionLoweringInfo FuncInfo;

// True when either the Target Machine specifies no optimizations or the		// True when either the Target Machine specifies no optimizations or the
// function has the optnone attribute.		// function has the optnone attribute.
bool EnableOpts = false;		bool EnableOpts = false;

/// True when the block contains a tail call. This allows the IRTranslator to		/// True when the block contains a tail call. This allows the IRTranslator to
/// stop translating such blocks early.		/// stop translating such blocks early.

llvm/include/llvm/CodeGen/LiveIntervals.h

class raw_ostream;		class raw_ostream;
class TargetInstrInfo;		class TargetInstrInfo;
class VirtRegMap;		class VirtRegMap;

class LiveIntervals : public MachineFunctionPass {		class LiveIntervals : public MachineFunctionPass {
MachineFunction* MF;		MachineFunction* MF;
MachineRegisterInfo* MRI;		MachineRegisterInfo* MRI;
const TargetRegisterInfo* TRI;		const TargetRegisterInfo* TRI;
const TargetInstrInfo* TII;		const TargetInstrInfo *TII;
AAResults *AA;
SlotIndexes* Indexes;		SlotIndexes* Indexes;
MachineDominatorTree *DomTree = nullptr;		MachineDominatorTree *DomTree = nullptr;
LiveIntervalCalc *LICalc = nullptr;		LiveIntervalCalc *LICalc = nullptr;

/// Special pool allocator for VNInfo's (LiveInterval val#).		/// Special pool allocator for VNInfo's (LiveInterval val#).
VNInfo::Allocator VNInfoAllocator;		VNInfo::Allocator VNInfoAllocator;

/// Live interval pointers for all the virtual registers.		/// Live interval pointers for all the virtual registers.
[...]	LLVM_ATTRIBUTE_UNUSED void pruneValue(LiveInterval &, SlotIndex,
llvm_unreachable(		llvm_unreachable(
"Use pruneValue on the main LiveRange and on each subrange");		"Use pruneValue on the main LiveRange and on each subrange");
}		}

SlotIndexes *getSlotIndexes() const {		SlotIndexes *getSlotIndexes() const {
return Indexes;		return Indexes;
}		}

AAResults *getAliasAnalysis() const {
return AA;
}

/// Returns true if the specified machine instr has been removed or was		/// Returns true if the specified machine instr has been removed or was
/// never entered in the map.		/// never entered in the map.
bool isNotInMIMap(const MachineInstr &Instr) const {		bool isNotInMIMap(const MachineInstr &Instr) const {
return !Indexes->hasIndex(Instr);		return !Indexes->hasIndex(Instr);
}		}

/// Returns the base index of the given instruction.		/// Returns the base index of the given instruction.
SlotIndex getInstructionIndex(const MachineInstr &Instr) const {		SlotIndex getInstructionIndex(const MachineInstr &Instr) const {

llvm/include/llvm/CodeGen/LiveRangeEdit.h

#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"		#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/SlotIndexes.h"		#include "llvm/CodeGen/SlotIndexes.h"
#include "llvm/CodeGen/TargetSubtargetInfo.h"		#include "llvm/CodeGen/TargetSubtargetInfo.h"
#include <cassert>		#include <cassert>

namespace llvm {		namespace llvm {

class AAResults;
class LiveIntervals;		class LiveIntervals;
class MachineInstr;		class MachineInstr;
class MachineOperand;		class MachineOperand;
class TargetInstrInfo;		class TargetInstrInfo;
class TargetRegisterInfo;		class TargetRegisterInfo;
class VirtRegMap;		class VirtRegMap;
class VirtRegAuxInfo;		class VirtRegAuxInfo;

[...]	private:
/// tii.isTriviallyReMaterializable().		/// tii.isTriviallyReMaterializable().
SmallPtrSet<const VNInfo *, 4> Remattable;		SmallPtrSet<const VNInfo *, 4> Remattable;

/// Rematted - Values that were actually rematted, and so need to have their		/// Rematted - Values that were actually rematted, and so need to have their
/// live range trimmed or entirely removed.		/// live range trimmed or entirely removed.
SmallPtrSet<const VNInfo *, 4> Rematted;		SmallPtrSet<const VNInfo *, 4> Rematted;

/// scanRemattable - Identify the Parent values that may rematerialize.		/// scanRemattable - Identify the Parent values that may rematerialize.
void scanRemattable(AAResults *aa);		void scanRemattable();

/// foldAsLoad - If LI has a single use and a single def that can be folded as		/// foldAsLoad - If LI has a single use and a single def that can be folded as
/// a load, eliminate the register by folding the def into the use.		/// a load, eliminate the register by folding the def into the use.
bool foldAsLoad(LiveInterval *LI, SmallVectorImpl<MachineInstr *> &Dead);		bool foldAsLoad(LiveInterval *LI, SmallVectorImpl<MachineInstr *> &Dead);

using ToShrinkSet = SetVector<LiveInterval *, SmallVector<LiveInterval *, 8>,		using ToShrinkSet = SetVector<LiveInterval *, SmallVector<LiveInterval *, 8>,
SmallPtrSet<LiveInterval *, 8>>;		SmallPtrSet<LiveInterval *, 8>>;

/// Helper for eliminateDeadDefs.		/// Helper for eliminateDeadDefs.
void eliminateDeadDef(MachineInstr *MI, ToShrinkSet &ToShrink,		void eliminateDeadDef(MachineInstr *MI, ToShrinkSet &ToShrink);
AAResults *AA);

/// MachineRegisterInfo callback to notify when new virtual		/// MachineRegisterInfo callback to notify when new virtual
/// registers are created.		/// registers are created.
void MRI_NoteNewVirtualRegister(Register VReg) override;		void MRI_NoteNewVirtualRegister(Register VReg) override;

/// Check if MachineOperand \p MO is a last use/kill either in the		/// Check if MachineOperand \p MO is a last use/kill either in the
/// main live range of \p LI or in one of the matching subregister ranges.		/// main live range of \p LI or in one of the matching subregister ranges.
bool useIsKill(const LiveInterval &LI, const MachineOperand &MO) const;		bool useIsKill(const LiveInterval &LI, const MachineOperand &MO) const;
[...]	LiveInterval &createEmptyInterval() {
return createEmptyIntervalFrom(getReg(), true);		return createEmptyIntervalFrom(getReg(), true);
}		}

Register create() { return createFrom(getReg()); }		Register create() { return createFrom(getReg()); }

/// anyRematerializable - Return true if any parent values may be		/// anyRematerializable - Return true if any parent values may be
/// rematerializable.		/// rematerializable.
/// This function must be called before any rematerialization is attempted.		/// This function must be called before any rematerialization is attempted.
bool anyRematerializable(AAResults *);		bool anyRematerializable();

/// checkRematerializable - Manually add VNI to the list of rematerializable		/// checkRematerializable - Manually add VNI to the list of rematerializable
/// values if DefMI may be rematerializable.		/// values if DefMI may be rematerializable.
bool checkRematerializable(VNInfo *VNI, const MachineInstr *DefMI,		bool checkRematerializable(VNInfo *VNI, const MachineInstr *DefMI);
AAResults *);

/// Remat - Information needed to rematerialize at a specific location.		/// Remat - Information needed to rematerialize at a specific location.
struct Remat {		struct Remat {
const VNInfo *const ParentVNI; // parent_'s value at the remat location.		const VNInfo *const ParentVNI; // parent_'s value at the remat location.
MachineInstr *OrigMI = nullptr; // Instruction defining OrigVNI. It contains		MachineInstr *OrigMI = nullptr; // Instruction defining OrigVNI. It contains
// the real expr for remat.		// the real expr for remat.

explicit Remat(const VNInfo *ParentVNI) : ParentVNI(ParentVNI) {}		explicit Remat(const VNInfo *ParentVNI) : ParentVNI(ParentVNI) {}
[...]	public:

/// eliminateDeadDefs - Try to delete machine instructions that are now dead		/// eliminateDeadDefs - Try to delete machine instructions that are now dead
/// (allDefsAreDead returns true). This may cause live intervals to be trimmed		/// (allDefsAreDead returns true). This may cause live intervals to be trimmed
/// and further dead efs to be eliminated.		/// and further dead efs to be eliminated.
/// RegsBeingSpilled lists registers currently being spilled by the register		/// RegsBeingSpilled lists registers currently being spilled by the register
/// allocator. These registers should not be split into new intervals		/// allocator. These registers should not be split into new intervals
/// as currently those new intervals are not guaranteed to spill.		/// as currently those new intervals are not guaranteed to spill.
void eliminateDeadDefs(SmallVectorImpl<MachineInstr *> &Dead,		void eliminateDeadDefs(SmallVectorImpl<MachineInstr *> &Dead,
ArrayRef<Register> RegsBeingSpilled = None,		ArrayRef<Register> RegsBeingSpilled = None);
AAResults *AA = nullptr);

/// calculateRegClassAndHint - Recompute register class and hint for each new		/// calculateRegClassAndHint - Recompute register class and hint for each new
/// register.		/// register.
void calculateRegClassAndHint(MachineFunction &, VirtRegAuxInfo &);		void calculateRegClassAndHint(MachineFunction &, VirtRegAuxInfo &);
};		};

} // end namespace llvm		} // end namespace llvm

#endif // LLVM_CODEGEN_LIVERANGEEDIT_H		#endif // LLVM_CODEGEN_LIVERANGEEDIT_H

llvm/include/llvm/CodeGen/MachineInstr.h


	/// Return true if this load instruction never traps and points to a memory			/// Return true if this load instruction never traps and points to a memory
	/// location whose value doesn't change during the execution of this function.			/// location whose value doesn't change during the execution of this function.
	///			///
	/// Examples include loading a value from the constant pool or from the			/// Examples include loading a value from the constant pool or from the
	/// argument area of a function (if it does not change). If the instruction			/// argument area of a function (if it does not change). If the instruction
	/// does multiple loads, this returns true only if all of the loads are			/// does multiple loads, this returns true only if all of the loads are
	/// dereferenceable and invariant.			/// dereferenceable and invariant.
	bool isDereferenceableInvariantLoad(AAResults *AA) const;			bool isDereferenceableInvariantLoad() const;

	/// If the specified instruction is a PHI that always merges together the			/// If the specified instruction is a PHI that always merges together the
	/// same virtual register, return the register, otherwise return 0.			/// same virtual register, return the register, otherwise return 0.
	unsigned isConstantValuePHI() const;			unsigned isConstantValuePHI() const;

	/// Return true if this instruction has side effects that are not modeled			/// Return true if this instruction has side effects that are not modeled
	/// by mayLoad / mayStore, etc.			/// by mayLoad / mayStore, etc.
	/// For all instructions, the property is encoded in MCInstrDesc::Flags			/// For all instructions, the property is encoded in MCInstrDesc::Flags

llvm/include/llvm/CodeGen/SelectionDAG.h

[...]	#endif
/// stack arguments from being clobbered.		/// stack arguments from being clobbered.
SDValue getStackArgumentTokenFactor(SDValue Chain);		SDValue getStackArgumentTokenFactor(SDValue Chain);

SDValue getMemcpy(SDValue Chain, const SDLoc &dl, SDValue Dst, SDValue Src,		SDValue getMemcpy(SDValue Chain, const SDLoc &dl, SDValue Dst, SDValue Src,
SDValue Size, Align Alignment, bool isVol,		SDValue Size, Align Alignment, bool isVol,
bool AlwaysInline, bool isTailCall,		bool AlwaysInline, bool isTailCall,
MachinePointerInfo DstPtrInfo,		MachinePointerInfo DstPtrInfo,
MachinePointerInfo SrcPtrInfo,		MachinePointerInfo SrcPtrInfo,
const AAMDNodes &AAInfo = AAMDNodes());		const AAMDNodes &AAInfo = AAMDNodes(),
		AAResults *AA = nullptr);

SDValue getMemmove(SDValue Chain, const SDLoc &dl, SDValue Dst, SDValue Src,		SDValue getMemmove(SDValue Chain, const SDLoc &dl, SDValue Dst, SDValue Src,
SDValue Size, Align Alignment, bool isVol, bool isTailCall,		SDValue Size, Align Alignment, bool isVol, bool isTailCall,
MachinePointerInfo DstPtrInfo,		MachinePointerInfo DstPtrInfo,
MachinePointerInfo SrcPtrInfo,		MachinePointerInfo SrcPtrInfo,
const AAMDNodes &AAInfo = AAMDNodes());		const AAMDNodes &AAInfo = AAMDNodes(),
		AAResults *AA = nullptr);

SDValue getMemset(SDValue Chain, const SDLoc &dl, SDValue Dst, SDValue Src,		SDValue getMemset(SDValue Chain, const SDLoc &dl, SDValue Dst, SDValue Src,
SDValue Size, Align Alignment, bool isVol,		SDValue Size, Align Alignment, bool isVol,
bool AlwaysInline, bool isTailCall,		bool AlwaysInline, bool isTailCall,
MachinePointerInfo DstPtrInfo,		MachinePointerInfo DstPtrInfo,
const AAMDNodes &AAInfo = AAMDNodes());		const AAMDNodes &AAInfo = AAMDNodes());

SDValue getAtomicMemcpy(SDValue Chain, const SDLoc &dl, SDValue Dst,		SDValue getAtomicMemcpy(SDValue Chain, const SDLoc &dl, SDValue Dst,

llvm/include/llvm/CodeGen/TargetInstrInfo.h

[...]	const TargetRegisterClass *getRegClass(const MCInstrDesc &MCID, unsigned OpNum,
const TargetRegisterInfo *TRI,		const TargetRegisterInfo *TRI,
const MachineFunction &MF) const;		const MachineFunction &MF) const;

/// Return true if the instruction is trivially rematerializable, meaning it		/// Return true if the instruction is trivially rematerializable, meaning it
/// has no side effects and requires no operands that aren't always available.		/// has no side effects and requires no operands that aren't always available.
/// This means the only allowed uses are constants and unallocatable physical		/// This means the only allowed uses are constants and unallocatable physical
/// registers so that the instructions result is independent of the place		/// registers so that the instructions result is independent of the place
/// in the function.		/// in the function.
bool isTriviallyReMaterializable(const MachineInstr &MI,		bool isTriviallyReMaterializable(const MachineInstr &MI) const {
AAResults *AA = nullptr) const {
return MI.getOpcode() == TargetOpcode::IMPLICIT_DEF ||		return MI.getOpcode() == TargetOpcode::IMPLICIT_DEF ||
(MI.getDesc().isRematerializable() &&		(MI.getDesc().isRematerializable() &&
(isReallyTriviallyReMaterializable(MI, AA) ||		(isReallyTriviallyReMaterializable(MI) ||
isReallyTriviallyReMaterializableGeneric(MI, AA)));		isReallyTriviallyReMaterializableGeneric(MI)));
}		}

/// Given \p MO is a PhysReg use return if it can be ignored for the purpose		/// Given \p MO is a PhysReg use return if it can be ignored for the purpose
/// of instruction rematerialization or sinking.		/// of instruction rematerialization or sinking.
virtual bool isIgnorableUse(const MachineOperand &MO) const {		virtual bool isIgnorableUse(const MachineOperand &MO) const {
return false;		return false;
}		}

protected:		protected:
/// For instructions with opcodes for which the M_REMATERIALIZABLE flag is		/// For instructions with opcodes for which the M_REMATERIALIZABLE flag is
/// set, this hook lets the target specify whether the instruction is actually		/// set, this hook lets the target specify whether the instruction is actually
/// trivially rematerializable, taking into consideration its operands. This		/// trivially rematerializable, taking into consideration its operands. This
/// predicate must return false if the instruction has any side effects other		/// predicate must return false if the instruction has any side effects other
/// than producing a value, or if it requres any address registers that are		/// than producing a value, or if it requres any address registers that are
/// not always available.		/// not always available.
/// Requirements must be check as stated in isTriviallyReMaterializable() .		/// Requirements must be check as stated in isTriviallyReMaterializable() .
virtual bool isReallyTriviallyReMaterializable(const MachineInstr &MI,		virtual bool isReallyTriviallyReMaterializable(const MachineInstr &MI) const {
AAResults *AA) const {
return false;		return false;
}		}

/// This method commutes the operands of the given machine instruction MI.		/// This method commutes the operands of the given machine instruction MI.
/// The operands to be commuted are specified by their indices OpIdx1 and		/// The operands to be commuted are specified by their indices OpIdx1 and
/// OpIdx2.		/// OpIdx2.
///		///
/// If a target has any instructions that are commutable but require		/// If a target has any instructions that are commutable but require
[...]	static bool fixCommutedOpIndices(unsigned &ResultIdx1, unsigned &ResultIdx2,
unsigned CommutableOpIdx1,		unsigned CommutableOpIdx1,
unsigned CommutableOpIdx2);		unsigned CommutableOpIdx2);

private:		private:
/// For instructions with opcodes for which the M_REMATERIALIZABLE flag is		/// For instructions with opcodes for which the M_REMATERIALIZABLE flag is
/// set and the target hook isReallyTriviallyReMaterializable returns false,		/// set and the target hook isReallyTriviallyReMaterializable returns false,
/// this function does target-independent tests to determine if the		/// this function does target-independent tests to determine if the
/// instruction is really trivially rematerializable.		/// instruction is really trivially rematerializable.
bool isReallyTriviallyReMaterializableGeneric(const MachineInstr &MI,		bool isReallyTriviallyReMaterializableGeneric(const MachineInstr &MI) const;
AAResults *AA) const;

public:		public:
/// These methods return the opcode of the frame setup/destroy instructions		/// These methods return the opcode of the frame setup/destroy instructions
/// if they exist (-1 otherwise). Some targets use pseudo instructions in		/// if they exist (-1 otherwise). Some targets use pseudo instructions in
/// order to abstract away the difference between operating with a frame		/// order to abstract away the difference between operating with a frame
/// pointer and operating without, through the use of these two instructions.		/// pointer and operating without, through the use of these two instructions.
///		///
unsigned getCallFrameSetupOpcode() const { return CallFrameSetupOpcode; }		unsigned getCallFrameSetupOpcode() const { return CallFrameSetupOpcode; }

llvm/lib/CodeGen/CalcSpillWeights.cpp

[...]	while (MI->isFullCopy()) {
VNI = SrcQ.valueIn();		VNI = SrcQ.valueIn();
assert(VNI && "Copy from non-existing value");		assert(VNI && "Copy from non-existing value");
if (VNI->isPHIDef())		if (VNI->isPHIDef())
return false;		return false;
MI = LIS.getInstructionFromIndex(VNI->def);		MI = LIS.getInstructionFromIndex(VNI->def);
assert(MI && "Dead valno in interval");		assert(MI && "Dead valno in interval");
}		}

if (!TII.isTriviallyReMaterializable(*MI, LIS.getAliasAnalysis()))		if (!TII.isTriviallyReMaterializable(*MI))
return false;		return false;
}		}
return true;		return true;
}		}

bool VirtRegAuxInfo::isLiveAtStatepointVarArg(LiveInterval &LI) {		bool VirtRegAuxInfo::isLiveAtStatepointVarArg(LiveInterval &LI) {
return any_of(VRM.getRegInfo().reg_operands(LI.reg()),		return any_of(VRM.getRegInfo().reg_operands(LI.reg()),
[](MachineOperand &MO) {		[](MachineOperand &MO) {

llvm/lib/CodeGen/EarlyIfConversion.cpp

[...]	if (!TDef || !FDef)
return false;		return false;

// If there are side-effects, all bets are off.		// If there are side-effects, all bets are off.
if (TDef->hasUnmodeledSideEffects())		if (TDef->hasUnmodeledSideEffects())
return false;		return false;

// If the instruction could modify memory, or there may be some intervening		// If the instruction could modify memory, or there may be some intervening
// store between the two, we can't consider them to be equal.		// store between the two, we can't consider them to be equal.
if (TDef->mayLoadOrStore() && !TDef->isDereferenceableInvariantLoad(nullptr))		if (TDef->mayLoadOrStore() && !TDef->isDereferenceableInvariantLoad())
return false;		return false;

// We also can't guarantee that they are the same if, for example, the		// We also can't guarantee that they are the same if, for example, the
// instructions are both a copy from a physical reg, because some other		// instructions are both a copy from a physical reg, because some other
// instruction may have modified the value in that reg between the two		// instruction may have modified the value in that reg between the two
// defining insts.		// defining insts.
if (any_of(TDef->uses(), [](const MachineOperand &MO) {		if (any_of(TDef->uses(), [](const MachineOperand &MO) {
return MO.isReg() && MO.getReg().isPhysical();		return MO.isReg() && MO.getReg().isPhysical();

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp

[...]	bool CombinerHelper::matchEqualDefs(const MachineOperand &MOP1,
// %x2 = G_LOAD %addr (load N from @somewhere)		// %x2 = G_LOAD %addr (load N from @somewhere)
// ...		// ...
// %or = G_OR %x1, %x2		// %or = G_OR %x1, %x2
//		//
// It's possible that @foo will modify whatever lives at the address we're		// It's possible that @foo will modify whatever lives at the address we're
// loading from. To be safe, let's just assume that all loads and stores		// loading from. To be safe, let's just assume that all loads and stores
// are different (unless we have something which is guaranteed to not		// are different (unless we have something which is guaranteed to not
// change.)		// change.)
if (I1->mayLoadOrStore() && !I1->isDereferenceableInvariantLoad(nullptr))		if (I1->mayLoadOrStore() && !I1->isDereferenceableInvariantLoad())
return false;		return false;

// If both instructions are loads or stores, they are equal only if both		// If both instructions are loads or stores, they are equal only if both
// are dereferenceable invariant loads with the same number of bits.		// are dereferenceable invariant loads with the same number of bits.
if (I1->mayLoadOrStore() && I2->mayLoadOrStore()) {		if (I1->mayLoadOrStore() && I2->mayLoadOrStore()) {
GLoadStore *LS1 = dyn_cast<GLoadStore>(I1);		GLoadStore *LS1 = dyn_cast<GLoadStore>(I1);
GLoadStore *LS2 = dyn_cast<GLoadStore>(I2);		GLoadStore *LS2 = dyn_cast<GLoadStore>(I2);
if (!LS1 || !LS2)		if (!LS1 || !LS2)
return false;		return false;

if (!I2->isDereferenceableInvariantLoad(nullptr) ||		if (!I2->isDereferenceableInvariantLoad() ||
(LS1->getMemSizeInBits() != LS2->getMemSizeInBits()))		(LS1->getMemSizeInBits() != LS2->getMemSizeInBits()))
return false;		return false;
}		}

// Check for physical registers on the instructions first to avoid cases		// Check for physical registers on the instructions first to avoid cases
// like this:		// like this:
//		//
// %a = COPY $physreg		// %a = COPY $physreg

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp

//===----------------------------------------------------------------------===//
#include "llvm/CodeGen/GlobalISel/IRTranslator.h"
#include "llvm/ADT/PostOrderIterator.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/ScopeExit.h"
#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/BranchProbabilityInfo.h"
#include "llvm/Analysis/OptimizationRemarkEmitter.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/CodeGen/Analysis.h"
#include "llvm/CodeGen/GlobalISel/CSEInfo.h"
#include "llvm/CodeGen/GlobalISel/CSEMIRBuilder.h"
#include "llvm/CodeGen/GlobalISel/CallLowering.h"
#include "llvm/CodeGen/GlobalISel/GISelChangeObserver.h"
[...]
} // namespace
#endif // ifndef NDEBUG

void IRTranslator::getAnalysisUsage(AnalysisUsage &AU) const {

AU.addRequired<StackProtector>();

AU.addRequired<TargetPassConfig>();

AU.addRequired<GISelCSEAnalysisWrapperPass>();

if (OptLevel != CodeGenOpt::None)

if (OptLevel != CodeGenOpt::None) {

AU.addRequired<BranchProbabilityInfoWrapperPass>();

AU.addRequired<AAResultsWrapperPass>();

}

AU.addRequired<TargetLibraryInfoWrapperPass>();

AU.addPreserved<TargetLibraryInfoWrapperPass>();

getSelectionDAGFallbackAnalysisUsage(AU);

MachineFunctionPass::getAnalysisUsage(AU);

}

IRTranslator::ValueToVRegInfo::VRegListT &

IRTranslator::allocateVRegs(const Value &Val) {

▲ Show 20 Lines • Show All 1,091 Lines • ▼ Show 20 Lines

if (auto Arg = dyn_cast<Argument>(V))

return Arg->hasSwiftErrorAttr();

if (auto AI = dyn_cast<AllocaInst>(V))

return AI->isSwiftError();

return false;

}

bool IRTranslator::translateLoad(const User &U, MachineIRBuilder &MIRBuilder) {

const LoadInst &LI = cast<LoadInst>(U);

if (DL->getTypeStoreSize(LI.getType()) == 0)

unsigned StoreSize = DL->getTypeStoreSize(LI.getType());

if (StoreSize == 0)

return true;

ArrayRef<Register> Regs = getOrCreateVRegs(LI);

ArrayRef<uint64_t> Offsets = *VMap.getOffsets(LI);

AAMDNodes AAInfo = LI.getAAMetadata();

Type *OffsetIRTy = DL->getIntPtrType(LI.getPointerOperandType());

const Value *Ptr = LI.getPointerOperand();

Type *OffsetIRTy = DL->getIntPtrType(Ptr->getType());

LLT OffsetTy = getLLTForType(*OffsetIRTy, *DL);

if (CLI->supportSwiftError() && isSwiftError(LI.getPointerOperand())) {

if (CLI->supportSwiftError() && isSwiftError(Ptr)) {

assert(Regs.size() == 1 && "swifterror should be single pointer");

LI.getPointerOperand());

SwiftError.getOrCreateVRegUseAt(&LI, &MIRBuilder.getMBB(), Ptr);

MIRBuilder.buildCopy(Regs[0], VReg);

return true;

}

auto &TLI = *MF->getSubtarget().getTargetLowering();

MachineMemOperand::Flags Flags = TLI.getLoadMemOperandFlags(LI, *DL);

if (AA && !(Flags & MachineMemOperand::MOInvariant)) {

if (AA->pointsToConstantMemory(

MemoryLocation(Ptr, LocationSize::precise(StoreSize), AAInfo))) {

Flags |= MachineMemOperand::MOInvariant;

// FIXME: pointsToConstantMemory probably does not imply dereferenceable,

// but the previous usage implied it did. Probably should check

// isDereferenceableAndAlignedPointer.

Flags |= MachineMemOperand::MODereferenceable;

}

const MDNode *Ranges =

Regs.size() == 1 ? LI.getMetadata(LLVMContext::MD_range) : nullptr;

for (unsigned i = 0; i < Regs.size(); ++i) {

MIRBuilder.materializePtrAdd(Addr, Base, OffsetTy, Offsets[i] / 8);

MachinePointerInfo Ptr(LI.getPointerOperand(), Offsets[i] / 8);

Align BaseAlign = getMemOpAlign(LI);

auto MMO = MF->getMachineMemOperand(

Ptr, Flags, MRI->getType(Regs[i]),

commonAlignment(BaseAlign, Offsets[i] / 8), LI.getAAMetadata(), Ranges,

commonAlignment(BaseAlign, Offsets[i] / 8), AAInfo, Ranges,

LI.getSyncScopeID(), LI.getOrdering());

MIRBuilder.buildLoad(Regs[i], Addr, *MMO);

}

return true;

}
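The flag computation in translateLoad above can be sketched in isolation. This is only a rough model, not the LLVM code: `MemFlags`, `FakeAA`, and `loadFlags` are hypothetical stand-ins invented for illustration, and the real query is `AA->pointsToConstantMemory(MemoryLocation(...))`. It shows the two properties the hunk relies on: no AA (for example at -O0) leaves the flags untouched, and a load already marked invariant skips the query.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for MachineMemOperand flags; the values are
// illustrative and do not match LLVM's real encoding.
enum MemFlags : std::uint8_t {
  MONone = 0,
  MOLoad = 1,
  MOInvariant = 2,
  MODereferenceable = 4
};

// Hypothetical stand-in for alias analysis. It answers one canned question
// and counts how often it was asked.
struct FakeAA {
  bool Constant;
  mutable int Queries = 0;
  bool pointsToConstantMemory() const {
    ++Queries;
    return Constant;
  }
};

// Mirrors the shape of the hunk: query AA only when it is available and the
// target has not already marked the load invariant.
std::uint8_t loadFlags(std::uint8_t Flags, const FakeAA *AA) {
  if (AA && !(Flags & MOInvariant)) {
    if (AA->pointsToConstantMemory()) {
      Flags |= MOInvariant;
      // Same assumption as the FIXME in the patch: constant memory is treated
      // as dereferenceable, which may not hold in general.
      Flags |= MODereferenceable;
    }
  }
  return Flags;
}
```

In this model a null `AA` returns the incoming flags unchanged, which matches the patch only requesting AAResultsWrapperPass when optimizations are enabled.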

bool IRTranslator::translateStore(const User &U, MachineIRBuilder &MIRBuilder) {

▲ Show 20 Lines • Show All 240 Lines • ▼ Show 20 Lines

bool IRTranslator::translateGetElementPtr(const User &U,

MIRBuilder.buildCopy(getOrCreateVReg(U), BaseReg);

return true;

}

bool IRTranslator::translateMemFunc(const CallInst &CI,

MachineIRBuilder &MIRBuilder,

unsigned Opcode) {

const Value *SrcPtr = CI.getArgOperand(1);

// If the source is undef, then just emit a nop.

if (isa<UndefValue>(CI.getArgOperand(1)))

if (isa<UndefValue>(SrcPtr))

return true;

SmallVector<Register, 3> SrcRegs;

unsigned MinPtrSize = UINT_MAX;

for (auto AI = CI.arg_begin(), AE = CI.arg_end(); std::next(AI) != AE; ++AI) {

LLT SrcTy = MRI->getType(SrcReg);

Show All 13 Lines

bool IRTranslator::translateMemFunc(const CallInst &CI,

for (Register SrcReg : SrcRegs)

ICall.addUse(SrcReg);

Align DstAlign;

Align SrcAlign;

unsigned IsVol =

cast<ConstantInt>(CI.getArgOperand(CI.arg_size() - 1))->getZExtValue();

ConstantInt *CopySize = nullptr;

if (auto *MCI = dyn_cast<MemCpyInst>(&CI)) {

DstAlign = MCI->getDestAlign().valueOrOne();

SrcAlign = MCI->getSourceAlign().valueOrOne();

CopySize = dyn_cast<ConstantInt>(MCI->getArgOperand(2));

} else if (auto *MCI = dyn_cast<MemCpyInlineInst>(&CI)) {

DstAlign = MCI->getDestAlign().valueOrOne();

SrcAlign = MCI->getSourceAlign().valueOrOne();

CopySize = dyn_cast<ConstantInt>(MCI->getArgOperand(2));

} else if (auto *MMI = dyn_cast<MemMoveInst>(&CI)) {

DstAlign = MMI->getDestAlign().valueOrOne();

SrcAlign = MMI->getSourceAlign().valueOrOne();

CopySize = dyn_cast<ConstantInt>(MMI->getArgOperand(2));

} else {

auto *MSI = cast<MemSetInst>(&CI);

DstAlign = MSI->getDestAlign().valueOrOne();

}

MatzeB (unsubmitted inline comment, not done):

cast<ConstantInt>(CI.getArgOperand(CI.arg_size() - 1))->getZExtValue();

- ConstantInt *CopySize = nullptr;

+ ConstantInt *CopySize;

if (auto *MCI = dyn_cast<MemCpyInst>(&CI)) {

DstAlign = MCI->getDestAlign().valueOrOne();

SrcAlign = MCI->getSourceAlign().valueOrOne();

CopySize = dyn_cast<ConstantInt>(MCI->getArgOperand(2));

} else if (auto *MCI = dyn_cast<MemCpyInlineInst>(&CI)) {

DstAlign = MCI->getDestAlign().valueOrOne();

SrcAlign = MCI->getSourceAlign().valueOrOne();

CopySize = dyn_cast<ConstantInt>(MCI->getArgOperand(2));

} else if (auto *MMI = dyn_cast<MemMoveInst>(&CI)) {

DstAlign = MMI->getDestAlign().valueOrOne();

SrcAlign = MMI->getSourceAlign().valueOrOne();

CopySize = dyn_cast<ConstantInt>(MMI->getArgOperand(2));

} else {

auto *MSI = cast<MemSetInst>(&CI);

DstAlign = MSI->getDestAlign().valueOrOne();

+ CopySize = nullptr;

}

if (Opcode != TargetOpcode::G_MEMCPY_INLINE) {

Nit: I like initializing things in every branch without a default, so you get warnings if you forget to set the value in one of the branches...
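A minimal, self-contained sketch of the pattern suggested in this comment, assuming a compiler with uninitialized-variable diagnostics enabled (for example Clang's -Wsometimes-uninitialized or GCC's -Wmaybe-uninitialized). `MemOpKind` and `copySizeFor` are hypothetical names made up for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the intrinsic kinds handled in translateMemFunc.
enum class MemOpKind { MemCpy, MemCpyInline, MemMove, MemSet };

// Returns a pointer to the length for copy-like operations, or nullptr for
// memset, which has no source operand to describe.
const std::uint64_t *copySizeFor(MemOpKind Kind, const std::uint64_t &Len) {
  // Deliberately no initializer: if a branch below forgets to assign, the
  // compiler can report a possibly uninitialized use at the return.
  const std::uint64_t *CopySize;
  if (Kind == MemOpKind::MemCpy) {
    CopySize = &Len;
  } else if (Kind == MemOpKind::MemCpyInline) {
    CopySize = &Len;
  } else if (Kind == MemOpKind::MemMove) {
    CopySize = &Len;
  } else {
    assert(Kind == MemOpKind::MemSet && "unexpected kind");
    CopySize = nullptr; // explicit, rather than relying on a default
  }
  return CopySize;
}
```

The trade-off is that the safety net depends on warnings being enabled, whereas a default of nullptr is always well defined but would hide a forgotten branch.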

if (Opcode != TargetOpcode::G_MEMCPY_INLINE) {

// We need to propagate the tail call flag from the IR inst as an argument.

// Otherwise, we have to pessimize and assume later that we cannot tail call

// any memory intrinsics.

ICall.addImm(CI.isTailCall() ? 1 : 0);

}

// Create mem operands to store the alignment and volatile info.

auto VolFlag = IsVol ? MachineMemOperand::MOVolatile : MachineMemOperand::MONone;

MachineMemOperand::Flags LoadFlags = MachineMemOperand::MOLoad;

ICall.addMemOperand(MF->getMachineMemOperand(

MachineMemOperand::Flags StoreFlags = MachineMemOperand::MOStore;

MachinePointerInfo(CI.getArgOperand(0)),

if (IsVol) {

MachineMemOperand::MOStore | VolFlag, 1, DstAlign));

LoadFlags |= MachineMemOperand::MOVolatile;

StoreFlags |= MachineMemOperand::MOVolatile;

}

AAMDNodes AAInfo = CI.getAAMetadata();

if (AA && CopySize &&

AA->pointsToConstantMemory(MemoryLocation(

SrcPtr, LocationSize::precise(CopySize->getZExtValue()), AAInfo))) {

LoadFlags |= MachineMemOperand::MOInvariant;

// FIXME: pointsToConstantMemory probably does not imply dereferenceable,

// but the previous usage implied it did. Probably should check

// isDereferenceableAndAlignedPointer.

LoadFlags |= MachineMemOperand::MODereferenceable;

}

ICall.addMemOperand(

MF->getMachineMemOperand(MachinePointerInfo(CI.getArgOperand(0)),

StoreFlags, 1, DstAlign, AAInfo));

if (Opcode != TargetOpcode::G_MEMSET)

ICall.addMemOperand(MF->getMachineMemOperand(

MachinePointerInfo(CI.getArgOperand(1)),

MachinePointerInfo(SrcPtr), LoadFlags, 1, SrcAlign, AAInfo));

MachineMemOperand::MOLoad | VolFlag, 1, SrcAlign));

return true;

}

void IRTranslator::getStackGuard(Register DstReg,

MachineIRBuilder &MIRBuilder) {

const TargetRegisterInfo *TRI = MF->getSubtarget().getRegisterInfo();

MRI->setRegClass(DstReg, TRI->getPointerRegClass(*MF));

▲ Show 20 Lines • Show All 1,695 Lines • ▼ Show 20 Lines

bool IRTranslator::runOnMachineFunction(MachineFunction &CurMF) {

EntryBuilder->setMF(*MF);

MRI = &MF->getRegInfo();

DL = &F.getParent()->getDataLayout();

ORE = std::make_unique<OptimizationRemarkEmitter>(&F);

const TargetMachine &TM = MF->getTarget();

TM.resetTargetOptions(F);

EnableOpts = OptLevel != CodeGenOpt::None && !skipFunction(F);

FuncInfo.MF = MF;

if (EnableOpts)

if (EnableOpts) {

AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();

FuncInfo.BPI = &getAnalysis<BranchProbabilityInfoWrapperPass>().getBPI();

else

} else {

AA = nullptr;

FuncInfo.BPI = nullptr;

}

FuncInfo.CanLowerReturn = CLI->checkReturnTypeForCallConv(*MF);

const auto &TLI = *MF->getSubtarget().getTargetLowering();

SL = std::make_unique<GISelSwitchLowering>(this, FuncInfo);

SL->init(TLI, TM, *DL);

llvm/lib/CodeGen/InlineSpiller.cpp

Show First 20 Lines • Show All 80 Lines • ▼ Show 20 Lines	RestrictStatepointRemat("restrict-statepoint-remat",
cl::desc("Restrict remat for statepoint operands"));		cl::desc("Restrict remat for statepoint operands"));

namespace {		namespace {

class HoistSpillHelper : private LiveRangeEdit::Delegate {		class HoistSpillHelper : private LiveRangeEdit::Delegate {
MachineFunction &MF;		MachineFunction &MF;
LiveIntervals &LIS;		LiveIntervals &LIS;
LiveStacks &LSS;		LiveStacks &LSS;
AliasAnalysis *AA;
MachineDominatorTree &MDT;		MachineDominatorTree &MDT;
MachineLoopInfo &Loops;		MachineLoopInfo &Loops;
VirtRegMap &VRM;		VirtRegMap &VRM;
MachineRegisterInfo &MRI;		MachineRegisterInfo &MRI;
const TargetInstrInfo &TII;		const TargetInstrInfo &TII;
const TargetRegisterInfo &TRI;		const TargetRegisterInfo &TRI;
const MachineBlockFrequencyInfo &MBFI;		const MachineBlockFrequencyInfo &MBFI;

Show All 37 Lines	void runHoistSpills(LiveInterval &OrigLI, VNInfo &OrigVNI,
SmallVectorImpl<MachineInstr *> &SpillsToRm,		SmallVectorImpl<MachineInstr *> &SpillsToRm,
DenseMap<MachineBasicBlock *, unsigned> &SpillsToIns);		DenseMap<MachineBasicBlock *, unsigned> &SpillsToIns);

public:		public:
HoistSpillHelper(MachineFunctionPass &pass, MachineFunction &mf,		HoistSpillHelper(MachineFunctionPass &pass, MachineFunction &mf,
VirtRegMap &vrm)		VirtRegMap &vrm)
: MF(mf), LIS(pass.getAnalysis<LiveIntervals>()),		: MF(mf), LIS(pass.getAnalysis<LiveIntervals>()),
LSS(pass.getAnalysis<LiveStacks>()),		LSS(pass.getAnalysis<LiveStacks>()),
AA(&pass.getAnalysis<AAResultsWrapperPass>().getAAResults()),
MDT(pass.getAnalysis<MachineDominatorTree>()),		MDT(pass.getAnalysis<MachineDominatorTree>()),
Loops(pass.getAnalysis<MachineLoopInfo>()), VRM(vrm),		Loops(pass.getAnalysis<MachineLoopInfo>()), VRM(vrm),
MRI(mf.getRegInfo()), TII(*mf.getSubtarget().getInstrInfo()),		MRI(mf.getRegInfo()), TII(*mf.getSubtarget().getInstrInfo()),
TRI(*mf.getSubtarget().getRegisterInfo()),		TRI(*mf.getSubtarget().getRegisterInfo()),
MBFI(pass.getAnalysis<MachineBlockFrequencyInfo>()),		MBFI(pass.getAnalysis<MachineBlockFrequencyInfo>()),
IPA(LIS, mf.getNumBlockIDs()) {}		IPA(LIS, mf.getNumBlockIDs()) {}

void addToMergeableSpills(MachineInstr &Spill, int StackSlot,		void addToMergeableSpills(MachineInstr &Spill, int StackSlot,
unsigned Original);		unsigned Original);
bool rmFromMergeableSpills(MachineInstr &Spill, int StackSlot);		bool rmFromMergeableSpills(MachineInstr &Spill, int StackSlot);
void hoistAllSpills();		void hoistAllSpills();
void LRE_DidCloneVirtReg(Register, Register) override;		void LRE_DidCloneVirtReg(Register, Register) override;
};		};

class InlineSpiller : public Spiller {		class InlineSpiller : public Spiller {
MachineFunction &MF;		MachineFunction &MF;
LiveIntervals &LIS;		LiveIntervals &LIS;
LiveStacks &LSS;		LiveStacks &LSS;
AliasAnalysis *AA;
MachineDominatorTree &MDT;		MachineDominatorTree &MDT;
MachineLoopInfo &Loops;		MachineLoopInfo &Loops;
VirtRegMap &VRM;		VirtRegMap &VRM;
MachineRegisterInfo &MRI;		MachineRegisterInfo &MRI;
const TargetInstrInfo &TII;		const TargetInstrInfo &TII;
const TargetRegisterInfo &TRI;		const TargetRegisterInfo &TRI;
const MachineBlockFrequencyInfo &MBFI;		const MachineBlockFrequencyInfo &MBFI;

Show All 24 Lines	class InlineSpiller : public Spiller {

~InlineSpiller() override = default;		~InlineSpiller() override = default;

public:		public:
InlineSpiller(MachineFunctionPass &Pass, MachineFunction &MF, VirtRegMap &VRM,		InlineSpiller(MachineFunctionPass &Pass, MachineFunction &MF, VirtRegMap &VRM,
VirtRegAuxInfo &VRAI)		VirtRegAuxInfo &VRAI)
: MF(MF), LIS(Pass.getAnalysis<LiveIntervals>()),		: MF(MF), LIS(Pass.getAnalysis<LiveIntervals>()),
LSS(Pass.getAnalysis<LiveStacks>()),		LSS(Pass.getAnalysis<LiveStacks>()),
AA(&Pass.getAnalysis<AAResultsWrapperPass>().getAAResults()),
MDT(Pass.getAnalysis<MachineDominatorTree>()),		MDT(Pass.getAnalysis<MachineDominatorTree>()),
Loops(Pass.getAnalysis<MachineLoopInfo>()), VRM(VRM),		Loops(Pass.getAnalysis<MachineLoopInfo>()), VRM(VRM),
MRI(MF.getRegInfo()), TII(*MF.getSubtarget().getInstrInfo()),		MRI(MF.getRegInfo()), TII(*MF.getSubtarget().getInstrInfo()),
TRI(*MF.getSubtarget().getRegisterInfo()),		TRI(*MF.getSubtarget().getRegisterInfo()),
MBFI(Pass.getAnalysis<MachineBlockFrequencyInfo>()),		MBFI(Pass.getAnalysis<MachineBlockFrequencyInfo>()),
HSpiller(Pass, MF, VRM), VRAI(VRAI) {}		HSpiller(Pass, MF, VRM), VRAI(VRAI) {}

void spill(LiveRangeEdit &) override;		void spill(LiveRangeEdit &) override;
▲ Show 20 Lines • Show All 442 Lines • ▼ Show 20 Lines	bool InlineSpiller::reMaterializeFor(LiveInterval &VirtReg, MachineInstr &MI) {

++NumRemats;		++NumRemats;
return true;		return true;
}		}

/// reMaterializeAll - Try to rematerialize as many uses as possible,		/// reMaterializeAll - Try to rematerialize as many uses as possible,
/// and trim the live ranges after.		/// and trim the live ranges after.
void InlineSpiller::reMaterializeAll() {		void InlineSpiller::reMaterializeAll() {
if (!Edit->anyRematerializable(AA))		if (!Edit->anyRematerializable())
return;		return;

UsedValues.clear();		UsedValues.clear();

// Try to remat before all uses of snippets.		// Try to remat before all uses of snippets.
bool anyRemat = false;		bool anyRemat = false;
for (Register Reg : RegsToSpill) {		for (Register Reg : RegsToSpill) {
LiveInterval &LI = LIS.getInterval(Reg);		LiveInterval &LI = LIS.getInterval(Reg);
Show All 26 Lines	for (Register Reg : RegsToSpill) {
}		}
}		}

// Eliminate dead code after remat. Note that some snippet copies may be		// Eliminate dead code after remat. Note that some snippet copies may be
// deleted here.		// deleted here.
if (DeadDefs.empty())		if (DeadDefs.empty())
return;		return;
LLVM_DEBUG(dbgs() << "Remat created " << DeadDefs.size() << " dead defs.\n");		LLVM_DEBUG(dbgs() << "Remat created " << DeadDefs.size() << " dead defs.\n");
Edit->eliminateDeadDefs(DeadDefs, RegsToSpill, AA);		Edit->eliminateDeadDefs(DeadDefs, RegsToSpill);

// LiveRangeEdit::eliminateDeadDef is used to remove dead define instructions		// LiveRangeEdit::eliminateDeadDef is used to remove dead define instructions
// after rematerialization. To remove a VNI for a vreg from its LiveInterval,		// after rematerialization. To remove a VNI for a vreg from its LiveInterval,
// LiveIntervals::removeVRegDefAt is used. However, after non-PHI VNIs are all		// LiveIntervals::removeVRegDefAt is used. However, after non-PHI VNIs are all
// removed, PHI VNI are still left in the LiveInterval.		// removed, PHI VNI are still left in the LiveInterval.
// So to get rid of unused reg, we need to check whether it has non-dbg		// So to get rid of unused reg, we need to check whether it has non-dbg
// reference instead of whether it has non-empty interval.		// reference instead of whether it has non-empty interval.
unsigned ResultPos = 0;		unsigned ResultPos = 0;
▲ Show 20 Lines • Show All 461 Lines • ▼ Show 20 Lines	void InlineSpiller::spillAll() {

// Spill around uses of all RegsToSpill.		// Spill around uses of all RegsToSpill.
for (Register Reg : RegsToSpill)		for (Register Reg : RegsToSpill)
spillAroundUses(Reg);		spillAroundUses(Reg);

// Hoisted spills may cause dead code.		// Hoisted spills may cause dead code.
if (!DeadDefs.empty()) {		if (!DeadDefs.empty()) {
LLVM_DEBUG(dbgs() << "Eliminating " << DeadDefs.size() << " dead defs\n");		LLVM_DEBUG(dbgs() << "Eliminating " << DeadDefs.size() << " dead defs\n");
Edit->eliminateDeadDefs(DeadDefs, RegsToSpill, AA);		Edit->eliminateDeadDefs(DeadDefs, RegsToSpill);
}		}

// Finally delete the SnippetCopies.		// Finally delete the SnippetCopies.
for (Register Reg : RegsToSpill) {		for (Register Reg : RegsToSpill) {
for (MachineInstr &MI :		for (MachineInstr &MI :
llvm::make_early_inc_range(MRI.reg_instructions(Reg))) {		llvm::make_early_inc_range(MRI.reg_instructions(Reg))) {
assert(SnippetCopies.count(&MI) && "Remaining use wasn't a snippet copy");		assert(SnippetCopies.count(&MI) && "Remaining use wasn't a snippet copy");
// FIXME: Do this with a LiveRangeEdit callback.		// FIXME: Do this with a LiveRangeEdit callback.
▲ Show 20 Lines • Show All 420 Lines • ▼ Show 20 Lines	for (auto &Ent : MergeableSpills) {
for (auto const RMEnt : SpillsToRm) {		for (auto const RMEnt : SpillsToRm) {
RMEnt->setDesc(TII.get(TargetOpcode::KILL));		RMEnt->setDesc(TII.get(TargetOpcode::KILL));
for (unsigned i = RMEnt->getNumOperands(); i; --i) {		for (unsigned i = RMEnt->getNumOperands(); i; --i) {
MachineOperand &MO = RMEnt->getOperand(i - 1);		MachineOperand &MO = RMEnt->getOperand(i - 1);
if (MO.isReg() && MO.isImplicit() && MO.isDef() && !MO.isDead())		if (MO.isReg() && MO.isImplicit() && MO.isDef() && !MO.isDead())
RMEnt->removeOperand(i - 1);		RMEnt->removeOperand(i - 1);
}		}
}		}
Edit.eliminateDeadDefs(SpillsToRm, None, AA);		Edit.eliminateDeadDefs(SpillsToRm, None);
}		}
}		}

/// For VirtReg clone, the \p New register should have the same physreg or		/// For VirtReg clone, the \p New register should have the same physreg or
/// stackslot as the \p old register.		/// stackslot as the \p old register.
void HoistSpillHelper::LRE_DidCloneVirtReg(Register New, Register Old) {		void HoistSpillHelper::LRE_DidCloneVirtReg(Register New, Register Old) {
if (VRM.hasPhys(Old))		if (VRM.hasPhys(Old))
VRM.assignVirt2Phys(New, VRM.getPhys(Old));		VRM.assignVirt2Phys(New, VRM.getPhys(Old));
else if (VRM.getStackSlot(Old) != VirtRegMap::NO_STACK_SLOT)		else if (VRM.getStackSlot(Old) != VirtRegMap::NO_STACK_SLOT)
VRM.assignVirt2StackSlot(New, VRM.getStackSlot(Old));		VRM.assignVirt2StackSlot(New, VRM.getStackSlot(Old));
else		else
llvm_unreachable("VReg should be assigned either physreg or stackslot");		llvm_unreachable("VReg should be assigned either physreg or stackslot");
if (VRM.hasShape(Old))		if (VRM.hasShape(Old))
VRM.assignVirt2Shape(New, VRM.getShape(Old));		VRM.assignVirt2Shape(New, VRM.getShape(Old));
}		}

llvm/lib/CodeGen/LiveIntervals.cpp

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/CodeGen/LiveIntervals.h"		#include "llvm/CodeGen/LiveIntervals.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DepthFirstIterator.h"		#include "llvm/ADT/DepthFirstIterator.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/iterator_range.h"		#include "llvm/ADT/iterator_range.h"
#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/CodeGen/LiveInterval.h"		#include "llvm/CodeGen/LiveInterval.h"
#include "llvm/CodeGen/LiveIntervalCalc.h"		#include "llvm/CodeGen/LiveIntervalCalc.h"
#include "llvm/CodeGen/LiveVariables.h"		#include "llvm/CodeGen/LiveVariables.h"
#include "llvm/CodeGen/MachineBasicBlock.h"		#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineBlockFrequencyInfo.h"		#include "llvm/CodeGen/MachineBlockFrequencyInfo.h"
#include "llvm/CodeGen/MachineDominators.h"		#include "llvm/CodeGen/MachineDominators.h"
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineInstr.h"		#include "llvm/CodeGen/MachineInstr.h"
Show All 24 Lines
#include <utility>		#include <utility>

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "regalloc"		#define DEBUG_TYPE "regalloc"

char LiveIntervals::ID = 0;		char LiveIntervals::ID = 0;
char &llvm::LiveIntervalsID = LiveIntervals::ID;		char &llvm::LiveIntervalsID = LiveIntervals::ID;
INITIALIZE_PASS_BEGIN(LiveIntervals, "liveintervals",		INITIALIZE_PASS_BEGIN(LiveIntervals, "liveintervals", "Live Interval Analysis",
"Live Interval Analysis", false, false)		false, false)
INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)
INITIALIZE_PASS_DEPENDENCY(MachineDominatorTree)		INITIALIZE_PASS_DEPENDENCY(MachineDominatorTree)
INITIALIZE_PASS_DEPENDENCY(SlotIndexes)		INITIALIZE_PASS_DEPENDENCY(SlotIndexes)
INITIALIZE_PASS_END(LiveIntervals, "liveintervals",		INITIALIZE_PASS_END(LiveIntervals, "liveintervals",
"Live Interval Analysis", false, false)		"Live Interval Analysis", false, false)

#ifndef NDEBUG		#ifndef NDEBUG
static cl::opt<bool> EnablePrecomputePhysRegs(		static cl::opt<bool> EnablePrecomputePhysRegs(
"precompute-phys-liveness", cl::Hidden,		"precompute-phys-liveness", cl::Hidden,
cl::desc("Eagerly compute live intervals for all physreg units."));		cl::desc("Eagerly compute live intervals for all physreg units."));
#else		#else
static bool EnablePrecomputePhysRegs = false;		static bool EnablePrecomputePhysRegs = false;
#endif // NDEBUG		#endif // NDEBUG

namespace llvm {		namespace llvm {

cl::opt<bool> UseSegmentSetForPhysRegs(		cl::opt<bool> UseSegmentSetForPhysRegs(
"use-segment-set-for-physregs", cl::Hidden, cl::init(true),		"use-segment-set-for-physregs", cl::Hidden, cl::init(true),
cl::desc(		cl::desc(
"Use segment set for the computation of the live ranges of physregs."));		"Use segment set for the computation of the live ranges of physregs."));

} // end namespace llvm		} // end namespace llvm

void LiveIntervals::getAnalysisUsage(AnalysisUsage &AU) const {		void LiveIntervals::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesCFG();		AU.setPreservesCFG();
AU.addRequired<AAResultsWrapperPass>();
AU.addPreserved<AAResultsWrapperPass>();
AU.addPreserved<LiveVariables>();		AU.addPreserved<LiveVariables>();
AU.addPreservedID(MachineLoopInfoID);		AU.addPreservedID(MachineLoopInfoID);
AU.addRequiredTransitiveID(MachineDominatorsID);		AU.addRequiredTransitiveID(MachineDominatorsID);
AU.addPreservedID(MachineDominatorsID);		AU.addPreservedID(MachineDominatorsID);
AU.addPreserved<SlotIndexes>();		AU.addPreserved<SlotIndexes>();
AU.addRequiredTransitive<SlotIndexes>();		AU.addRequiredTransitive<SlotIndexes>();
MachineFunctionPass::getAnalysisUsage(AU);		MachineFunctionPass::getAnalysisUsage(AU);
}		}
Show All 21 Lines	void LiveIntervals::releaseMemory() {
VNInfoAllocator.Reset();		VNInfoAllocator.Reset();
}		}

bool LiveIntervals::runOnMachineFunction(MachineFunction &fn) {		bool LiveIntervals::runOnMachineFunction(MachineFunction &fn) {
MF = &fn;		MF = &fn;
MRI = &MF->getRegInfo();		MRI = &MF->getRegInfo();
TRI = MF->getSubtarget().getRegisterInfo();		TRI = MF->getSubtarget().getRegisterInfo();
TII = MF->getSubtarget().getInstrInfo();		TII = MF->getSubtarget().getInstrInfo();
AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();
Indexes = &getAnalysis<SlotIndexes>();		Indexes = &getAnalysis<SlotIndexes>();
DomTree = &getAnalysis<MachineDominatorTree>();		DomTree = &getAnalysis<MachineDominatorTree>();

if (!LICalc)		if (!LICalc)
LICalc = new LiveIntervalCalc();		LICalc = new LiveIntervalCalc();

// Allocate space for all virtual registers.		// Allocate space for all virtual registers.
VirtRegIntervals.resize(MRI->getNumVirtRegs());		VirtRegIntervals.resize(MRI->getNumVirtRegs());

llvm/lib/CodeGen/LiveRangeEdit.cpp

Show First 20 Lines • Show All 62 Lines • ▼ Show 20 Lines	Register LiveRangeEdit::createFrom(Register OldReg) {
// LiveInterval when it gets created but we cannot do that at		// LiveInterval when it gets created but we cannot do that at
// the moment.		// the moment.
if (Parent && !Parent->isSpillable())		if (Parent && !Parent->isSpillable())
LIS.getInterval(VReg).markNotSpillable();		LIS.getInterval(VReg).markNotSpillable();
return VReg;		return VReg;
}		}

bool LiveRangeEdit::checkRematerializable(VNInfo *VNI,		bool LiveRangeEdit::checkRematerializable(VNInfo *VNI,
const MachineInstr *DefMI,		const MachineInstr *DefMI) {
AAResults *aa) {
assert(DefMI && "Missing instruction");		assert(DefMI && "Missing instruction");
ScannedRemattable = true;		ScannedRemattable = true;
if (!TII.isTriviallyReMaterializable(*DefMI, aa))		if (!TII.isTriviallyReMaterializable(*DefMI))
return false;		return false;
Remattable.insert(VNI);		Remattable.insert(VNI);
return true;		return true;
}		}

void LiveRangeEdit::scanRemattable(AAResults *aa) {		void LiveRangeEdit::scanRemattable() {
for (VNInfo *VNI : getParent().valnos) {		for (VNInfo *VNI : getParent().valnos) {
if (VNI->isUnused())		if (VNI->isUnused())
continue;		continue;
unsigned Original = VRM->getOriginal(getReg());		unsigned Original = VRM->getOriginal(getReg());
LiveInterval &OrigLI = LIS.getInterval(Original);		LiveInterval &OrigLI = LIS.getInterval(Original);
VNInfo *OrigVNI = OrigLI.getVNInfoAt(VNI->def);		VNInfo *OrigVNI = OrigLI.getVNInfoAt(VNI->def);
if (!OrigVNI)		if (!OrigVNI)
continue;		continue;
MachineInstr *DefMI = LIS.getInstructionFromIndex(OrigVNI->def);		MachineInstr *DefMI = LIS.getInstructionFromIndex(OrigVNI->def);
if (!DefMI)		if (!DefMI)
continue;		continue;
checkRematerializable(OrigVNI, DefMI, aa);		checkRematerializable(OrigVNI, DefMI);
}		}
ScannedRemattable = true;		ScannedRemattable = true;
}		}

bool LiveRangeEdit::anyRematerializable(AAResults *aa) {		bool LiveRangeEdit::anyRematerializable() {
if (!ScannedRemattable)		if (!ScannedRemattable)
scanRemattable(aa);		scanRemattable();
return !Remattable.empty();		return !Remattable.empty();
}		}
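The scan-once behaviour of anyRematerializable above, now independent of any AA argument, can be modelled roughly as follows. This is a simplified sketch under stated assumptions: `RematCache`, its integer "values", and the even-number predicate are all hypothetical and stand in for the VNInfo list and `TII.isTriviallyReMaterializable(*DefMI)`.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical, simplified model of LiveRangeEdit's remat bookkeeping.
class RematCache {
  std::vector<int> Values;  // stand-in for the parent's value numbers
  std::set<int> Remattable; // values that passed the predicate
  bool ScannedRemattable = false;
  int Scans = 0;            // instrumentation for the example only

  // Illustrative predicate; the real check asks the target about the
  // defining instruction.
  static bool isTriviallyRemat(int V) { return V % 2 == 0; }

  void scanRemattable() {
    ++Scans;
    for (int V : Values)
      if (isTriviallyRemat(V))
        Remattable.insert(V);
    ScannedRemattable = true;
  }

public:
  explicit RematCache(std::vector<int> Vs) : Values(std::move(Vs)) {}

  // Scans lazily on first use, then answers from the cached set.
  bool anyRematerializable() {
    if (!ScannedRemattable)
      scanRemattable();
    return !Remattable.empty();
  }

  int scanCount() const { return Scans; }
};
```

Because the predicate no longer takes an analysis pointer, repeated callers in this model cannot disagree about which analysis the cached result was computed with.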

/// allUsesAvailableAt - Return true if all registers used by OrigMI at		/// allUsesAvailableAt - Return true if all registers used by OrigMI at
/// OrigIdx are also available with the same value at UseIdx.		/// OrigIdx are also available with the same value at UseIdx.
bool LiveRangeEdit::allUsesAvailableAt(const MachineInstr *OrigMI,		bool LiveRangeEdit::allUsesAvailableAt(const MachineInstr *OrigMI,
SlotIndex OrigIdx,		SlotIndex OrigIdx,
SlotIndex UseIdx) const {		SlotIndex UseIdx) const {
▲ Show 20 Lines • Show All 160 Lines • ▼ Show 20 Lines	bool LiveRangeEdit::useIsKill(const LiveInterval &LI,
for (const LiveInterval::SubRange &S : LI.subranges()) {		for (const LiveInterval::SubRange &S : LI.subranges()) {
if ((S.LaneMask & LaneMask).any() && S.Query(Idx).isKill())		if ((S.LaneMask & LaneMask).any() && S.Query(Idx).isKill())
return true;		return true;
}		}
return false;		return false;
}		}

/// Find all live intervals that need to shrink, then remove the instruction.		/// Find all live intervals that need to shrink, then remove the instruction.
void LiveRangeEdit::eliminateDeadDef(MachineInstr *MI, ToShrinkSet &ToShrink,		void LiveRangeEdit::eliminateDeadDef(MachineInstr *MI, ToShrinkSet &ToShrink) {
AAResults *AA) {
assert(MI->allDefsAreDead() && "Def isn't really dead");		assert(MI->allDefsAreDead() && "Def isn't really dead");
SlotIndex Idx = LIS.getInstructionIndex(*MI).getRegSlot();		SlotIndex Idx = LIS.getInstructionIndex(*MI).getRegSlot();

// Never delete a bundled instruction.		// Never delete a bundled instruction.
if (MI->isBundled()) {		if (MI->isBundled()) {
return;		return;
}		}
// Never delete inline asm.		// Never delete inline asm.
▲ Show 20 Lines • Show All 92 Lines • ▼ Show 20 Lines	if (ReadsPhysRegs) {
// don't delete the inst. Replace the dest with a new reg, and keep		// don't delete the inst. Replace the dest with a new reg, and keep
// the inst for remat of other siblings. The inst is saved in		// the inst for remat of other siblings. The inst is saved in
// LiveRangeEdit::DeadRemats and will be deleted after all the		// LiveRangeEdit::DeadRemats and will be deleted after all the
// allocations of the func are done.		// allocations of the func are done.
// However, immediately delete instructions which have unshrunk virtual		// However, immediately delete instructions which have unshrunk virtual
// register uses. That may provoke RA to split an interval at the KILL		// register uses. That may provoke RA to split an interval at the KILL
// and later result in an invalid live segment end.		// and later result in an invalid live segment end.
if (isOrigDef && DeadRemats && !HasLiveVRegUses &&		if (isOrigDef && DeadRemats && !HasLiveVRegUses &&
TII.isTriviallyReMaterializable(*MI, AA)) {		TII.isTriviallyReMaterializable(*MI)) {
LiveInterval &NewLI = createEmptyIntervalFrom(Dest, false);		LiveInterval &NewLI = createEmptyIntervalFrom(Dest, false);
VNInfo *VNI = NewLI.getNextValue(Idx, LIS.getVNInfoAllocator());		VNInfo *VNI = NewLI.getNextValue(Idx, LIS.getVNInfoAllocator());
NewLI.addSegment(LiveInterval::Segment(Idx, Idx.getDeadSlot(), VNI));		NewLI.addSegment(LiveInterval::Segment(Idx, Idx.getDeadSlot(), VNI));
pop_back();		pop_back();
DeadRemats->insert(MI);		DeadRemats->insert(MI);
const TargetRegisterInfo &TRI = *MRI.getTargetRegisterInfo();		const TargetRegisterInfo &TRI = *MRI.getTargetRegisterInfo();
MI->substituteRegister(Dest, NewLI.reg(), 0, TRI);		MI->substituteRegister(Dest, NewLI.reg(), 0, TRI);
MI->getOperand(0).setIsDead(true);		MI->getOperand(0).setIsDead(true);
Show All 13 Lines	for (unsigned i = 0, e = RegsToErase.size(); i != e; ++i) {
if (LIS.hasInterval(Reg) && MRI.reg_nodbg_empty(Reg)) {		if (LIS.hasInterval(Reg) && MRI.reg_nodbg_empty(Reg)) {
ToShrink.remove(&LIS.getInterval(Reg));		ToShrink.remove(&LIS.getInterval(Reg));
eraseVirtReg(Reg);		eraseVirtReg(Reg);
}		}
}		}
}		}

void LiveRangeEdit::eliminateDeadDefs(SmallVectorImpl<MachineInstr *> &Dead,		void LiveRangeEdit::eliminateDeadDefs(SmallVectorImpl<MachineInstr *> &Dead,
ArrayRef<Register> RegsBeingSpilled,		ArrayRef<Register> RegsBeingSpilled) {
AAResults *AA) {
ToShrinkSet ToShrink;		ToShrinkSet ToShrink;

for (;;) {		for (;;) {
// Erase all dead defs.		// Erase all dead defs.
while (!Dead.empty())		while (!Dead.empty())
eliminateDeadDef(Dead.pop_back_val(), ToShrink, AA);		eliminateDeadDef(Dead.pop_back_val(), ToShrink);

if (ToShrink.empty())		if (ToShrink.empty())
break;		break;

// Shrink just one live interval. Then delete new dead defs.		// Shrink just one live interval. Then delete new dead defs.
LiveInterval *LI = ToShrink.pop_back_val();		LiveInterval *LI = ToShrink.pop_back_val();
if (foldAsLoad(LI, Dead))		if (foldAsLoad(LI, Dead))
continue;		continue;

llvm/lib/CodeGen/MLRegallocEvictAdvisor.cpp

//===- MLRegAllocEvictAdvisor.cpp - ML eviction advisor -------------------===//		//===- MLRegAllocEvictAdvisor.cpp - ML eviction advisor -------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// Implementation of the ML eviction advisor and reward injection pass		// Implementation of the ML eviction advisor and reward injection pass
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "AllocationOrder.h"		#include "AllocationOrder.h"
#include "RegAllocEvictionAdvisor.h"		#include "RegAllocEvictionAdvisor.h"
#include "RegAllocGreedy.h"		#include "RegAllocGreedy.h"
#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/MLModelRunner.h"		#include "llvm/Analysis/MLModelRunner.h"
#include "llvm/Analysis/TensorSpec.h"		#include "llvm/Analysis/TensorSpec.h"
#if defined(LLVM_HAVE_TF_AOT_REGALLOCEVICTMODEL) || defined(LLVM_HAVE_TF_API)		#if defined(LLVM_HAVE_TF_AOT_REGALLOCEVICTMODEL) || defined(LLVM_HAVE_TF_API)
#include "llvm/Analysis/ModelUnderTrainingRunner.h"		#include "llvm/Analysis/ModelUnderTrainingRunner.h"
#include "llvm/Analysis/NoInferenceModelRunner.h"		#include "llvm/Analysis/NoInferenceModelRunner.h"
#endif		#endif
#include "llvm/Analysis/ReleaseModeModelRunner.h"		#include "llvm/Analysis/ReleaseModeModelRunner.h"
#include "llvm/CodeGen/CalcSpillWeights.h"		#include "llvm/CodeGen/CalcSpillWeights.h"
#include "llvm/CodeGen/LiveRegMatrix.h"		#include "llvm/CodeGen/LiveRegMatrix.h"
#include "llvm/CodeGen/MachineBlockFrequencyInfo.h"		#include "llvm/CodeGen/MachineBlockFrequencyInfo.h"
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
▲ Show 20 Lines • Show All 58 Lines • ▼ Show 20 Lines	StringRef getPassName() const override {
return "Register Allocation Pass Scoring";		return "Register Allocation Pass Scoring";
}		}

/// RegAllocReward analysis usage.		/// RegAllocReward analysis usage.
void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.setPreservesAll();		AU.setPreservesAll();
AU.addRequired<RegAllocEvictionAdvisorAnalysis>();		AU.addRequired<RegAllocEvictionAdvisorAnalysis>();
AU.addRequired<MachineBlockFrequencyInfo>();		AU.addRequired<MachineBlockFrequencyInfo>();
AU.addRequired<AAResultsWrapperPass>();
MachineFunctionPass::getAnalysisUsage(AU);		MachineFunctionPass::getAnalysisUsage(AU);
}		}

/// Performs this pass		/// Performs this pass
bool runOnMachineFunction(MachineFunction &) override;		bool runOnMachineFunction(MachineFunction &) override;
};		};

char RegAllocScoring::ID = 0;		char RegAllocScoring::ID = 0;
▲ Show 20 Lines • Show All 779 Lines • ▼ Show 20 Lines	for (size_t I = 1; I < MUTR->outputLoggedFeatureSpecs().size();
reinterpret_cast<const char *>(		reinterpret_cast<const char *>(
MUTR->lastEvaluationResult()->getUntypedTensorValue(I)));		MUTR->lastEvaluationResult()->getUntypedTensorValue(I)));
// The output is right after the features and the extra outputs		// The output is right after the features and the extra outputs
Log->logInt64Value(CurrentFeature, &Ret);		Log->logInt64Value(CurrentFeature, &Ret);
return Ret;		return Ret;
}		}

bool RegAllocScoring::runOnMachineFunction(MachineFunction &MF) {		bool RegAllocScoring::runOnMachineFunction(MachineFunction &MF) {
if (auto *DevModeAnalysis = dyn_cast<DevelopmentModeEvictionAdvisorAnalysis>(		if (auto *DevModeAnalysis = dyn_cast<DevelopmentModeEvictionAdvisorAnalysis>(
		mtrofin (inline comment, marked Done): I think this isn't needed here.
&getAnalysis<RegAllocEvictionAdvisorAnalysis>()))		&getAnalysis<RegAllocEvictionAdvisorAnalysis>()))
if (auto *Log = DevModeAnalysis->getLogger(MF))		if (auto *Log = DevModeAnalysis->getLogger(MF))
Log->logFloatFinalReward(static_cast<float>(		Log->logFloatFinalReward(static_cast<float>(
calculateRegAllocScore(		calculateRegAllocScore(MF, getAnalysis<MachineBlockFrequencyInfo>())
		mtrofin (inline comment, marked Done): this can be removed, since calculateRegAllocScore now doesn't need the AA
MF, getAnalysis<MachineBlockFrequencyInfo>(),
getAnalysis<AAResultsWrapperPass>().getAAResults())
.getScore()));		.getScore()));

return false;		return false;
}		}
#endif // #ifdef LLVM_HAVE_TF_API		#endif // #ifdef LLVM_HAVE_TF_API

RegAllocEvictionAdvisorAnalysis *llvm::createReleaseModeAdvisor() {		RegAllocEvictionAdvisorAnalysis *llvm::createReleaseModeAdvisor() {
return new ReleaseModeEvictionAdvisorAnalysis();		return new ReleaseModeEvictionAdvisorAnalysis();
}		}

// In all cases except development mode, we don't need scoring.		// In all cases except development mode, we don't need scoring.
#if !defined(LLVM_HAVE_TF_API)		#if !defined(LLVM_HAVE_TF_API)
bool RegAllocScoring::runOnMachineFunction(MachineFunction &) { return false; }		bool RegAllocScoring::runOnMachineFunction(MachineFunction &) { return false; }
#endif		#endif

llvm/lib/CodeGen/MachineCSE.cpp

Show First 20 Lines • Show All 409 Lines • ▼ Show 20 Lines	bool MachineCSE::isCSECandidate(MachineInstr *MI) {
if (MI->mayStore() || MI->isCall() || MI->isTerminator() ||		if (MI->mayStore() || MI->isCall() || MI->isTerminator() ||
MI->mayRaiseFPException() || MI->hasUnmodeledSideEffects())		MI->mayRaiseFPException() || MI->hasUnmodeledSideEffects())
return false;		return false;

if (MI->mayLoad()) {		if (MI->mayLoad()) {
// Okay, this instruction does a load. As a refinement, we allow the target		// Okay, this instruction does a load. As a refinement, we allow the target
// to decide whether the loaded value is actually a constant. If so, we can		// to decide whether the loaded value is actually a constant. If so, we can
// actually use it as a load.		// actually use it as a load.
if (!MI->isDereferenceableInvariantLoad(AA))		if (!MI->isDereferenceableInvariantLoad())
// FIXME: we should be able to hoist loads with no other side effects if		// FIXME: we should be able to hoist loads with no other side effects if
// there are no other instructions which can change memory in this loop.		// there are no other instructions which can change memory in this loop.
// This is a trivial form of alias analysis.		// This is a trivial form of alias analysis.
return false;		return false;
}		}

// Ignore stack guard loads, otherwise the register that holds CSEed value may		// Ignore stack guard loads, otherwise the register that holds CSEed value may
// be spilled and get loaded back with corrupted data.		// be spilled and get loaded back with corrupted data.

llvm/lib/CodeGen/MachineInstr.cpp

Show First 20 Lines • Show All 1,197 Lines • ▼ Show 20 Lines	if (isPosition() || isDebugInstr() || isTerminator() ||
mayRaiseFPException() || hasUnmodeledSideEffects())		mayRaiseFPException() || hasUnmodeledSideEffects())
return false;		return false;

// See if this instruction does a load. If so, we have to guarantee that the		// See if this instruction does a load. If so, we have to guarantee that the
// loaded value doesn't change between the load and the its intended		// loaded value doesn't change between the load and the its intended
// destination. The check for isInvariantLoad gives the target the chance to		// destination. The check for isInvariantLoad gives the target the chance to
// classify the load as always returning a constant, e.g. a constant pool		// classify the load as always returning a constant, e.g. a constant pool
// load.		// load.
if (mayLoad() && !isDereferenceableInvariantLoad(AA))		if (mayLoad() && !isDereferenceableInvariantLoad())
// Otherwise, this is a real load. If there is a store between the load and		// Otherwise, this is a real load. If there is a store between the load and
// end of block, we can't move it.		// end of block, we can't move it.
return !SawStore;		return !SawStore;

return true;		return true;
}		}

static bool MemOperandsHaveAlias(const MachineFrameInfo &MFI, AAResults *AA,		static bool MemOperandsHaveAlias(const MachineFrameInfo &MFI, AAResults *AA,
▲ Show 20 Lines • Show All 128 Lines • ▼ Show 20 Lines	bool MachineInstr::hasOrderedMemoryRef() const {
return llvm::any_of(memoperands(), [](const MachineMemOperand *MMO) {		return llvm::any_of(memoperands(), [](const MachineMemOperand *MMO) {
return !MMO->isUnordered();		return !MMO->isUnordered();
});		});
}		}

/// isDereferenceableInvariantLoad - Return true if this instruction will never		/// isDereferenceableInvariantLoad - Return true if this instruction will never
/// trap and is loading from a location whose value is invariant across a run of		/// trap and is loading from a location whose value is invariant across a run of
/// this function.		/// this function.
bool MachineInstr::isDereferenceableInvariantLoad(AAResults *AA) const {		bool MachineInstr::isDereferenceableInvariantLoad() const {
// If the instruction doesn't load at all, it isn't an invariant load.		// If the instruction doesn't load at all, it isn't an invariant load.
if (!mayLoad())		if (!mayLoad())
return false;		return false;

// If the instruction has lost its memoperands, conservatively assume that		// If the instruction has lost its memoperands, conservatively assume that
// it may not be an invariant load.		// it may not be an invariant load.
if (memoperands_empty())		if (memoperands_empty())
return false;		return false;
Show All 9 Lines	for (MachineMemOperand *MMO : memoperands()) {
if (MMO->isStore()) return false;		if (MMO->isStore()) return false;
if (MMO->isInvariant() && MMO->isDereferenceable())		if (MMO->isInvariant() && MMO->isDereferenceable())
continue;		continue;

// A load from a constant PseudoSourceValue is invariant.		// A load from a constant PseudoSourceValue is invariant.
if (const PseudoSourceValue *PSV = MMO->getPseudoValue()) {		if (const PseudoSourceValue *PSV = MMO->getPseudoValue()) {
if (PSV->isConstant(&MFI))		if (PSV->isConstant(&MFI))
continue;		continue;
} else if (const Value *V = MMO->getValue()) {
// If we have an AliasAnalysis, ask it whether the memory is constant.
if (AA &&
AA->pointsToConstantMemory(
MemoryLocation(V, MMO->getSize(), MMO->getAAInfo())))
continue;
}		}

// Otherwise assume conservatively.		// Otherwise assume conservatively.
return false;		return false;
}		}

// Everything checks out.		// Everything checks out.
return true;		return true;
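
Editor's note: the decision logic that remains in isDereferenceableInvariantLoad() after this change can be modelled in isolation. The following is a minimal sketch, not LLVM code: MemOperandModel and isDereferenceableInvariantLoadModel are hypothetical names introduced only for illustration, and the IsConstantPseudoValue flag stands in for the PSV->isConstant(&MFI) query shown in the hunk above. It assumes the elided lines of the function do not add further conditions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for MachineMemOperand: only the properties that the
// post-patch check reads are modelled here.
struct MemOperandModel {
  bool IsStore = false;
  bool IsInvariant = false;
  bool IsDereferenceable = false;
  // Models "the operand is a PseudoSourceValue and PSV->isConstant(&MFI)".
  bool IsConstantPseudoValue = false;
};

// Sketch of the flag-only check: no alias analysis query is made anymore, so
// a load qualifies only if every memory operand either carries both the
// invariant and dereferenceable flags or refers to a constant pseudo value.
bool isDereferenceableInvariantLoadModel(
    bool MayLoad, const std::vector<MemOperandModel> &MemOperands) {
  // An instruction that does not load cannot be an invariant load.
  if (!MayLoad)
    return false;

  // Missing memory operands: conservatively assume it is not invariant.
  if (MemOperands.empty())
    return false;

  for (const MemOperandModel &MMO : MemOperands) {
    if (MMO.IsStore)
      return false;
    if (MMO.IsInvariant && MMO.IsDereferenceable)
      continue;
    if (MMO.IsConstantPseudoValue)
      continue;
    // Previously an AA->pointsToConstantMemory() query could still rescue
    // this operand; after the patch the answer is conservatively "no".
    return false;
  }
  return true;
}
```

In this model, an operand with only IsInvariant set (and not IsDereferenceable) is rejected, which is why the summary says the flags have to be set upfront from the AA query rather than recovered lazily at this point.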

llvm/lib/CodeGen/MachineLICM.cpp

Show First 20 Lines • Show All 224 Lines • ▼ Show 20 Lines	bool CanCauseHighRegPressure(const DenseMap<unsigned, int> &Cost,
bool Cheap);		bool Cheap);

void UpdateBackTraceRegPressure(const MachineInstr *MI);		void UpdateBackTraceRegPressure(const MachineInstr *MI);

bool IsProfitableToHoist(MachineInstr &MI);		bool IsProfitableToHoist(MachineInstr &MI);

bool IsGuaranteedToExecute(MachineBasicBlock *BB);		bool IsGuaranteedToExecute(MachineBasicBlock *BB);

bool isTriviallyReMaterializable(const MachineInstr &MI,		bool isTriviallyReMaterializable(const MachineInstr &MI) const;
AAResults *AA) const;

void EnterScope(MachineBasicBlock *MBB);		void EnterScope(MachineBasicBlock *MBB);

void ExitScope(MachineBasicBlock *MBB);		void ExitScope(MachineBasicBlock *MBB);

void ExitScopeIfDone(		void ExitScopeIfDone(
MachineDomTreeNode *Node,		MachineDomTreeNode *Node,
DenseMap<MachineDomTreeNode *, unsigned> &OpenChildren,		DenseMap<MachineDomTreeNode *, unsigned> &OpenChildren,
▲ Show 20 Lines • Show All 418 Lines • ▼ Show 20 Lines	bool MachineLICMBase::IsGuaranteedToExecute(MachineBasicBlock *BB) {
SpeculationState = SpeculateFalse;		SpeculationState = SpeculateFalse;
return true;		return true;
}		}

/// Check if \p MI is trivially remateralizable and if it does not have any		/// Check if \p MI is trivially remateralizable and if it does not have any
/// virtual register uses. Even though rematerializable RA might not actually		/// virtual register uses. Even though rematerializable RA might not actually
/// rematerialize it in this scenario. In that case we do not want to hoist such		/// rematerialize it in this scenario. In that case we do not want to hoist such
/// instruction out of the loop in a belief RA will sink it back if needed.		/// instruction out of the loop in a belief RA will sink it back if needed.
bool MachineLICMBase::isTriviallyReMaterializable(const MachineInstr &MI,		bool MachineLICMBase::isTriviallyReMaterializable(
AAResults *AA) const {		const MachineInstr &MI) const {
if (!TII->isTriviallyReMaterializable(MI, AA))		if (!TII->isTriviallyReMaterializable(MI))
return false;		return false;

for (const MachineOperand &MO : MI.operands()) {		for (const MachineOperand &MO : MI.operands()) {
if (MO.isReg() && MO.isUse() && MO.getReg().isVirtual())		if (MO.isReg() && MO.isUse() && MO.getReg().isVirtual())
return false;		return false;
}		}

return true;		return true;
▲ Show 20 Lines • Show All 489 Lines • ▼ Show 20 Lines	bool MachineLICMBase::IsProfitableToHoist(MachineInstr &MI) {
// Don't hoist a cheap instruction if it would create a copy in the loop.		// Don't hoist a cheap instruction if it would create a copy in the loop.
if (CheapInstr && CreatesCopy) {		if (CheapInstr && CreatesCopy) {
LLVM_DEBUG(dbgs() << "Won't hoist cheap instr with loop PHI use: " << MI);		LLVM_DEBUG(dbgs() << "Won't hoist cheap instr with loop PHI use: " << MI);
return false;		return false;
}		}

// Rematerializable instructions should always be hoisted providing the		// Rematerializable instructions should always be hoisted providing the
// register allocator can just pull them down again when needed.		// register allocator can just pull them down again when needed.
if (isTriviallyReMaterializable(MI, AA))		if (isTriviallyReMaterializable(MI))
return true;		return true;

// FIXME: If there are long latency loop-invariant instructions inside the		// FIXME: If there are long latency loop-invariant instructions inside the
// loop at this point, why didn't the optimizer's LICM hoist them?		// loop at this point, why didn't the optimizer's LICM hoist them?
for (unsigned i = 0, e = MI.getDesc().getNumOperands(); i != e; ++i) {		for (unsigned i = 0, e = MI.getDesc().getNumOperands(); i != e; ++i) {
const MachineOperand &MO = MI.getOperand(i);		const MachineOperand &MO = MI.getOperand(i);
if (!MO.isReg() || MO.isImplicit())		if (!MO.isReg() || MO.isImplicit())
continue;		continue;
Show All 36 Lines	bool MachineLICMBase::IsProfitableToHoist(MachineInstr &MI) {
if (AvoidSpeculation &&		if (AvoidSpeculation &&
(!IsGuaranteedToExecute(MI.getParent()) && !MayCSE(&MI))) {		(!IsGuaranteedToExecute(MI.getParent()) && !MayCSE(&MI))) {
LLVM_DEBUG(dbgs() << "Won't speculate: " << MI);		LLVM_DEBUG(dbgs() << "Won't speculate: " << MI);
return false;		return false;
}		}

// High register pressure situation, only hoist if the instruction is going		// High register pressure situation, only hoist if the instruction is going
// to be remat'ed.		// to be remat'ed.
if (!isTriviallyReMaterializable(MI, AA) &&		if (!isTriviallyReMaterializable(MI) &&
!MI.isDereferenceableInvariantLoad(AA)) {		!MI.isDereferenceableInvariantLoad()) {
LLVM_DEBUG(dbgs() << "Can't remat / high reg-pressure: " << MI);		LLVM_DEBUG(dbgs() << "Can't remat / high reg-pressure: " << MI);
return false;		return false;
}		}

return true;		return true;
}		}

/// Unfold a load from the given machineinstr if the load itself could be		/// Unfold a load from the given machineinstr if the load itself could be
/// hoisted. Return the unfolded and hoistable load, or null if the load		/// hoisted. Return the unfolded and hoistable load, or null if the load
/// couldn't be unfolded or if it wouldn't be hoistable.		/// couldn't be unfolded or if it wouldn't be hoistable.
MachineInstr *MachineLICMBase::ExtractHoistableLoad(MachineInstr *MI) {		MachineInstr *MachineLICMBase::ExtractHoistableLoad(MachineInstr *MI) {
// Don't unfold simple loads.		// Don't unfold simple loads.
if (MI->canFoldAsLoad())		if (MI->canFoldAsLoad())
return nullptr;		return nullptr;

// If not, we may be able to unfold a load and hoist that.		// If not, we may be able to unfold a load and hoist that.
// First test whether the instruction is loading from an amenable		// First test whether the instruction is loading from an amenable
// memory location.		// memory location.
if (!MI->isDereferenceableInvariantLoad(AA))		if (!MI->isDereferenceableInvariantLoad())
return nullptr;		return nullptr;

// Next determine the register class for a temporary register.		// Next determine the register class for a temporary register.
unsigned LoadRegIndex;		unsigned LoadRegIndex;
unsigned NewOpc =		unsigned NewOpc =
TII->getOpcodeAfterMemoryUnfold(MI->getOpcode(),		TII->getOpcodeAfterMemoryUnfold(MI->getOpcode(),
/*UnfoldLoad=*/true,		/*UnfoldLoad=*/true,
/*UnfoldStore=*/false,		/*UnfoldStore=*/false,

llvm/lib/CodeGen/MachinePipeliner.cpp

Show First 20 Lines • Show All 700 Lines • ▼ Show 20 Lines	for (auto &SI : SU->Succs) {
}		}
}		}
}		}
return false;		return false;
}		}

/// Return true if the instruction causes a chain between memory		/// Return true if the instruction causes a chain between memory
/// references before and after it.		/// references before and after it.
static bool isDependenceBarrier(MachineInstr &MI, AliasAnalysis *AA) {		static bool isDependenceBarrier(MachineInstr &MI) {
return MI.isCall() || MI.mayRaiseFPException() ||		return MI.isCall() || MI.mayRaiseFPException() ||
MI.hasUnmodeledSideEffects() ||		MI.hasUnmodeledSideEffects() ||
(MI.hasOrderedMemoryRef() &&		(MI.hasOrderedMemoryRef() &&
(!MI.mayLoad() || !MI.isDereferenceableInvariantLoad(AA)));		(!MI.mayLoad() || !MI.isDereferenceableInvariantLoad()));
}		}

/// Return the underlying objects for the memory references of an instruction.		/// Return the underlying objects for the memory references of an instruction.
/// This function calls the code in ValueTracking, but first checks that the		/// This function calls the code in ValueTracking, but first checks that the
/// instruction has a memory operand.		/// instruction has a memory operand.
static void getUnderlyingObjects(const MachineInstr *MI,		static void getUnderlyingObjects(const MachineInstr *MI,
SmallVectorImpl<const Value *> &Objs) {		SmallVectorImpl<const Value *> &Objs) {
if (!MI->hasOneMemOperand())		if (!MI->hasOneMemOperand())
Show All 16 Lines
/// dependence. This code is very similar to the code in ScheduleDAGInstrs		/// dependence. This code is very similar to the code in ScheduleDAGInstrs
/// but that code doesn't create loop carried dependences.		/// but that code doesn't create loop carried dependences.
void SwingSchedulerDAG::addLoopCarriedDependences(AliasAnalysis *AA) {		void SwingSchedulerDAG::addLoopCarriedDependences(AliasAnalysis *AA) {
MapVector<const Value *, SmallVector<SUnit *, 4>> PendingLoads;		MapVector<const Value *, SmallVector<SUnit *, 4>> PendingLoads;
Value *UnknownValue =		Value *UnknownValue =
UndefValue::get(Type::getVoidTy(MF.getFunction().getContext()));		UndefValue::get(Type::getVoidTy(MF.getFunction().getContext()));
for (auto &SU : SUnits) {		for (auto &SU : SUnits) {
MachineInstr &MI = *SU.getInstr();		MachineInstr &MI = *SU.getInstr();
if (isDependenceBarrier(MI, AA))		if (isDependenceBarrier(MI))
PendingLoads.clear();		PendingLoads.clear();
else if (MI.mayLoad()) {		else if (MI.mayLoad()) {
SmallVector<const Value *, 4> Objs;		SmallVector<const Value *, 4> Objs;
::getUnderlyingObjects(&MI, Objs);		::getUnderlyingObjects(&MI, Objs);
if (Objs.empty())		if (Objs.empty())
Objs.push_back(UnknownValue);		Objs.push_back(UnknownValue);
for (auto V : Objs) {		for (auto V : Objs) {
SmallVector<SUnit *, 4> &SUs = PendingLoads[V];		SmallVector<SUnit *, 4> &SUs = PendingLoads[V];

llvm/lib/CodeGen/RegAllocBasic.cpp

	Show First 20 Lines • Show All 129 Lines • ▼ Show 20 Lines
	INITIALIZE_PASS_BEGIN(RABasic, "regallocbasic", "Basic Register Allocator",			INITIALIZE_PASS_BEGIN(RABasic, "regallocbasic", "Basic Register Allocator",
	false, false)			false, false)
	INITIALIZE_PASS_DEPENDENCY(LiveDebugVariables)			INITIALIZE_PASS_DEPENDENCY(LiveDebugVariables)
	INITIALIZE_PASS_DEPENDENCY(SlotIndexes)			INITIALIZE_PASS_DEPENDENCY(SlotIndexes)
	INITIALIZE_PASS_DEPENDENCY(LiveIntervals)			INITIALIZE_PASS_DEPENDENCY(LiveIntervals)
	INITIALIZE_PASS_DEPENDENCY(RegisterCoalescer)			INITIALIZE_PASS_DEPENDENCY(RegisterCoalescer)
	INITIALIZE_PASS_DEPENDENCY(MachineScheduler)			INITIALIZE_PASS_DEPENDENCY(MachineScheduler)
	INITIALIZE_PASS_DEPENDENCY(LiveStacks)			INITIALIZE_PASS_DEPENDENCY(LiveStacks)
				INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)
	INITIALIZE_PASS_DEPENDENCY(MachineDominatorTree)			INITIALIZE_PASS_DEPENDENCY(MachineDominatorTree)
	INITIALIZE_PASS_DEPENDENCY(MachineLoopInfo)			INITIALIZE_PASS_DEPENDENCY(MachineLoopInfo)
	INITIALIZE_PASS_DEPENDENCY(VirtRegMap)			INITIALIZE_PASS_DEPENDENCY(VirtRegMap)
	INITIALIZE_PASS_DEPENDENCY(LiveRegMatrix)			INITIALIZE_PASS_DEPENDENCY(LiveRegMatrix)
	INITIALIZE_PASS_END(RABasic, "regallocbasic", "Basic Register Allocator", false,			INITIALIZE_PASS_END(RABasic, "regallocbasic", "Basic Register Allocator", false,
	false)			false)

	bool RABasic::LRE_CanEraseVirtReg(Register VirtReg) {			bool RABasic::LRE_CanEraseVirtReg(Register VirtReg) {

llvm/lib/CodeGen/RegAllocGreedy.h

Show All 19 Lines
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/BitVector.h"		#include "llvm/ADT/BitVector.h"
#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/IndexedMap.h"		#include "llvm/ADT/IndexedMap.h"
#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/CodeGen/CalcSpillWeights.h"		#include "llvm/CodeGen/CalcSpillWeights.h"
#include "llvm/CodeGen/LiveInterval.h"		#include "llvm/CodeGen/LiveInterval.h"
#include "llvm/CodeGen/LiveRangeEdit.h"		#include "llvm/CodeGen/LiveRangeEdit.h"
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineFunctionPass.h"		#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/RegisterClassInfo.h"		#include "llvm/CodeGen/RegisterClassInfo.h"
#include "llvm/CodeGen/Spiller.h"		#include "llvm/CodeGen/Spiller.h"
#include "llvm/CodeGen/TargetRegisterInfo.h"		#include "llvm/CodeGen/TargetRegisterInfo.h"
▲ Show 20 Lines • Show All 132 Lines • ▼ Show 20 Lines	private:
SlotIndexes *Indexes;		SlotIndexes *Indexes;
MachineBlockFrequencyInfo *MBFI;		MachineBlockFrequencyInfo *MBFI;
MachineDominatorTree *DomTree;		MachineDominatorTree *DomTree;
MachineLoopInfo *Loops;		MachineLoopInfo *Loops;
MachineOptimizationRemarkEmitter *ORE;		MachineOptimizationRemarkEmitter *ORE;
EdgeBundles *Bundles;		EdgeBundles *Bundles;
SpillPlacement *SpillPlacer;		SpillPlacement *SpillPlacer;
LiveDebugVariables *DebugVars;		LiveDebugVariables *DebugVars;
AliasAnalysis *AA;

// state		// state
std::unique_ptr<Spiller> SpillerInstance;		std::unique_ptr<Spiller> SpillerInstance;
PQueue Queue;		PQueue Queue;
std::unique_ptr<VirtRegAuxInfo> VRAI;		std::unique_ptr<VirtRegAuxInfo> VRAI;
Optional<ExtraRegInfo> ExtraInfo;		Optional<ExtraRegInfo> ExtraInfo;
std::unique_ptr<RegAllocEvictionAdvisor> EvictAdvisor;		std::unique_ptr<RegAllocEvictionAdvisor> EvictAdvisor;


llvm/lib/CodeGen/RegAllocGreedy.cpp

Show First 20 Lines • Show All 196 Lines • ▼ Show 20 Lines	RAGreedy::RAGreedy(RegClassFilterFunc F):
MachineFunctionPass(ID),		MachineFunctionPass(ID),
RegAllocBase(F) {		RegAllocBase(F) {
}		}

void RAGreedy::getAnalysisUsage(AnalysisUsage &AU) const {		void RAGreedy::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesCFG();		AU.setPreservesCFG();
AU.addRequired<MachineBlockFrequencyInfo>();		AU.addRequired<MachineBlockFrequencyInfo>();
AU.addPreserved<MachineBlockFrequencyInfo>();		AU.addPreserved<MachineBlockFrequencyInfo>();
AU.addRequired<AAResultsWrapperPass>();
AU.addPreserved<AAResultsWrapperPass>();
AU.addRequired<LiveIntervals>();		AU.addRequired<LiveIntervals>();
AU.addPreserved<LiveIntervals>();		AU.addPreserved<LiveIntervals>();
AU.addRequired<SlotIndexes>();		AU.addRequired<SlotIndexes>();
AU.addPreserved<SlotIndexes>();		AU.addPreserved<SlotIndexes>();
AU.addRequired<LiveDebugVariables>();		AU.addRequired<LiveDebugVariables>();
AU.addPreserved<LiveDebugVariables>();		AU.addPreserved<LiveDebugVariables>();
AU.addRequired<LiveStacks>();		AU.addRequired<LiveStacks>();
AU.addPreserved<LiveStacks>();		AU.addPreserved<LiveStacks>();
▲ Show 20 Lines • Show All 2,289 Lines • ▼ Show 20 Lines	bool RAGreedy::runOnMachineFunction(MachineFunction &mf) {
Indexes = &getAnalysis<SlotIndexes>();		Indexes = &getAnalysis<SlotIndexes>();
MBFI = &getAnalysis<MachineBlockFrequencyInfo>();		MBFI = &getAnalysis<MachineBlockFrequencyInfo>();
DomTree = &getAnalysis<MachineDominatorTree>();		DomTree = &getAnalysis<MachineDominatorTree>();
ORE = &getAnalysis<MachineOptimizationRemarkEmitterPass>().getORE();		ORE = &getAnalysis<MachineOptimizationRemarkEmitterPass>().getORE();
Loops = &getAnalysis<MachineLoopInfo>();		Loops = &getAnalysis<MachineLoopInfo>();
Bundles = &getAnalysis<EdgeBundles>();		Bundles = &getAnalysis<EdgeBundles>();
SpillPlacer = &getAnalysis<SpillPlacement>();		SpillPlacer = &getAnalysis<SpillPlacement>();
DebugVars = &getAnalysis<LiveDebugVariables>();		DebugVars = &getAnalysis<LiveDebugVariables>();
AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();

initializeCSRCost();		initializeCSRCost();

RegCosts = TRI->getRegisterCosts(*MF);		RegCosts = TRI->getRegisterCosts(*MF);
RegClassPriorityTrumpsGlobalness =		RegClassPriorityTrumpsGlobalness =
GreedyRegClassPriorityTrumpsGlobalness.getNumOccurrences()		GreedyRegClassPriorityTrumpsGlobalness.getNumOccurrences()
? GreedyRegClassPriorityTrumpsGlobalness		? GreedyRegClassPriorityTrumpsGlobalness
: TRI->regClassPriorityTrumpsGlobalness(*MF);		: TRI->regClassPriorityTrumpsGlobalness(*MF);

ExtraInfo.emplace();		ExtraInfo.emplace();
EvictAdvisor =		EvictAdvisor =
getAnalysis<RegAllocEvictionAdvisorAnalysis>().getAdvisor(*MF, *this);		getAnalysis<RegAllocEvictionAdvisorAnalysis>().getAdvisor(*MF, *this);

VRAI = std::make_unique<VirtRegAuxInfo>(*MF, *LIS, *VRM, *Loops, *MBFI);		VRAI = std::make_unique<VirtRegAuxInfo>(*MF, *LIS, *VRM, *Loops, *MBFI);
SpillerInstance.reset(createInlineSpiller(*this, *MF, *VRM, *VRAI));		SpillerInstance.reset(createInlineSpiller(*this, *MF, *VRM, *VRAI));

VRAI->calculateSpillWeightsAndHints();		VRAI->calculateSpillWeightsAndHints();

LLVM_DEBUG(LIS->dump());		LLVM_DEBUG(LIS->dump());

SA.reset(new SplitAnalysis(*VRM, *LIS, *Loops));		SA.reset(new SplitAnalysis(*VRM, *LIS, *Loops));
SE.reset(new SplitEditor(*SA, *AA, *LIS, *VRM, *DomTree, *MBFI, *VRAI));		SE.reset(new SplitEditor(*SA, *LIS, *VRM, *DomTree, *MBFI, *VRAI));

IntfCache.init(MF, Matrix->getLiveUnions(), Indexes, LIS, TRI);		IntfCache.init(MF, Matrix->getLiveUnions(), Indexes, LIS, TRI);
GlobalCand.resize(32); // This will grow as needed.		GlobalCand.resize(32); // This will grow as needed.
SetOfBrokenHints.clear();		SetOfBrokenHints.clear();

allocatePhysRegs();		allocatePhysRegs();
tryHintsRecoloring();		tryHintsRecoloring();

if (VerifyEnabled)		if (VerifyEnabled)
MF->verify(this, "Before post optimization");		MF->verify(this, "Before post optimization");
postOptimization();		postOptimization();
reportStats();		reportStats();

releaseMemory();		releaseMemory();
return true;		return true;
}		}

llvm/lib/CodeGen/RegAllocScore.h

Show All 13 Lines

#ifndef LLVM_CODEGEN_REGALLOCSCORE_H_		#ifndef LLVM_CODEGEN_REGALLOCSCORE_H_
#define LLVM_CODEGEN_REGALLOCSCORE_H_		#define LLVM_CODEGEN_REGALLOCSCORE_H_

#include "llvm/ADT/STLFunctionalExtras.h"		#include "llvm/ADT/STLFunctionalExtras.h"

namespace llvm {		namespace llvm {

class AAResults;
class MachineBasicBlock;		class MachineBasicBlock;
class MachineBlockFrequencyInfo;		class MachineBlockFrequencyInfo;
class MachineFunction;		class MachineFunction;
class MachineInstr;		class MachineInstr;

/// Regalloc score.		/// Regalloc score.
class RegAllocScore final {		class RegAllocScore final {
double CopyCounts = 0.0;		double CopyCounts = 0.0;
Show All 26 Lines	public:
bool operator!=(const RegAllocScore &Other) const;		bool operator!=(const RegAllocScore &Other) const;
double getScore() const;		double getScore() const;
};		};

/// Calculate a score. When comparing 2 scores for the same function but		/// Calculate a score. When comparing 2 scores for the same function but
/// different policies, the better policy would have a smaller score.		/// different policies, the better policy would have a smaller score.
/// The implementation is the overload below (which is also easily unittestable)		/// The implementation is the overload below (which is also easily unittestable)
RegAllocScore calculateRegAllocScore(const MachineFunction &MF,		RegAllocScore calculateRegAllocScore(const MachineFunction &MF,
const MachineBlockFrequencyInfo &MBFI,		const MachineBlockFrequencyInfo &MBFI);
AAResults &AAResults);

/// Implementation of the above, which is also more easily unittestable.		/// Implementation of the above, which is also more easily unittestable.
RegAllocScore calculateRegAllocScore(		RegAllocScore calculateRegAllocScore(
const MachineFunction &MF,		const MachineFunction &MF,
llvm::function_ref<double(const MachineBasicBlock &)> GetBBFreq,		llvm::function_ref<double(const MachineBasicBlock &)> GetBBFreq,
llvm::function_ref<bool(const MachineInstr &)> IsTriviallyRematerializable);		llvm::function_ref<bool(const MachineInstr &)> IsTriviallyRematerializable);
} // end namespace llvm		} // end namespace llvm

#endif // LLVM_CODEGEN_REGALLOCSCORE_H_		#endif // LLVM_CODEGEN_REGALLOCSCORE_H_

llvm/lib/CodeGen/RegAllocScore.cpp

Show First 20 Lines • Show All 68 Lines • ▼ Show 20 Lines	double RegAllocScore::getScore() const {
Ret += CheapRematWeight * cheapRematCounts();		Ret += CheapRematWeight * cheapRematCounts();
Ret += ExpensiveRematWeight * expensiveRematCounts();		Ret += ExpensiveRematWeight * expensiveRematCounts();

return Ret;		return Ret;
}		}

RegAllocScore		RegAllocScore
llvm::calculateRegAllocScore(const MachineFunction &MF,		llvm::calculateRegAllocScore(const MachineFunction &MF,
const MachineBlockFrequencyInfo &MBFI,		const MachineBlockFrequencyInfo &MBFI) {
AAResults &AAResults) {
return calculateRegAllocScore(		return calculateRegAllocScore(
MF,		MF,
[&](const MachineBasicBlock &MBB) {		[&](const MachineBasicBlock &MBB) {
return MBFI.getBlockFreqRelativeToEntryBlock(&MBB);		return MBFI.getBlockFreqRelativeToEntryBlock(&MBB);
},		},
[&](const MachineInstr &MI) {		[&](const MachineInstr &MI) {
return MF.getSubtarget().getInstrInfo()->isTriviallyReMaterializable(		return MF.getSubtarget().getInstrInfo()->isTriviallyReMaterializable(
MI, &AAResults);		MI);
});		});
}		}

RegAllocScore llvm::calculateRegAllocScore(		RegAllocScore llvm::calculateRegAllocScore(
const MachineFunction &MF,		const MachineFunction &MF,
llvm::function_ref<double(const MachineBasicBlock &)> GetBBFreq,		llvm::function_ref<double(const MachineBasicBlock &)> GetBBFreq,
llvm::function_ref<bool(const MachineInstr &)>		llvm::function_ref<bool(const MachineInstr &)>
IsTriviallyRematerializable) {		IsTriviallyRematerializable) {
Show All 30 Lines

llvm/lib/CodeGen/RegisterCoalescer.cpp

Show First 20 Lines • Show All 1,300 Lines • ▼ Show 20 Lines	bool RegisterCoalescer::reMaterializeTrivialDef(const CoalescerPair &CP,
if (!DefMI)		if (!DefMI)
return false;		return false;
if (DefMI->isCopyLike()) {		if (DefMI->isCopyLike()) {
IsDefCopy = true;		IsDefCopy = true;
return false;		return false;
}		}
if (!TII->isAsCheapAsAMove(*DefMI))		if (!TII->isAsCheapAsAMove(*DefMI))
return false;		return false;
if (!TII->isTriviallyReMaterializable(*DefMI, AA))		if (!TII->isTriviallyReMaterializable(*DefMI))
return false;		return false;
if (!definesFullReg(*DefMI, SrcReg))		if (!definesFullReg(*DefMI, SrcReg))
return false;		return false;
bool SawStore = false;		bool SawStore = false;
if (!DefMI->isSafeToMove(AA, SawStore))		if (!DefMI->isSafeToMove(AA, SawStore))
return false;		return false;
const MCInstrDesc &MCID = DefMI->getDesc();		const MCInstrDesc &MCID = DefMI->getDesc();
if (MCID.getNumDefs() != 1)		if (MCID.getNumDefs() != 1)
▲ Show 20 Lines • Show All 2,898 Lines • Show Last 20 Lines

llvm/lib/CodeGen/ScheduleDAGInstrs.cpp

Show First 20 Lines • Show All 524 Lines • ▼ Show 20 Lines	if (V2SU.SU == SU)
continue;		continue;

V2SU.SU->addPred(SDep(SU, SDep::Anti, Reg));		V2SU.SU->addPred(SDep(SU, SDep::Anti, Reg));
}		}
}		}

/// Returns true if MI is an instruction we are unable to reason about		/// Returns true if MI is an instruction we are unable to reason about
/// (like a call or something with unmodeled side effects).		/// (like a call or something with unmodeled side effects).
static inline bool isGlobalMemoryObject(AAResults *AA, MachineInstr *MI) {		static inline bool isGlobalMemoryObject(MachineInstr *MI) {
return MI->isCall() || MI->hasUnmodeledSideEffects() ||		return MI->isCall() || MI->hasUnmodeledSideEffects() ||
(MI->hasOrderedMemoryRef() && !MI->isDereferenceableInvariantLoad(AA));		(MI->hasOrderedMemoryRef() && !MI->isDereferenceableInvariantLoad());
}		}

void ScheduleDAGInstrs::addChainDependency (SUnit *SUa, SUnit *SUb,		void ScheduleDAGInstrs::addChainDependency (SUnit *SUa, SUnit *SUb,
unsigned Latency) {		unsigned Latency) {
if (SUa->getInstr()->mayAlias(AAForDep, *SUb->getInstr(), UseTBAA)) {		if (SUa->getInstr()->mayAlias(AAForDep, *SUb->getInstr(), UseTBAA)) {
SDep Dep(SUa, SDep::MayAliasMem);		SDep Dep(SUa, SDep::MayAliasMem);
Dep.setLatency(Latency);		Dep.setLatency(Latency);
SUb->addPred(Dep);		SUb->addPred(Dep);
▲ Show 20 Lines • Show All 331 Lines • ▼ Show 20 Lines	if (SU->NumSuccs == 0 && SU->Latency > 1 && (HasVRegDef || MI.mayLoad())) {
ExitSU.addPred(Dep);		ExitSU.addPred(Dep);
}		}

// Add memory dependencies (Note: isStoreToStackSlot and		// Add memory dependencies (Note: isStoreToStackSlot and
// isLoadFromStackSLot are not usable after stack slots are lowered to		// isLoadFromStackSLot are not usable after stack slots are lowered to
// actual addresses).		// actual addresses).

// This is a barrier event that acts as a pivotal node in the DAG.		// This is a barrier event that acts as a pivotal node in the DAG.
if (isGlobalMemoryObject(AA, &MI)) {		if (isGlobalMemoryObject(&MI)) {

// Become the barrier chain.		// Become the barrier chain.
if (BarrierChain)		if (BarrierChain)
BarrierChain->addPredBarrier(SU);		BarrierChain->addPredBarrier(SU);
BarrierChain = SU;		BarrierChain = SU;

LLVM_DEBUG(dbgs() << "Global memory object and new barrier chain: SU("		LLVM_DEBUG(dbgs() << "Global memory object and new barrier chain: SU("
<< BarrierChain->NodeNum << ").\n";);		<< BarrierChain->NodeNum << ").\n";);
Show All 20 Lines	if (MI.mayRaiseFPException()) {
LLVM_DEBUG(dbgs() << "Reducing FPExceptions map.\n";);		LLVM_DEBUG(dbgs() << "Reducing FPExceptions map.\n";);
Value2SUsMap empty;		Value2SUsMap empty;
reduceHugeMemNodeMaps(FPExceptions, empty, getReductionSize());		reduceHugeMemNodeMaps(FPExceptions, empty, getReductionSize());
}		}
}		}

// If it's not a store or a variant load, we're done.		// If it's not a store or a variant load, we're done.
if (!MI.mayStore() &&		if (!MI.mayStore() &&
!(MI.mayLoad() && !MI.isDereferenceableInvariantLoad(AA)))		!(MI.mayLoad() && !MI.isDereferenceableInvariantLoad()))
continue;		continue;

// Always add dependecy edge to BarrierChain if present.		// Always add dependecy edge to BarrierChain if present.
if (BarrierChain)		if (BarrierChain)
BarrierChain->addPredBarrier(SU);		BarrierChain->addPredBarrier(SU);

// Find the underlying objects for MI. The Objs vector is either		// Find the underlying objects for MI. The Objs vector is either
// empty, or filled with the Values of memory locations which this		// empty, or filled with the Values of memory locations which this
▲ Show 20 Lines • Show All 592 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Show All 18 Lines
#include "llvm/ADT/BitVector.h"		#include "llvm/ADT/BitVector.h"
#include "llvm/ADT/FoldingSet.h"		#include "llvm/ADT/FoldingSet.h"
#include "llvm/ADT/None.h"		#include "llvm/ADT/None.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/Triple.h"		#include "llvm/ADT/Triple.h"
#include "llvm/ADT/Twine.h"		#include "llvm/ADT/Twine.h"
		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/MemoryLocation.h"		#include "llvm/Analysis/MemoryLocation.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/CodeGen/Analysis.h"		#include "llvm/CodeGen/Analysis.h"
#include "llvm/CodeGen/FunctionLoweringInfo.h"		#include "llvm/CodeGen/FunctionLoweringInfo.h"
#include "llvm/CodeGen/ISDOpcodes.h"		#include "llvm/CodeGen/ISDOpcodes.h"
#include "llvm/CodeGen/MachineBasicBlock.h"		#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineConstantPool.h"		#include "llvm/CodeGen/MachineConstantPool.h"
#include "llvm/CodeGen/MachineFrameInfo.h"		#include "llvm/CodeGen/MachineFrameInfo.h"
▲ Show 20 Lines • Show All 6,648 Lines • ▼ Show 20 Lines
}		}

static SDValue getMemcpyLoadsAndStores(SelectionDAG &DAG, const SDLoc &dl,		static SDValue getMemcpyLoadsAndStores(SelectionDAG &DAG, const SDLoc &dl,
SDValue Chain, SDValue Dst, SDValue Src,		SDValue Chain, SDValue Dst, SDValue Src,
uint64_t Size, Align Alignment,		uint64_t Size, Align Alignment,
bool isVol, bool AlwaysInline,		bool isVol, bool AlwaysInline,
MachinePointerInfo DstPtrInfo,		MachinePointerInfo DstPtrInfo,
MachinePointerInfo SrcPtrInfo,		MachinePointerInfo SrcPtrInfo,
const AAMDNodes &AAInfo) {		const AAMDNodes &AAInfo, AAResults *AA) {
// Turn a memcpy of undef to nop.		// Turn a memcpy of undef to nop.
// FIXME: We need to honor volatile even is Src is undef.		// FIXME: We need to honor volatile even is Src is undef.
if (Src.isUndef())		if (Src.isUndef())
return Chain;		return Chain;

// Expand memcpy to a series of load and store ops if the size operand falls		// Expand memcpy to a series of load and store ops if the size operand falls
// below a certain threshold.		// below a certain threshold.
// TODO: In the AlwaysInline case, if the size is big then generate a loop		// TODO: In the AlwaysInline case, if the size is big then generate a loop
▲ Show 20 Lines • Show All 46 Lines • ▼ Show 20 Lines	if (NewAlign > Alignment) {
Alignment = NewAlign;		Alignment = NewAlign;
}		}
}		}

// Prepare AAInfo for loads/stores after lowering this memcpy.		// Prepare AAInfo for loads/stores after lowering this memcpy.
AAMDNodes NewAAInfo = AAInfo;		AAMDNodes NewAAInfo = AAInfo;
NewAAInfo.TBAA = NewAAInfo.TBAAStruct = nullptr;		NewAAInfo.TBAA = NewAAInfo.TBAAStruct = nullptr;

		const Value *SrcVal = SrcPtrInfo.V.dyn_cast<const Value *>();
		bool isConstant =
		AA && SrcVal &&
		AA->pointsToConstantMemory(MemoryLocation(SrcVal, Size, AAInfo));

MachineMemOperand::Flags MMOFlags =		MachineMemOperand::Flags MMOFlags =
isVol ? MachineMemOperand::MOVolatile : MachineMemOperand::MONone;		isVol ? MachineMemOperand::MOVolatile : MachineMemOperand::MONone;
SmallVector<SDValue, 16> OutLoadChains;		SmallVector<SDValue, 16> OutLoadChains;
SmallVector<SDValue, 16> OutStoreChains;		SmallVector<SDValue, 16> OutStoreChains;
SmallVector<SDValue, 32> OutChains;		SmallVector<SDValue, 32> OutChains;
unsigned NumMemOps = MemOps.size();		unsigned NumMemOps = MemOps.size();
uint64_t SrcOff = 0, DstOff = 0;		uint64_t SrcOff = 0, DstOff = 0;
for (unsigned i = 0; i != NumMemOps; ++i) {		for (unsigned i = 0; i != NumMemOps; ++i) {
▲ Show 20 Lines • Show All 45 Lines • ▼ Show 20 Lines	if (!Store.getNode()) {
EVT NVT = TLI.getTypeToTransformTo(C, VT);		EVT NVT = TLI.getTypeToTransformTo(C, VT);
assert(NVT.bitsGE(VT));		assert(NVT.bitsGE(VT));

bool isDereferenceable =		bool isDereferenceable =
SrcPtrInfo.getWithOffset(SrcOff).isDereferenceable(VTSize, C, DL);		SrcPtrInfo.getWithOffset(SrcOff).isDereferenceable(VTSize, C, DL);
MachineMemOperand::Flags SrcMMOFlags = MMOFlags;		MachineMemOperand::Flags SrcMMOFlags = MMOFlags;
if (isDereferenceable)		if (isDereferenceable)
SrcMMOFlags |= MachineMemOperand::MODereferenceable;		SrcMMOFlags |= MachineMemOperand::MODereferenceable;
		if (isConstant)
		SrcMMOFlags |= MachineMemOperand::MOInvariant;

Value = DAG.getExtLoad(		Value = DAG.getExtLoad(
ISD::EXTLOAD, dl, NVT, Chain,		ISD::EXTLOAD, dl, NVT, Chain,
DAG.getMemBasePlusOffset(Src, TypeSize::Fixed(SrcOff), dl),		DAG.getMemBasePlusOffset(Src, TypeSize::Fixed(SrcOff), dl),
SrcPtrInfo.getWithOffset(SrcOff), VT,		SrcPtrInfo.getWithOffset(SrcOff), VT,
commonAlignment(*SrcAlign, SrcOff), SrcMMOFlags, NewAAInfo);		commonAlignment(*SrcAlign, SrcOff), SrcMMOFlags, NewAAInfo);
OutLoadChains.push_back(Value.getValue(1));		OutLoadChains.push_back(Value.getValue(1));

▲ Show 20 Lines • Show All 272 Lines • ▼ Show 20 Lines	static void checkAddrSpaceIsValidForLibcall(const TargetLowering *TLI,
}		}
}		}

SDValue SelectionDAG::getMemcpy(SDValue Chain, const SDLoc &dl, SDValue Dst,		SDValue SelectionDAG::getMemcpy(SDValue Chain, const SDLoc &dl, SDValue Dst,
SDValue Src, SDValue Size, Align Alignment,		SDValue Src, SDValue Size, Align Alignment,
bool isVol, bool AlwaysInline, bool isTailCall,		bool isVol, bool AlwaysInline, bool isTailCall,
MachinePointerInfo DstPtrInfo,		MachinePointerInfo DstPtrInfo,
MachinePointerInfo SrcPtrInfo,		MachinePointerInfo SrcPtrInfo,
const AAMDNodes &AAInfo) {		const AAMDNodes &AAInfo, AAResults *AA) {
// Check to see if we should lower the memcpy to loads and stores first.		// Check to see if we should lower the memcpy to loads and stores first.
// For cases within the target-specified limits, this is the best choice.		// For cases within the target-specified limits, this is the best choice.
ConstantSDNode *ConstantSize = dyn_cast<ConstantSDNode>(Size);		ConstantSDNode *ConstantSize = dyn_cast<ConstantSDNode>(Size);
if (ConstantSize) {		if (ConstantSize) {
// Memcpy with size zero? Just return the original chain.		// Memcpy with size zero? Just return the original chain.
if (ConstantSize->isZero())		if (ConstantSize->isZero())
return Chain;		return Chain;

SDValue Result = getMemcpyLoadsAndStores(		SDValue Result = getMemcpyLoadsAndStores(
*this, dl, Chain, Dst, Src, ConstantSize->getZExtValue(), Alignment,		*this, dl, Chain, Dst, Src, ConstantSize->getZExtValue(), Alignment,
isVol, false, DstPtrInfo, SrcPtrInfo, AAInfo);		isVol, false, DstPtrInfo, SrcPtrInfo, AAInfo, AA);
if (Result.getNode())		if (Result.getNode())
return Result;		return Result;
}		}

// Then check to see if we should lower the memcpy with target-specific		// Then check to see if we should lower the memcpy with target-specific
// code. If the target chooses to do this, this is the next best.		// code. If the target chooses to do this, this is the next best.
if (TSI) {		if (TSI) {
SDValue Result = TSI->EmitTargetCodeForMemcpy(		SDValue Result = TSI->EmitTargetCodeForMemcpy(
*this, dl, Chain, Dst, Src, Size, Alignment, isVol, AlwaysInline,		*this, dl, Chain, Dst, Src, Size, Alignment, isVol, AlwaysInline,
DstPtrInfo, SrcPtrInfo);		DstPtrInfo, SrcPtrInfo);
if (Result.getNode())		if (Result.getNode())
return Result;		return Result;
}		}

// If we really need inline code and the target declined to provide it,		// If we really need inline code and the target declined to provide it,
// use a (potentially long) sequence of loads and stores.		// use a (potentially long) sequence of loads and stores.
if (AlwaysInline) {		if (AlwaysInline) {
assert(ConstantSize && "AlwaysInline requires a constant size!");		assert(ConstantSize && "AlwaysInline requires a constant size!");
return getMemcpyLoadsAndStores(*this, dl, Chain, Dst, Src,		return getMemcpyLoadsAndStores(
ConstantSize->getZExtValue(), Alignment,		*this, dl, Chain, Dst, Src, ConstantSize->getZExtValue(), Alignment,
isVol, true, DstPtrInfo, SrcPtrInfo, AAInfo);		isVol, true, DstPtrInfo, SrcPtrInfo, AAInfo, AA);
}		}

checkAddrSpaceIsValidForLibcall(TLI, DstPtrInfo.getAddrSpace());		checkAddrSpaceIsValidForLibcall(TLI, DstPtrInfo.getAddrSpace());
checkAddrSpaceIsValidForLibcall(TLI, SrcPtrInfo.getAddrSpace());		checkAddrSpaceIsValidForLibcall(TLI, SrcPtrInfo.getAddrSpace());

// FIXME: If the memcpy is volatile (isVol), lowering it to a plain libc		// FIXME: If the memcpy is volatile (isVol), lowering it to a plain libc
// memcpy is not guaranteed to be safe. libc memcpys aren't required to		// memcpy is not guaranteed to be safe. libc memcpys aren't required to
// respect volatile, so they may do things like read or write memory		// respect volatile, so they may do things like read or write memory
▲ Show 20 Lines • Show All 65 Lines • ▼ Show 20 Lines	SDValue SelectionDAG::getAtomicMemcpy(SDValue Chain, const SDLoc &dl,
return CallResult.second;		return CallResult.second;
}		}

SDValue SelectionDAG::getMemmove(SDValue Chain, const SDLoc &dl, SDValue Dst,		SDValue SelectionDAG::getMemmove(SDValue Chain, const SDLoc &dl, SDValue Dst,
SDValue Src, SDValue Size, Align Alignment,		SDValue Src, SDValue Size, Align Alignment,
bool isVol, bool isTailCall,		bool isVol, bool isTailCall,
MachinePointerInfo DstPtrInfo,		MachinePointerInfo DstPtrInfo,
MachinePointerInfo SrcPtrInfo,		MachinePointerInfo SrcPtrInfo,
const AAMDNodes &AAInfo) {		const AAMDNodes &AAInfo, AAResults *AA) {
// Check to see if we should lower the memmove to loads and stores first.		// Check to see if we should lower the memmove to loads and stores first.
// For cases within the target-specified limits, this is the best choice.		// For cases within the target-specified limits, this is the best choice.
ConstantSDNode *ConstantSize = dyn_cast<ConstantSDNode>(Size);		ConstantSDNode *ConstantSize = dyn_cast<ConstantSDNode>(Size);
if (ConstantSize) {		if (ConstantSize) {
// Memmove with size zero? Just return the original chain.		// Memmove with size zero? Just return the original chain.
if (ConstantSize->isZero())		if (ConstantSize->isZero())
return Chain;		return Chain;

▲ Show 20 Lines • Show All 4,535 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

Show First 20 Lines • Show All 4,077 Lines • ▼ Show 20 Lines	void SelectionDAGBuilder::visitLoad(const LoadInst &I) {
unsigned NumValues = ValueVTs.size();		unsigned NumValues = ValueVTs.size();
if (NumValues == 0)		if (NumValues == 0)
return;		return;

Align Alignment = I.getAlign();		Align Alignment = I.getAlign();
AAMDNodes AAInfo = I.getAAMetadata();		AAMDNodes AAInfo = I.getAAMetadata();
const MDNode *Ranges = I.getMetadata(LLVMContext::MD_range);		const MDNode *Ranges = I.getMetadata(LLVMContext::MD_range);
bool isVolatile = I.isVolatile();		bool isVolatile = I.isVolatile();
		MachineMemOperand::Flags MMOFlags =
		TLI.getLoadMemOperandFlags(I, DAG.getDataLayout());

SDValue Root;		SDValue Root;
bool ConstantMemory = false;		bool ConstantMemory = false;
if (isVolatile)		if (isVolatile)
// Serialize volatile loads with other side effects.		// Serialize volatile loads with other side effects.
Root = getRoot();		Root = getRoot();
else if (NumValues > MaxParallelChains)		else if (NumValues > MaxParallelChains)
Root = getMemoryRoot();		Root = getMemoryRoot();
else if (AA &&		else if (AA &&
AA->pointsToConstantMemory(MemoryLocation(		AA->pointsToConstantMemory(MemoryLocation(
SV,		SV,
LocationSize::precise(DAG.getDataLayout().getTypeStoreSize(Ty)),		LocationSize::precise(DAG.getDataLayout().getTypeStoreSize(Ty)),
AAInfo))) {		AAInfo))) {
// Do not serialize (non-volatile) loads of constant memory with anything.		// Do not serialize (non-volatile) loads of constant memory with anything.
Root = DAG.getEntryNode();		Root = DAG.getEntryNode();
ConstantMemory = true;		ConstantMemory = true;
		MMOFlags |= MachineMemOperand::MOInvariant;

		// FIXME: pointsToConstantMemory probably does not imply dereferenceable,
		// but the previous usage implied it did. Probably should check
		// isDereferenceableAndAlignedPointer.
		MMOFlags |= MachineMemOperand::MODereferenceable;
} else {		} else {
// Do not serialize non-volatile loads against each other.		// Do not serialize non-volatile loads against each other.
Root = DAG.getRoot();		Root = DAG.getRoot();
}		}

SDLoc dl = getCurSDLoc();		SDLoc dl = getCurSDLoc();

if (isVolatile)		if (isVolatile)
Root = TLI.prepareVolatileOrAtomicLoad(Root, dl, DAG);		Root = TLI.prepareVolatileOrAtomicLoad(Root, dl, DAG);

// An aggregate load cannot wrap around the address space, so offsets to its		// An aggregate load cannot wrap around the address space, so offsets to its
// parts don't wrap either.		// parts don't wrap either.
SDNodeFlags Flags;		SDNodeFlags Flags;
Flags.setNoUnsignedWrap(true);		Flags.setNoUnsignedWrap(true);

SmallVector<SDValue, 4> Values(NumValues);		SmallVector<SDValue, 4> Values(NumValues);
SmallVector<SDValue, 4> Chains(std::min(MaxParallelChains, NumValues));		SmallVector<SDValue, 4> Chains(std::min(MaxParallelChains, NumValues));
EVT PtrVT = Ptr.getValueType();		EVT PtrVT = Ptr.getValueType();

MachineMemOperand::Flags MMOFlags
= TLI.getLoadMemOperandFlags(I, DAG.getDataLayout());

unsigned ChainI = 0;		unsigned ChainI = 0;
for (unsigned i = 0; i != NumValues; ++i, ++ChainI) {		for (unsigned i = 0; i != NumValues; ++i, ++ChainI) {
// Serializing loads here may result in excessive register pressure, and		// Serializing loads here may result in excessive register pressure, and
// TokenFactor places arbitrary choke points on the scheduler. SD scheduling		// TokenFactor places arbitrary choke points on the scheduler. SD scheduling
// could recover a bit by hoisting nodes upward in the chain by recognizing		// could recover a bit by hoisting nodes upward in the chain by recognizing
// they are side-effect free or do not alias. The optimizer should really		// they are side-effect free or do not alias. The optimizer should really
// avoid this case by converting large object/array copies to llvm.memcpy		// avoid this case by converting large object/array copies to llvm.memcpy
// (MaxParallelChains should always remain as failsafe).		// (MaxParallelChains should always remain as failsafe).
▲ Show 20 Lines • Show All 1,728 Lines • ▼ Show 20 Lines	case Intrinsic::memcpy: {
Align DstAlign = MCI.getDestAlign().valueOrOne();		Align DstAlign = MCI.getDestAlign().valueOrOne();
Align SrcAlign = MCI.getSourceAlign().valueOrOne();		Align SrcAlign = MCI.getSourceAlign().valueOrOne();
Align Alignment = commonAlignment(DstAlign, SrcAlign);		Align Alignment = commonAlignment(DstAlign, SrcAlign);
bool isVol = MCI.isVolatile();		bool isVol = MCI.isVolatile();
bool isTC = I.isTailCall() && isInTailCallPosition(I, DAG.getTarget());		bool isTC = I.isTailCall() && isInTailCallPosition(I, DAG.getTarget());
// FIXME: Support passing different dest/src alignments to the memcpy DAG		// FIXME: Support passing different dest/src alignments to the memcpy DAG
// node.		// node.
SDValue Root = isVol ? getRoot() : getMemoryRoot();		SDValue Root = isVol ? getRoot() : getMemoryRoot();
SDValue MC = DAG.getMemcpy(Root, sdl, Op1, Op2, Op3, Alignment, isVol,		SDValue MC = DAG.getMemcpy(
/* AlwaysInline */ false, isTC,		Root, sdl, Op1, Op2, Op3, Alignment, isVol,
MachinePointerInfo(I.getArgOperand(0)),		/* AlwaysInline */ false, isTC, MachinePointerInfo(I.getArgOperand(0)),
MachinePointerInfo(I.getArgOperand(1)),		MachinePointerInfo(I.getArgOperand(1)), I.getAAMetadata(), AA);
I.getAAMetadata());
updateDAGForMaybeTailCall(MC);		updateDAGForMaybeTailCall(MC);
return;		return;
}		}
case Intrinsic::memcpy_inline: {		case Intrinsic::memcpy_inline: {
const auto &MCI = cast<MemCpyInlineInst>(I);		const auto &MCI = cast<MemCpyInlineInst>(I);
SDValue Dst = getValue(I.getArgOperand(0));		SDValue Dst = getValue(I.getArgOperand(0));
SDValue Src = getValue(I.getArgOperand(1));		SDValue Src = getValue(I.getArgOperand(1));
SDValue Size = getValue(I.getArgOperand(2));		SDValue Size = getValue(I.getArgOperand(2));
assert(isa<ConstantSDNode>(Size) && "memcpy_inline needs constant size");		assert(isa<ConstantSDNode>(Size) && "memcpy_inline needs constant size");
// @llvm.memcpy.inline defines 0 and 1 to both mean no alignment.		// @llvm.memcpy.inline defines 0 and 1 to both mean no alignment.
Align DstAlign = MCI.getDestAlign().valueOrOne();		Align DstAlign = MCI.getDestAlign().valueOrOne();
Align SrcAlign = MCI.getSourceAlign().valueOrOne();		Align SrcAlign = MCI.getSourceAlign().valueOrOne();
Align Alignment = commonAlignment(DstAlign, SrcAlign);		Align Alignment = commonAlignment(DstAlign, SrcAlign);
bool isVol = MCI.isVolatile();		bool isVol = MCI.isVolatile();
bool isTC = I.isTailCall() && isInTailCallPosition(I, DAG.getTarget());		bool isTC = I.isTailCall() && isInTailCallPosition(I, DAG.getTarget());
// FIXME: Support passing different dest/src alignments to the memcpy DAG		// FIXME: Support passing different dest/src alignments to the memcpy DAG
// node.		// node.
SDValue MC = DAG.getMemcpy(getRoot(), sdl, Dst, Src, Size, Alignment, isVol,		SDValue MC = DAG.getMemcpy(
/* AlwaysInline */ true, isTC,		getRoot(), sdl, Dst, Src, Size, Alignment, isVol,
MachinePointerInfo(I.getArgOperand(0)),		/* AlwaysInline */ true, isTC, MachinePointerInfo(I.getArgOperand(0)),
MachinePointerInfo(I.getArgOperand(1)),		MachinePointerInfo(I.getArgOperand(1)), I.getAAMetadata(), AA);
I.getAAMetadata());
updateDAGForMaybeTailCall(MC);		updateDAGForMaybeTailCall(MC);
return;		return;
}		}
case Intrinsic::memset: {		case Intrinsic::memset: {
const auto &MSI = cast<MemSetInst>(I);		const auto &MSI = cast<MemSetInst>(I);
SDValue Op1 = getValue(I.getArgOperand(0));		SDValue Op1 = getValue(I.getArgOperand(0));
SDValue Op2 = getValue(I.getArgOperand(1));		SDValue Op2 = getValue(I.getArgOperand(1));
SDValue Op3 = getValue(I.getArgOperand(2));		SDValue Op3 = getValue(I.getArgOperand(2));
Show All 38 Lines	case Intrinsic::memmove: {
bool isVol = MMI.isVolatile();		bool isVol = MMI.isVolatile();
bool isTC = I.isTailCall() && isInTailCallPosition(I, DAG.getTarget());		bool isTC = I.isTailCall() && isInTailCallPosition(I, DAG.getTarget());
// FIXME: Support passing different dest/src alignments to the memmove DAG		// FIXME: Support passing different dest/src alignments to the memmove DAG
// node.		// node.
SDValue Root = isVol ? getRoot() : getMemoryRoot();		SDValue Root = isVol ? getRoot() : getMemoryRoot();
SDValue MM = DAG.getMemmove(Root, sdl, Op1, Op2, Op3, Alignment, isVol,		SDValue MM = DAG.getMemmove(Root, sdl, Op1, Op2, Op3, Alignment, isVol,
isTC, MachinePointerInfo(I.getArgOperand(0)),		isTC, MachinePointerInfo(I.getArgOperand(0)),
MachinePointerInfo(I.getArgOperand(1)),		MachinePointerInfo(I.getArgOperand(1)),
I.getAAMetadata());		I.getAAMetadata(), AA);
updateDAGForMaybeTailCall(MM);		updateDAGForMaybeTailCall(MM);
return;		return;
}		}
case Intrinsic::memcpy_element_unordered_atomic: {		case Intrinsic::memcpy_element_unordered_atomic: {
const AtomicMemCpyInst &MI = cast<AtomicMemCpyInst>(I);		const AtomicMemCpyInst &MI = cast<AtomicMemCpyInst>(I);
SDValue Dst = getValue(MI.getRawDest());		SDValue Dst = getValue(MI.getRawDest());
SDValue Src = getValue(MI.getRawSource());		SDValue Src = getValue(MI.getRawSource());
SDValue Length = getValue(MI.getLength());		SDValue Length = getValue(MI.getLength());
▲ Show 20 Lines • Show All 5,464 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SplitKit.h

Show First 20 Lines • Show All 251 Lines • ▼ Show 20 Lines
/// - Mark the places where the new interval is entered using enterIntv*		/// - Mark the places where the new interval is entered using enterIntv*
/// - Mark the ranges where the new interval is used with useIntv*		/// - Mark the ranges where the new interval is used with useIntv*
/// - Mark the places where the interval is exited with exitIntv*.		/// - Mark the places where the interval is exited with exitIntv*.
/// - Finish the current interval with closeIntv and repeat from 2.		/// - Finish the current interval with closeIntv and repeat from 2.
/// - Rewrite instructions with finish().		/// - Rewrite instructions with finish().
///		///
class LLVM_LIBRARY_VISIBILITY SplitEditor {		class LLVM_LIBRARY_VISIBILITY SplitEditor {
SplitAnalysis &SA;		SplitAnalysis &SA;
AAResults &AA;
LiveIntervals &LIS;		LiveIntervals &LIS;
VirtRegMap &VRM;		VirtRegMap &VRM;
MachineRegisterInfo &MRI;		MachineRegisterInfo &MRI;
MachineDominatorTree &MDT;		MachineDominatorTree &MDT;
const TargetInstrInfo &TII;		const TargetInstrInfo &TII;
const TargetRegisterInfo &TRI;		const TargetRegisterInfo &TRI;
const MachineBlockFrequencyInfo &MBFI;		const MachineBlockFrequencyInfo &MBFI;
VirtRegAuxInfo &VRAI;		VirtRegAuxInfo &VRAI;
▲ Show 20 Lines • Show All 162 Lines • ▼ Show 20 Lines	private:

SlotIndex buildSingleSubRegCopy(Register FromReg, Register ToReg,		SlotIndex buildSingleSubRegCopy(Register FromReg, Register ToReg,
MachineBasicBlock &MB, MachineBasicBlock::iterator InsertBefore,		MachineBasicBlock &MB, MachineBasicBlock::iterator InsertBefore,
unsigned SubIdx, LiveInterval &DestLI, bool Late, SlotIndex Def);		unsigned SubIdx, LiveInterval &DestLI, bool Late, SlotIndex Def);

public:		public:
/// Create a new SplitEditor for editing the LiveInterval analyzed by SA.		/// Create a new SplitEditor for editing the LiveInterval analyzed by SA.
/// Newly created intervals will be appended to newIntervals.		/// Newly created intervals will be appended to newIntervals.
SplitEditor(SplitAnalysis &SA, AAResults &AA, LiveIntervals &LIS,		SplitEditor(SplitAnalysis &SA, LiveIntervals &LIS, VirtRegMap &VRM,
VirtRegMap &VRM, MachineDominatorTree &MDT,		MachineDominatorTree &MDT, MachineBlockFrequencyInfo &MBFI,
MachineBlockFrequencyInfo &MBFI, VirtRegAuxInfo &VRAI);		VirtRegAuxInfo &VRAI);

/// reset - Prepare for a new split.		/// reset - Prepare for a new split.
void reset(LiveRangeEdit&, ComplementSpillMode = SM_Partition);		void reset(LiveRangeEdit&, ComplementSpillMode = SM_Partition);

/// Create a new virtual register and live interval.		/// Create a new virtual register and live interval.
/// Return the interval index, starting from 1. Interval index 0 is the		/// Return the interval index, starting from 1. Interval index 0 is the
/// implicit complement interval.		/// implicit complement interval.
unsigned openIntv();		unsigned openIntv();
▲ Show 20 Lines • Show All 110 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SplitKit.cpp

Show First 20 Lines • Show All 341 Lines • ▼ Show 20 Lines	void SplitAnalysis::analyze(const LiveInterval *li) {
analyzeUses();		analyzeUses();
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Split Editor		// Split Editor
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Create a new SplitEditor for editing the LiveInterval analyzed by SA.		/// Create a new SplitEditor for editing the LiveInterval analyzed by SA.
SplitEditor::SplitEditor(SplitAnalysis &SA, AliasAnalysis &AA,		SplitEditor::SplitEditor(SplitAnalysis &SA, LiveIntervals &LIS, VirtRegMap &VRM,
LiveIntervals &LIS, VirtRegMap &VRM,
MachineDominatorTree &MDT,		MachineDominatorTree &MDT,
MachineBlockFrequencyInfo &MBFI, VirtRegAuxInfo &VRAI)		MachineBlockFrequencyInfo &MBFI, VirtRegAuxInfo &VRAI)
: SA(SA), AA(AA), LIS(LIS), VRM(VRM),		: SA(SA), LIS(LIS), VRM(VRM), MRI(VRM.getMachineFunction().getRegInfo()),
MRI(VRM.getMachineFunction().getRegInfo()), MDT(MDT),		MDT(MDT), TII(*VRM.getMachineFunction().getSubtarget().getInstrInfo()),
TII(*VRM.getMachineFunction().getSubtarget().getInstrInfo()),
TRI(*VRM.getMachineFunction().getSubtarget().getRegisterInfo()),		TRI(*VRM.getMachineFunction().getSubtarget().getRegisterInfo()),
MBFI(MBFI), VRAI(VRAI), RegAssign(Allocator) {}		MBFI(MBFI), VRAI(VRAI), RegAssign(Allocator) {}

void SplitEditor::reset(LiveRangeEdit &LRE, ComplementSpillMode SM) {		void SplitEditor::reset(LiveRangeEdit &LRE, ComplementSpillMode SM) {
Edit = &LRE;		Edit = &LRE;
SpillMode = SM;		SpillMode = SM;
OpenIdx = 0;		OpenIdx = 0;
RegAssign.clear();		RegAssign.clear();
Values.clear();		Values.clear();

// Reset the LiveIntervalCalc instances needed for this spill mode.		// Reset the LiveIntervalCalc instances needed for this spill mode.
LICalc[0].reset(&VRM.getMachineFunction(), LIS.getSlotIndexes(), &MDT,		LICalc[0].reset(&VRM.getMachineFunction(), LIS.getSlotIndexes(), &MDT,
&LIS.getVNInfoAllocator());		&LIS.getVNInfoAllocator());
if (SpillMode)		if (SpillMode)
LICalc[1].reset(&VRM.getMachineFunction(), LIS.getSlotIndexes(), &MDT,		LICalc[1].reset(&VRM.getMachineFunction(), LIS.getSlotIndexes(), &MDT,
&LIS.getVNInfoAllocator());		&LIS.getVNInfoAllocator());

// We don't need an AliasAnalysis since we will only be performing		Edit->anyRematerializable();
// cheap-as-a-copy remats anyway.
Edit->anyRematerializable(nullptr);
}		}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)		#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
LLVM_DUMP_METHOD void SplitEditor::dump() const {		LLVM_DUMP_METHOD void SplitEditor::dump() const {
if (RegAssign.empty()) {		if (RegAssign.empty()) {
dbgs() << " empty\n";		dbgs() << " empty\n";
return;		return;
}		}
▲ Show 20 Lines • Show All 1,064 Lines • ▼ Show 20 Lines	for (const LiveRange::Segment &S : LI->segments) {
LLVM_DEBUG(dbgs() << "All defs dead: " << *MI);		LLVM_DEBUG(dbgs() << "All defs dead: " << *MI);
Dead.push_back(MI);		Dead.push_back(MI);
}		}
}		}

if (Dead.empty())		if (Dead.empty())
return;		return;

Edit->eliminateDeadDefs(Dead, None, &AA);		Edit->eliminateDeadDefs(Dead, None);
}		}

void SplitEditor::forceRecomputeVNI(const VNInfo &ParentVNI) {		void SplitEditor::forceRecomputeVNI(const VNInfo &ParentVNI) {
// Fast-path for common case.		// Fast-path for common case.
if (!ParentVNI.isPHIDef()) {		if (!ParentVNI.isPHIDef()) {
for (unsigned I = 0, E = Edit->size(); I != E; ++I)		for (unsigned I = 0, E = Edit->size(); I != E; ++I)
forceRecompute(I, ParentVNI);		forceRecompute(I, ParentVNI);
return;		return;
▲ Show 20 Lines • Show All 428 Lines • Show Last 20 Lines

llvm/lib/CodeGen/TargetInstrInfo.cpp

Show First 20 Lines • Show All 910 Lines • ▼ Show 20 Lines	void TargetInstrInfo::genAlternativeCodeSequence(
}		}

assert(Prev && "Unknown pattern for machine combiner");		assert(Prev && "Unknown pattern for machine combiner");

reassociateOps(Root, *Prev, Pattern, InsInstrs, DelInstrs, InstIdxForVirtReg);		reassociateOps(Root, *Prev, Pattern, InsInstrs, DelInstrs, InstIdxForVirtReg);
}		}

bool TargetInstrInfo::isReallyTriviallyReMaterializableGeneric(		bool TargetInstrInfo::isReallyTriviallyReMaterializableGeneric(
const MachineInstr &MI, AAResults *AA) const {		const MachineInstr &MI) const {
const MachineFunction &MF = *MI.getMF();		const MachineFunction &MF = *MI.getMF();
const MachineRegisterInfo &MRI = MF.getRegInfo();		const MachineRegisterInfo &MRI = MF.getRegInfo();

// Remat clients assume operand 0 is the defined register.		// Remat clients assume operand 0 is the defined register.
if (!MI.getNumOperands() || !MI.getOperand(0).isReg())		if (!MI.getNumOperands() || !MI.getOperand(0).isReg())
return false;		return false;
Register DefReg = MI.getOperand(0).getReg();		Register DefReg = MI.getOperand(0).getReg();

Show All 19 Lines	if (MI.isNotDuplicable() || MI.mayStore() || MI.mayRaiseFPException() ||
return false;		return false;

// Don't remat inline asm. We have no idea how expensive it is		// Don't remat inline asm. We have no idea how expensive it is
// even if it's side effect free.		// even if it's side effect free.
if (MI.isInlineAsm())		if (MI.isInlineAsm())
return false;		return false;

// Avoid instructions which load from potentially varying memory.		// Avoid instructions which load from potentially varying memory.
if (MI.mayLoad() && !MI.isDereferenceableInvariantLoad(AA))		if (MI.mayLoad() && !MI.isDereferenceableInvariantLoad())
return false;		return false;

// If any of the registers accessed are non-constant, conservatively assume		// If any of the registers accessed are non-constant, conservatively assume
// the instruction is not rematerializable.		// the instruction is not rematerializable.
for (const MachineOperand &MO : MI.operands()) {		for (const MachineOperand &MO : MI.operands()) {
if (!MO.isReg()) continue;		if (!MO.isReg()) continue;
Register Reg = MO.getReg();		Register Reg = MO.getReg();
if (Reg == 0)		if (Reg == 0)
▲ Show 20 Lines • Show All 470 Lines • Show Last 20 Lines

llvm/lib/Target/AMDGPU/GCNSchedStrategy.h

Show First 20 Lines • Show All 136 Lines • ▼ Show 20 Lines	class GCNScheduleDAGMILive final : public ScheduleDAGMILive {

DenseMap<MachineInstr *, GCNRPTracker::LiveRegSet> BBLiveInMap;		DenseMap<MachineInstr *, GCNRPTracker::LiveRegSet> BBLiveInMap;
DenseMap<MachineInstr *, GCNRPTracker::LiveRegSet> getBBLiveInMap() const;		DenseMap<MachineInstr *, GCNRPTracker::LiveRegSet> getBBLiveInMap() const;

// Collect all trivially rematerializable VGPR instructions with a single def		// Collect all trivially rematerializable VGPR instructions with a single def
// and single use outside the defining block into RematerializableInsts.		// and single use outside the defining block into RematerializableInsts.
void collectRematerializableInstructions();		void collectRematerializableInstructions();

bool isTriviallyReMaterializable(const MachineInstr &MI, AAResults *AA);		bool isTriviallyReMaterializable(const MachineInstr &MI);

// TODO: Should also attempt to reduce RP of SGPRs and AGPRs		// TODO: Should also attempt to reduce RP of SGPRs and AGPRs
// Attempt to reduce RP of VGPR by sinking trivially rematerializable		// Attempt to reduce RP of VGPR by sinking trivially rematerializable
// instructions. Returns true if we were able to sink instruction(s).		// instructions. Returns true if we were able to sink instruction(s).
bool sinkTriviallyRematInsts(const GCNSubtarget &ST,		bool sinkTriviallyRematInsts(const GCNSubtarget &ST,
const TargetInstrInfo *TII);		const TargetInstrInfo *TII);

// Return current region pressure.		// Return current region pressure.
Show All 24 Lines

llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp

Show First 20 Lines • Show All 727 Lines • ▼ Show 20 Lines	for (unsigned I = 0, E = MRI.getNumVirtRegs(); I != E; ++I) {

// TODO: Handle AGPR and SGPR rematerialization		// TODO: Handle AGPR and SGPR rematerialization
if (!SRI->isVGPRClass(MRI.getRegClass(Reg)) || !MRI.hasOneDef(Reg) ||		if (!SRI->isVGPRClass(MRI.getRegClass(Reg)) || !MRI.hasOneDef(Reg) ||
!MRI.hasOneNonDBGUse(Reg))		!MRI.hasOneNonDBGUse(Reg))
continue;		continue;

MachineOperand *Op = MRI.getOneDef(Reg);		MachineOperand *Op = MRI.getOneDef(Reg);
MachineInstr *Def = Op->getParent();		MachineInstr *Def = Op->getParent();
if (Op->getSubReg() != 0 || !isTriviallyReMaterializable(*Def, AA))		if (Op->getSubReg() != 0 || !isTriviallyReMaterializable(*Def))
continue;		continue;

MachineInstr *UseI = &*MRI.use_instr_nodbg_begin(Reg);		MachineInstr *UseI = &*MRI.use_instr_nodbg_begin(Reg);
if (Def->getParent() == UseI->getParent())		if (Def->getParent() == UseI->getParent())
continue;		continue;

// We are only collecting defs that are defined in another block and are		// We are only collecting defs that are defined in another block and are
// live-through or used inside regions at MinOccupancy. This means that the		// live-through or used inside regions at MinOccupancy. This means that the
▲ Show 20 Lines • Show All 193 Lines • ▼ Show 20 Lines	bool GCNScheduleDAGMILive::sinkTriviallyRematInsts(const GCNSubtarget &ST,

SIMachineFunctionInfo &MFI = *MF.getInfo<SIMachineFunctionInfo>();		SIMachineFunctionInfo &MFI = *MF.getInfo<SIMachineFunctionInfo>();
MFI.increaseOccupancy(MF, ++MinOccupancy);		MFI.increaseOccupancy(MF, ++MinOccupancy);

return true;		return true;
}		}

// Copied from MachineLICM		// Copied from MachineLICM
bool GCNScheduleDAGMILive::isTriviallyReMaterializable(const MachineInstr &MI,		bool GCNScheduleDAGMILive::isTriviallyReMaterializable(const MachineInstr &MI) {
AAResults *AA) {		if (!TII->isTriviallyReMaterializable(MI))
if (!TII->isTriviallyReMaterializable(MI, AA))
return false;		return false;

for (const MachineOperand &MO : MI.operands())		for (const MachineOperand &MO : MI.operands())
if (MO.isReg() && MO.isUse() && MO.getReg().isVirtual())		if (MO.isReg() && MO.isUse() && MO.getReg().isVirtual())
return false;		return false;

return true;		return true;
}		}
▲ Show 20 Lines • Show All 43 Lines • Show Last 20 Lines

llvm/lib/Target/AMDGPU/SIInstrInfo.h

Show First 20 Lines • Show All 178 Lines • ▼ Show 20 Lines	public:
const SIRegisterInfo &getRegisterInfo() const {		const SIRegisterInfo &getRegisterInfo() const {
return RI;		return RI;
}		}

const GCNSubtarget &getSubtarget() const {		const GCNSubtarget &getSubtarget() const {
return ST;		return ST;
}		}

bool isReallyTriviallyReMaterializable(const MachineInstr &MI,		bool isReallyTriviallyReMaterializable(const MachineInstr &MI) const override;
AAResults *AA) const override;

bool isIgnorableUse(const MachineOperand &MO) const override;		bool isIgnorableUse(const MachineOperand &MO) const override;

bool areLoadsFromSameBasePtr(SDNode *Load1, SDNode *Load2,		bool areLoadsFromSameBasePtr(SDNode *Load1, SDNode *Load2,
int64_t &Offset1,		int64_t &Offset1,
int64_t &Offset2) const override;		int64_t &Offset2) const override;

bool getMemOperandsWithOffsetWidth(		bool getMemOperandsWithOffsetWidth(
▲ Show 20 Lines • Show All 1,133 Lines • Show Last 20 Lines

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp

Show First 20 Lines • Show All 102 Lines • ▼ Show 20 Lines	static bool nodesHaveSameOperandValue(SDNode *N0, SDNode *N1, unsigned OpName) {
// MachineSDNode's operands, so we need to skip the result operand to get		// MachineSDNode's operands, so we need to skip the result operand to get
// the real index.		// the real index.
--Op0Idx;		--Op0Idx;
--Op1Idx;		--Op1Idx;

return N0->getOperand(Op0Idx) == N1->getOperand(Op1Idx);		return N0->getOperand(Op0Idx) == N1->getOperand(Op1Idx);
}		}

bool SIInstrInfo::isReallyTriviallyReMaterializable(const MachineInstr &MI,		bool SIInstrInfo::isReallyTriviallyReMaterializable(
AAResults *AA) const {		const MachineInstr &MI) const {
if (isVOP1(MI) || isVOP2(MI) || isVOP3(MI) || isSDWA(MI) || isSALU(MI)) {		if (isVOP1(MI) || isVOP2(MI) || isVOP3(MI) || isSDWA(MI) || isSALU(MI)) {
// Normally VALU use of exec would block the rematerialization, but that		// Normally VALU use of exec would block the rematerialization, but that
// is OK in this case to have an implicit exec read as all VALU do.		// is OK in this case to have an implicit exec read as all VALU do.
// We really want all of the generic logic for this except for this.		// We really want all of the generic logic for this except for this.

// Another potential implicit use is mode register. The core logic of		// Another potential implicit use is mode register. The core logic of
// the RA will not attempt rematerialization if mode is set anywhere		// the RA will not attempt rematerialization if mode is set anywhere
// in the function, otherwise it is safe since mode is not changed.		// in the function, otherwise it is safe since mode is not changed.
▲ Show 20 Lines • Show All 8,336 Lines • Show Last 20 Lines

llvm/lib/Target/ARM/ARMBaseInstrInfo.h

Show First 20 Lines • Show All 474 Lines • ▼ Show 20 Lines	private:

void expandMEMCPY(MachineBasicBlock::iterator) const;		void expandMEMCPY(MachineBasicBlock::iterator) const;

/// Identify instructions that can be folded into a MOVCC instruction, and		/// Identify instructions that can be folded into a MOVCC instruction, and
/// return the defining instruction.		/// return the defining instruction.
MachineInstr *canFoldIntoMOVCC(Register Reg, const MachineRegisterInfo &MRI,		MachineInstr *canFoldIntoMOVCC(Register Reg, const MachineRegisterInfo &MRI,
const TargetInstrInfo *TII) const;		const TargetInstrInfo *TII) const;

bool isReallyTriviallyReMaterializable(const MachineInstr &MI,		bool isReallyTriviallyReMaterializable(const MachineInstr &MI) const override;
AAResults *AA) const override;

private:		private:
/// Modeling special VFP / NEON fp MLA / MLS hazards.		/// Modeling special VFP / NEON fp MLA / MLS hazards.

/// MLxEntryMap - Map fp MLA / MLS to the corresponding entry in the internal		/// MLxEntryMap - Map fp MLA / MLS to the corresponding entry in the internal
/// MLx table.		/// MLx table.
DenseMap<unsigned, unsigned> MLxEntryMap;		DenseMap<unsigned, unsigned> MLxEntryMap;

▲ Show 20 Lines • Show All 473 Lines • Show Last 20 Lines

llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp

Show First 20 Lines • Show All 6,720 Lines • ▼ Show 20 Lines	MachineBasicBlock::iterator ARMBaseInstrInfo::insertOutlinedCall(
return CallPt;		return CallPt;
}		}

bool ARMBaseInstrInfo::shouldOutlineFromFunctionByDefault(		bool ARMBaseInstrInfo::shouldOutlineFromFunctionByDefault(
MachineFunction &MF) const {		MachineFunction &MF) const {
return Subtarget.isMClass() && MF.getFunction().hasMinSize();		return Subtarget.isMClass() && MF.getFunction().hasMinSize();
}		}

bool ARMBaseInstrInfo::isReallyTriviallyReMaterializable(const MachineInstr &MI,		bool ARMBaseInstrInfo::isReallyTriviallyReMaterializable(
AAResults *AA) const {		const MachineInstr &MI) const {
// Try hard to rematerialize any VCTPs because if we spill P0, it will block		// Try hard to rematerialize any VCTPs because if we spill P0, it will block
// the tail predication conversion. This means that the element count		// the tail predication conversion. This means that the element count
// register has to be live for longer, but that has to be better than		// register has to be live for longer, but that has to be better than
// spill/restore and VPT predication.		// spill/restore and VPT predication.
return isVCTP(&MI) && !isPredicated(MI);		return isVCTP(&MI) && !isPredicated(MI);
}		}

unsigned llvm::getBLXOpcode(const MachineFunction &MF) {		unsigned llvm::getBLXOpcode(const MachineFunction &MF) {
▲ Show 20 Lines • Show All 133 Lines • Show Last 20 Lines

llvm/lib/Target/PowerPC/PPCInstrInfo.h

Show First 20 Lines • Show All 489 Lines • ▼ Show 20 Lines	#endif
// and clears nuw, nsw, and exact flags.		// and clears nuw, nsw, and exact flags.
void setSpecialOperandAttr(MachineInstr &MI, uint16_t Flags) const;		void setSpecialOperandAttr(MachineInstr &MI, uint16_t Flags) const;

bool isCoalescableExtInstr(const MachineInstr &MI,		bool isCoalescableExtInstr(const MachineInstr &MI,
Register &SrcReg, Register &DstReg,		Register &SrcReg, Register &DstReg,
unsigned &SubIdx) const override;		unsigned &SubIdx) const override;
unsigned isLoadFromStackSlot(const MachineInstr &MI,		unsigned isLoadFromStackSlot(const MachineInstr &MI,
int &FrameIndex) const override;		int &FrameIndex) const override;
bool isReallyTriviallyReMaterializable(const MachineInstr &MI,		bool isReallyTriviallyReMaterializable(const MachineInstr &MI) const override;
AAResults *AA) const override;
unsigned isStoreToStackSlot(const MachineInstr &MI,		unsigned isStoreToStackSlot(const MachineInstr &MI,
int &FrameIndex) const override;		int &FrameIndex) const override;

bool findCommutedOpIndices(const MachineInstr &MI, unsigned &SrcOpIdx1,		bool findCommutedOpIndices(const MachineInstr &MI, unsigned &SrcOpIdx1,
unsigned &SrcOpIdx2) const override;		unsigned &SrcOpIdx2) const override;

void insertNoop(MachineBasicBlock &MBB,		void insertNoop(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MI) const override;		MachineBasicBlock::iterator MI) const override;
▲ Show 20 Lines • Show All 299 Lines • Show Last 20 Lines

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp

Show First 20 Lines • Show All 1,080 Lines • ▼ Show 20 Lines	if (MI.getOperand(1).isImm() && !MI.getOperand(1).getImm() &&
return MI.getOperand(0).getReg();		return MI.getOperand(0).getReg();
}		}
}		}
return 0;		return 0;
}		}

// For opcodes with the ReMaterializable flag set, this function is called to		// For opcodes with the ReMaterializable flag set, this function is called to
// verify the instruction is really rematable.		// verify the instruction is really rematable.
bool PPCInstrInfo::isReallyTriviallyReMaterializable(const MachineInstr &MI,		bool PPCInstrInfo::isReallyTriviallyReMaterializable(
AliasAnalysis *AA) const {		const MachineInstr &MI) const {
switch (MI.getOpcode()) {		switch (MI.getOpcode()) {
default:		default:
// This function should only be called for opcodes with the ReMaterializable		// This function should only be called for opcodes with the ReMaterializable
// flag set.		// flag set.
llvm_unreachable("Unknown rematerializable operation!");		llvm_unreachable("Unknown rematerializable operation!");
break;		break;
case PPC::LI:		case PPC::LI:
case PPC::LI8:		case PPC::LI8:
▲ Show 20 Lines • Show All 4,471 Lines • Show Last 20 Lines

llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.h

	Show All 37 Lines
	class WebAssemblyInstrInfo final : public WebAssemblyGenInstrInfo {			class WebAssemblyInstrInfo final : public WebAssemblyGenInstrInfo {
	const WebAssemblyRegisterInfo RI;			const WebAssemblyRegisterInfo RI;

	public:			public:
	explicit WebAssemblyInstrInfo(const WebAssemblySubtarget &STI);			explicit WebAssemblyInstrInfo(const WebAssemblySubtarget &STI);

	const WebAssemblyRegisterInfo &getRegisterInfo() const { return RI; }			const WebAssemblyRegisterInfo &getRegisterInfo() const { return RI; }

	bool isReallyTriviallyReMaterializable(const MachineInstr &MI,			bool isReallyTriviallyReMaterializable(const MachineInstr &MI) const override;
	AAResults *AA) const override;

	void copyPhysReg(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,			void copyPhysReg(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
	const DebugLoc &DL, MCRegister DestReg, MCRegister SrcReg,			const DebugLoc &DL, MCRegister DestReg, MCRegister SrcReg,
	bool KillSrc) const override;			bool KillSrc) const override;
	MachineInstr *commuteInstructionImpl(MachineInstr &MI, bool NewMI,			MachineInstr *commuteInstructionImpl(MachineInstr &MI, bool NewMI,
	unsigned OpIdx1,			unsigned OpIdx1,
	unsigned OpIdx2) const override;			unsigned OpIdx2) const override;

	Show All 22 Lines

llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.cpp

	Show All 34 Lines

	WebAssemblyInstrInfo::WebAssemblyInstrInfo(const WebAssemblySubtarget &STI)			WebAssemblyInstrInfo::WebAssemblyInstrInfo(const WebAssemblySubtarget &STI)
	: WebAssemblyGenInstrInfo(WebAssembly::ADJCALLSTACKDOWN,			: WebAssemblyGenInstrInfo(WebAssembly::ADJCALLSTACKDOWN,
	WebAssembly::ADJCALLSTACKUP,			WebAssembly::ADJCALLSTACKUP,
	WebAssembly::CATCHRET),			WebAssembly::CATCHRET),
	RI(STI.getTargetTriple()) {}			RI(STI.getTargetTriple()) {}

	bool WebAssemblyInstrInfo::isReallyTriviallyReMaterializable(			bool WebAssemblyInstrInfo::isReallyTriviallyReMaterializable(
	const MachineInstr &MI, AAResults *AA) const {			const MachineInstr &MI) const {
	switch (MI.getOpcode()) {			switch (MI.getOpcode()) {
	case WebAssembly::CONST_I32:			case WebAssembly::CONST_I32:
	case WebAssembly::CONST_I64:			case WebAssembly::CONST_I64:
	case WebAssembly::CONST_F32:			case WebAssembly::CONST_F32:
	case WebAssembly::CONST_F64:			case WebAssembly::CONST_F64:
	// isReallyTriviallyReMaterializableGeneric misses these because of the			// isReallyTriviallyReMaterializableGeneric misses these because of the
	// ARGUMENTS implicit def, so we manualy override it here.			// ARGUMENTS implicit def, so we manualy override it here.
	return true;			return true;
	▲ Show 20 Lines • Show All 171 Lines • Show Last 20 Lines

llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp

Show First 20 Lines • Show All 43 Lines • ▼ Show 20 Lines
namespace {		namespace {
class WebAssemblyRegStackify final : public MachineFunctionPass {		class WebAssemblyRegStackify final : public MachineFunctionPass {
StringRef getPassName() const override {		StringRef getPassName() const override {
return "WebAssembly Register Stackify";		return "WebAssembly Register Stackify";
}		}

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.setPreservesCFG();		AU.setPreservesCFG();
AU.addRequired<AAResultsWrapperPass>();
AU.addRequired<MachineDominatorTree>();		AU.addRequired<MachineDominatorTree>();
AU.addRequired<LiveIntervals>();		AU.addRequired<LiveIntervals>();
AU.addPreserved<MachineBlockFrequencyInfo>();		AU.addPreserved<MachineBlockFrequencyInfo>();
AU.addPreserved<SlotIndexes>();		AU.addPreserved<SlotIndexes>();
AU.addPreserved<LiveIntervals>();		AU.addPreserved<LiveIntervals>();
AU.addPreservedID(LiveVariablesID);		AU.addPreservedID(LiveVariablesID);
AU.addPreserved<MachineDominatorTree>();		AU.addPreserved<MachineDominatorTree>();
MachineFunctionPass::getAnalysisUsage(AU);		MachineFunctionPass::getAnalysisUsage(AU);
▲ Show 20 Lines • Show All 98 Lines • ▼ Show 20 Lines	static void queryCallee(const MachineInstr &MI, bool &Read, bool &Write,
// Assume the worst.		// Assume the worst.
Write = true;		Write = true;
Read = true;		Read = true;
Effects = true;		Effects = true;
}		}

// Determine whether MI reads memory, writes memory, has side effects,		// Determine whether MI reads memory, writes memory, has side effects,
// and/or uses the stack pointer value.		// and/or uses the stack pointer value.
static void query(const MachineInstr &MI, AliasAnalysis &AA, bool &Read,		static void query(const MachineInstr &MI, bool &Read, bool &Write,
bool &Write, bool &Effects, bool &StackPointer) {		bool &Effects, bool &StackPointer) {
assert(!MI.isTerminator());		assert(!MI.isTerminator());

if (MI.isDebugInstr() || MI.isPosition())		if (MI.isDebugInstr() || MI.isPosition())
return;		return;

// Check for loads.		// Check for loads.
if (MI.mayLoad() && !MI.isDereferenceableInvariantLoad(&AA))		if (MI.mayLoad() && !MI.isDereferenceableInvariantLoad())
Read = true;		Read = true;

// Check for stores.		// Check for stores.
if (MI.mayStore()) {		if (MI.mayStore()) {
Write = true;		Write = true;
} else if (MI.hasOrderedMemoryRef()) {		} else if (MI.hasOrderedMemoryRef()) {
switch (MI.getOpcode()) {		switch (MI.getOpcode()) {
case WebAssembly::DIV_S_I32:		case WebAssembly::DIV_S_I32:
▲ Show 20 Lines • Show All 66 Lines • ▼ Show 20 Lines	static void query(const MachineInstr &MI, bool &Read, bool &Write,

// Analyze calls.		// Analyze calls.
if (MI.isCall()) {		if (MI.isCall()) {
queryCallee(MI, Read, Write, Effects, StackPointer);		queryCallee(MI, Read, Write, Effects, StackPointer);
}		}
}		}

// Test whether Def is safe and profitable to rematerialize.		// Test whether Def is safe and profitable to rematerialize.
static bool shouldRematerialize(const MachineInstr &Def, AliasAnalysis &AA,		static bool shouldRematerialize(const MachineInstr &Def,
const WebAssemblyInstrInfo *TII) {		const WebAssemblyInstrInfo *TII) {
return Def.isAsCheapAsAMove() && TII->isTriviallyReMaterializable(Def, &AA);		return Def.isAsCheapAsAMove() && TII->isTriviallyReMaterializable(Def);
}		}

// Identify the definition for this register at this point. This is a		// Identify the definition for this register at this point. This is a
// generalization of MachineRegisterInfo::getUniqueVRegDef that uses		// generalization of MachineRegisterInfo::getUniqueVRegDef that uses
// LiveIntervals to handle complex cases.		// LiveIntervals to handle complex cases.
static MachineInstr *getVRegDef(unsigned Reg, const MachineInstr *Insert,		static MachineInstr *getVRegDef(unsigned Reg, const MachineInstr *Insert,
const MachineRegisterInfo &MRI,		const MachineRegisterInfo &MRI,
const LiveIntervals &LIS) {		const LiveIntervals &LIS) {
Show All 37 Lines
}		}

// Test whether it's safe to move Def to just before Insert.		// Test whether it's safe to move Def to just before Insert.
// TODO: Compute memory dependencies in a way that doesn't require always		// TODO: Compute memory dependencies in a way that doesn't require always
// walking the block.		// walking the block.
// TODO: Compute memory dependencies in a way that uses AliasAnalysis to be		// TODO: Compute memory dependencies in a way that uses AliasAnalysis to be
// more precise.		// more precise.
static bool isSafeToMove(const MachineOperand *Def, const MachineOperand *Use,		static bool isSafeToMove(const MachineOperand *Def, const MachineOperand *Use,
const MachineInstr *Insert, AliasAnalysis &AA,		const MachineInstr *Insert,
const WebAssemblyFunctionInfo &MFI,		const WebAssemblyFunctionInfo &MFI,
const MachineRegisterInfo &MRI) {		const MachineRegisterInfo &MRI) {
const MachineInstr *DefI = Def->getParent();		const MachineInstr *DefI = Def->getParent();
const MachineInstr *UseI = Use->getParent();		const MachineInstr *UseI = Use->getParent();
assert(DefI->getParent() == Insert->getParent());		assert(DefI->getParent() == Insert->getParent());
assert(UseI->getParent() == Insert->getParent());		assert(UseI->getParent() == Insert->getParent());

// The first def of a multivalue instruction can be stackified by moving,		// The first def of a multivalue instruction can be stackified by moving,
▲ Show 20 Lines • Show All 63 Lines • ▼ Show 20 Lines	for (const MachineOperand &MO : DefI->operands()) {
// If one of the operands isn't in SSA form, it has different values at		// If one of the operands isn't in SSA form, it has different values at
// different times, and we need to make sure we don't move our use across		// different times, and we need to make sure we don't move our use across
// a different def.		// a different def.
if (!MO.isDef() && !MRI.hasOneDef(Reg))		if (!MO.isDef() && !MRI.hasOneDef(Reg))
MutableRegisters.push_back(Reg);		MutableRegisters.push_back(Reg);
}		}

bool Read = false, Write = false, Effects = false, StackPointer = false;		bool Read = false, Write = false, Effects = false, StackPointer = false;
query(*DefI, AA, Read, Write, Effects, StackPointer);		query(*DefI, Read, Write, Effects, StackPointer);

// If the instruction does not access memory and has no side effects, it has		// If the instruction does not access memory and has no side effects, it has
// no additional dependencies.		// no additional dependencies.
bool HasMutableRegisters = !MutableRegisters.empty();		bool HasMutableRegisters = !MutableRegisters.empty();
if (!Read && !Write && !Effects && !StackPointer && !HasMutableRegisters)		if (!Read && !Write && !Effects && !StackPointer && !HasMutableRegisters)
return true;		return true;

// Scan through the intervening instructions between DefI and Insert.		// Scan through the intervening instructions between DefI and Insert.
MachineBasicBlock::const_iterator D(DefI), I(Insert);		MachineBasicBlock::const_iterator D(DefI), I(Insert);
for (--I; I != D; --I) {		for (--I; I != D; --I) {
bool InterveningRead = false;		bool InterveningRead = false;
bool InterveningWrite = false;		bool InterveningWrite = false;
bool InterveningEffects = false;		bool InterveningEffects = false;
bool InterveningStackPointer = false;		bool InterveningStackPointer = false;
query(*I, AA, InterveningRead, InterveningWrite, InterveningEffects,		query(*I, InterveningRead, InterveningWrite, InterveningEffects,
InterveningStackPointer);		InterveningStackPointer);
if (Effects && InterveningEffects)		if (Effects && InterveningEffects)
return false;		return false;
if (Read && InterveningWrite)		if (Read && InterveningWrite)
return false;		return false;
if (Write && (InterveningRead || InterveningWrite))		if (Write && (InterveningRead || InterveningWrite))
return false;		return false;
if (StackPointer && InterveningStackPointer)		if (StackPointer && InterveningStackPointer)
▲ Show 20 Lines • Show All 385 Lines • ▼ Show 20 Lines	LLVM_DEBUG(dbgs() << "******** Register Stackifying ********\n"
"********** Function: "		"********** Function: "
<< MF.getName() << '\n');		<< MF.getName() << '\n');

bool Changed = false;		bool Changed = false;
MachineRegisterInfo &MRI = MF.getRegInfo();		MachineRegisterInfo &MRI = MF.getRegInfo();
WebAssemblyFunctionInfo &MFI = *MF.getInfo<WebAssemblyFunctionInfo>();		WebAssemblyFunctionInfo &MFI = *MF.getInfo<WebAssemblyFunctionInfo>();
const auto *TII = MF.getSubtarget<WebAssemblySubtarget>().getInstrInfo();		const auto *TII = MF.getSubtarget<WebAssemblySubtarget>().getInstrInfo();
const auto *TRI = MF.getSubtarget<WebAssemblySubtarget>().getRegisterInfo();		const auto *TRI = MF.getSubtarget<WebAssemblySubtarget>().getRegisterInfo();
AliasAnalysis &AA = getAnalysis<AAResultsWrapperPass>().getAAResults();
auto &MDT = getAnalysis<MachineDominatorTree>();		auto &MDT = getAnalysis<MachineDominatorTree>();
auto &LIS = getAnalysis<LiveIntervals>();		auto &LIS = getAnalysis<LiveIntervals>();

// Walk the instructions from the bottom up. Currently we don't look past		// Walk the instructions from the bottom up. Currently we don't look past
// block boundaries, and the blocks aren't ordered so the block visitation		// block boundaries, and the blocks aren't ordered so the block visitation
// order isn't significant, but we may want to change this in the future.		// order isn't significant, but we may want to change this in the future.
for (MachineBasicBlock &MBB : MF) {		for (MachineBasicBlock &MBB : MF) {
// Don't use a range-based for loop, because we modify the list as we're		// Don't use a range-based for loop, because we modify the list as we're
▲ Show 20 Lines • Show All 47 Lines • ▼ Show 20 Lines	for (auto MII = MBB.rbegin(); MII != MBB.rend(); ++MII) {

// Decide which strategy to take. Prefer to move a single-use value		// Decide which strategy to take. Prefer to move a single-use value
// over cloning it, and prefer cloning over introducing a tee.		// over cloning it, and prefer cloning over introducing a tee.
// For moving, we require the def to be in the same block as the use;		// For moving, we require the def to be in the same block as the use;
// this makes things simpler (LiveIntervals' handleMove function only		// this makes things simpler (LiveIntervals' handleMove function only
// supports intra-block moves) and it's MachineSink's job to catch all		// supports intra-block moves) and it's MachineSink's job to catch all
// the sinking opportunities anyway.		// the sinking opportunities anyway.
bool SameBlock = DefI->getParent() == &MBB;		bool SameBlock = DefI->getParent() == &MBB;
bool CanMove = SameBlock &&		bool CanMove = SameBlock && isSafeToMove(Def, &Use, Insert, MFI, MRI) &&
isSafeToMove(Def, &Use, Insert, AA, MFI, MRI) &&
!TreeWalker.isOnStack(Reg);		!TreeWalker.isOnStack(Reg);
if (CanMove && hasOneUse(Reg, DefI, MRI, MDT, LIS)) {		if (CanMove && hasOneUse(Reg, DefI, MRI, MDT, LIS)) {
Insert = moveForSingleUse(Reg, Use, DefI, MBB, Insert, LIS, MFI, MRI);		Insert = moveForSingleUse(Reg, Use, DefI, MBB, Insert, LIS, MFI, MRI);

// If we are removing the frame base reg completely, remove the debug		// If we are removing the frame base reg completely, remove the debug
// info as well.		// info as well.
// TODO: Encode this properly as a stackified value.		// TODO: Encode this properly as a stackified value.
if (MFI.isFrameBaseVirtual() && MFI.getFrameBaseVreg() == Reg)		if (MFI.isFrameBaseVirtual() && MFI.getFrameBaseVreg() == Reg)
MFI.clearFrameBaseVreg();		MFI.clearFrameBaseVreg();
} else if (shouldRematerialize(*DefI, AA, TII)) {		} else if (shouldRematerialize(*DefI, TII)) {
Insert =		Insert =
rematerializeCheapDef(Reg, Use, *DefI, MBB, Insert->getIterator(),		rematerializeCheapDef(Reg, Use, *DefI, MBB, Insert->getIterator(),
LIS, MFI, MRI, TII, TRI);		LIS, MFI, MRI, TII, TRI);
} else if (CanMove && oneUseDominatesOtherUses(Reg, Use, MBB, MRI, MDT,		} else if (CanMove && oneUseDominatesOtherUses(Reg, Use, MBB, MRI, MDT,
LIS, MFI)) {		LIS, MFI)) {
Insert = moveAndTeeForMultiUse(Reg, Use, DefI, MBB, Insert, LIS, MFI,		Insert = moveAndTeeForMultiUse(Reg, Use, DefI, MBB, Insert, LIS, MFI,
MRI, TII);		MRI, TII);
} else {		} else {
▲ Show 20 Lines • Show All 89 Lines • Show Last 20 Lines

llvm/lib/Target/X86/X86InstrInfo.h

Show First 20 Lines • Show All 234 Lines • ▼ Show 20 Lines	unsigned isStoreToStackSlot(const MachineInstr &MI,
int &FrameIndex,		int &FrameIndex,
unsigned &MemBytes) const override;		unsigned &MemBytes) const override;
/// isStoreToStackSlotPostFE - Check for post-frame ptr elimination		/// isStoreToStackSlotPostFE - Check for post-frame ptr elimination
/// stack locations as well. This uses a heuristic so it isn't		/// stack locations as well. This uses a heuristic so it isn't
/// reliable for correctness.		/// reliable for correctness.
unsigned isStoreToStackSlotPostFE(const MachineInstr &MI,		unsigned isStoreToStackSlotPostFE(const MachineInstr &MI,
int &FrameIndex) const override;		int &FrameIndex) const override;

bool isReallyTriviallyReMaterializable(const MachineInstr &MI,		bool isReallyTriviallyReMaterializable(const MachineInstr &MI) const override;
AAResults *AA) const override;
void reMaterialize(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,		void reMaterialize(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
Register DestReg, unsigned SubIdx,		Register DestReg, unsigned SubIdx,
const MachineInstr &Orig,		const MachineInstr &Orig,
const TargetRegisterInfo &TRI) const override;		const TargetRegisterInfo &TRI) const override;

/// Given an operand within a MachineInstr, insert preceding code to put it		/// Given an operand within a MachineInstr, insert preceding code to put it
/// into the right format for a particular kind of LEA instruction. This may		/// into the right format for a particular kind of LEA instruction. This may
/// involve using an appropriate super-register instead (with an implicit use		/// involve using an appropriate super-register instead (with an implicit use
▲ Show 20 Lines • Show All 410 Lines • Show Last 20 Lines

llvm/lib/Target/X86/X86InstrInfo.cpp

Show First 20 Lines • Show All 736 Lines • ▼ Show 20 Lines	for (MachineRegisterInfo::def_instr_iterator I = MRI.def_instr_begin(BaseReg),
if (DefMI->getOpcode() != X86::MOVPC32r)		if (DefMI->getOpcode() != X86::MOVPC32r)
return false;		return false;
assert(!isPICBase && "More than one PIC base?");		assert(!isPICBase && "More than one PIC base?");
isPICBase = true;		isPICBase = true;
}		}
return isPICBase;		return isPICBase;
}		}

bool X86InstrInfo::isReallyTriviallyReMaterializable(const MachineInstr &MI,		bool X86InstrInfo::isReallyTriviallyReMaterializable(
AAResults *AA) const {		const MachineInstr &MI) const {
switch (MI.getOpcode()) {		switch (MI.getOpcode()) {
default:		default:
// This function should only be called for opcodes with the ReMaterializable		// This function should only be called for opcodes with the ReMaterializable
// flag set.		// flag set.
llvm_unreachable("Unknown rematerializable operation!");		llvm_unreachable("Unknown rematerializable operation!");
break;		break;

case X86::LOAD_STACK_GUARD:		case X86::LOAD_STACK_GUARD:
▲ Show 20 Lines • Show All 109 Lines • ▼ Show 20 Lines	bool X86InstrInfo::isReallyTriviallyReMaterializable(
case X86::VMOVUPSZ128rm_NOVLX:		case X86::VMOVUPSZ128rm_NOVLX:
case X86::VMOVUPSZ256rm_NOVLX:		case X86::VMOVUPSZ256rm_NOVLX:
case X86::VMOVUPSZrm: {		case X86::VMOVUPSZrm: {
// Loads from constant pools are trivially rematerializable.		// Loads from constant pools are trivially rematerializable.
if (MI.getOperand(1 + X86::AddrBaseReg).isReg() &&		if (MI.getOperand(1 + X86::AddrBaseReg).isReg() &&
MI.getOperand(1 + X86::AddrScaleAmt).isImm() &&		MI.getOperand(1 + X86::AddrScaleAmt).isImm() &&
MI.getOperand(1 + X86::AddrIndexReg).isReg() &&		MI.getOperand(1 + X86::AddrIndexReg).isReg() &&
MI.getOperand(1 + X86::AddrIndexReg).getReg() == 0 &&		MI.getOperand(1 + X86::AddrIndexReg).getReg() == 0 &&
MI.isDereferenceableInvariantLoad(AA)) {		MI.isDereferenceableInvariantLoad()) {
Register BaseReg = MI.getOperand(1 + X86::AddrBaseReg).getReg();		Register BaseReg = MI.getOperand(1 + X86::AddrBaseReg).getReg();
if (BaseReg == 0 || BaseReg == X86::RIP)		if (BaseReg == 0 || BaseReg == X86::RIP)
return true;		return true;
// Allow re-materialization of PIC load.		// Allow re-materialization of PIC load.
if (!ReMatPICStubLoad && MI.getOperand(1 + X86::AddrDisp).isGlobal())		if (!ReMatPICStubLoad && MI.getOperand(1 + X86::AddrDisp).isGlobal())
return false;		return false;
const MachineFunction &MF = *MI.getParent()->getParent();		const MachineFunction &MF = *MI.getParent()->getParent();
const MachineRegisterInfo &MRI = MF.getRegInfo();		const MachineRegisterInfo &MRI = MF.getRegInfo();
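For readers following the X86 hunk above: after this change the rematerialization check no longer takes an `AAResults *` and, per the summary, relies only on the invariant and dereferenceable flags carried by the memory operand, with the flags set upfront from the AA query. The following is a minimal, self-contained sketch of that idea, not LLVM's actual implementation. `MemOperand`, the flag enum, and `makeLoadOperand` are hypothetical stand-ins introduced for illustration, and treating volatile accesses as not rematerializable is an assumption of the sketch.

```cpp
#include <vector>

// Hypothetical flag set, loosely modeled on memory-operand flags.
// These names are illustrative and are not LLVM's definitions.
enum MemFlags : unsigned {
  MOLoad = 1u << 0,
  MOStore = 1u << 1,
  MOVolatile = 1u << 2,
  MODereferenceable = 1u << 3,
  MOInvariant = 1u << 4,
};

// Hypothetical stand-in for a memory operand attached to an instruction.
struct MemOperand {
  unsigned Flags;
};

// Models "set the flag based on the AA query upfront": when the (assumed)
// alias-analysis result says the pointer refers to constant memory, record
// both invariant and dereferenceable on the operand at creation time.
MemOperand makeLoadOperand(bool PointsToConstantMemory, bool IsVolatile) {
  unsigned Flags = MOLoad;
  if (IsVolatile)
    Flags |= MOVolatile;
  if (PointsToConstantMemory)
    Flags |= MODereferenceable | MOInvariant;
  return MemOperand{Flags};
}

// Flag-only query: no alias-analysis handle is needed at the point of use.
bool isDereferenceableInvariantLoad(const std::vector<MemOperand> &MMOs) {
  // Without memory operands nothing is known, so answer conservatively.
  if (MMOs.empty())
    return false;
  const unsigned Needed = MOLoad | MODereferenceable | MOInvariant;
  for (const MemOperand &MMO : MMOs) {
    // Assumption of this sketch: stores and volatile accesses never qualify.
    if (MMO.Flags & (MOStore | MOVolatile))
      return false;
    if ((MMO.Flags & Needed) != Needed)
      return false;
  }
  return true;
}
```

In this model a volatile load from constant memory still carries both flags, which is consistent with the updated `function-returns.ll` checks further down (`volatile dereferenceable invariant load`), while the query itself stays conservative for such operands.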

llvm/test/CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll

	Show First 20 Lines • Show All 46 Lines • ▼ Show 20 Lines
	; RUN: --debugify-and-strip-all-safe=0 \			; RUN: --debugify-and-strip-all-safe=0 \
	; RUN: -verify-machineinstrs=0 | FileCheck %s --check-prefix DISABLED			; RUN: -verify-machineinstrs=0 | FileCheck %s --check-prefix DISABLED

	; RUN: llc -mtriple=aarch64-- -fast-isel=0 -global-isel=false \			; RUN: llc -mtriple=aarch64-- -fast-isel=0 -global-isel=false \
	; RUN: --debugify-and-strip-all-safe=0 \			; RUN: --debugify-and-strip-all-safe=0 \
	; RUN: -debug-pass=Structure %s -o /dev/null 2>&1 -verify-machineinstrs=0 \			; RUN: -debug-pass=Structure %s -o /dev/null 2>&1 -verify-machineinstrs=0 \
	; RUN: | FileCheck %s --check-prefix DISABLED			; RUN: | FileCheck %s --check-prefix DISABLED

				; ENABLED: Safe Stack instrumentation pass

				; ENABLED-O1: Basic Alias Analysis (stateless AA impl)
				; ENABLED-O1-NEXT: Function Alias Analysis Results
	; ENABLED: IRTranslator			; ENABLED: IRTranslator
	; VERIFY-NEXT: Verify generated machine code			; VERIFY-NEXT: Verify generated machine code
	; ENABLED-NEXT: Analysis for ComputingKnownBits			; ENABLED-NEXT: Analysis for ComputingKnownBits
	; ENABLED-O1-NEXT: MachineDominator Tree Construction			; ENABLED-O1-NEXT: MachineDominator Tree Construction
	; ENABLED-O1-NEXT: Analysis containing CSE Info			; ENABLED-O1-NEXT: Analysis containing CSE Info
	; ENABLED-O1-NEXT: PreLegalizerCombiner			; ENABLED-O1-NEXT: PreLegalizerCombiner
	; VERIFY-O0-NEXT: AArch64O0PreLegalizerCombiner			; VERIFY-O0-NEXT: AArch64O0PreLegalizerCombiner
	; VERIFY-NEXT: Verify generated machine code			; VERIFY-NEXT: Verify generated machine code
	; ENABLED-O1-NEXT: Basic Alias Analysis (stateless AA impl)
	; ENABLED-O1-NEXT: Function Alias Analysis Results
	; ENABLED-O1-NEXT: LoadStoreOpt			; ENABLED-O1-NEXT: LoadStoreOpt
	; VERIFY-O0-NEXT: Analysis containing CSE Info			; VERIFY-O0-NEXT: Analysis containing CSE Info
	; ENABLED-NEXT: Legalizer			; ENABLED-NEXT: Legalizer
	; VERIFY-NEXT: Verify generated machine code			; VERIFY-NEXT: Verify generated machine code
	; ENABLED: RegBankSelect			; ENABLED: RegBankSelect
	; VERIFY-NEXT: Verify generated machine code			; VERIFY-NEXT: Verify generated machine code
	; ENABLED-NEXT: Localizer			; ENABLED-NEXT: Localizer
	; VERIFY-O0-NEXT: Verify generated machine code			; VERIFY-O0-NEXT: Verify generated machine code

llvm/test/CodeGen/AArch64/arm64-memcpy-inline.ll

Show All 22 Lines	; CHECK-DAG: str [[REG2]],
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 getelementptr inbounds (%struct.x, %struct.x* @dst, i32 0, i32 0), i8* align 8 getelementptr inbounds (%struct.x, %struct.x* @src, i32 0, i32 0), i32 11, i1 false)		call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 getelementptr inbounds (%struct.x, %struct.x* @dst, i32 0, i32 0), i8* align 8 getelementptr inbounds (%struct.x, %struct.x* @src, i32 0, i32 0), i32 11, i1 false)
ret i32 0		ret i32 0
}		}

define void @t1(i8* nocapture %C) nounwind {		define void @t1(i8* nocapture %C) nounwind {
entry:		entry:
; CHECK-LABEL: t1:		; CHECK-LABEL: t1:
; CHECK: ldr [[DEST:q[0-9]+]], [x[[BASEREG]]]		; CHECK: ldr [[DEST:q[0-9]+]], [x[[BASEREG]]]
		; CHECK: str [[DEST:q[0-9]+]], [x0]
; CHECK: ldur [[DEST:q[0-9]+]], [x[[BASEREG:[0-9]+]], #15]		; CHECK: ldur [[DEST:q[0-9]+]], [x[[BASEREG:[0-9]+]], #15]
; CHECK: stur [[DEST:q[0-9]+]], [x0, #15]		; CHECK: stur [[DEST:q[0-9]+]], [x0, #15]
; CHECK: str [[DEST:q[0-9]+]], [x0]
tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %C, i8* getelementptr inbounds ([31 x i8], [31 x i8]* @.str1, i64 0, i64 0), i64 31, i1 false)		tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %C, i8* getelementptr inbounds ([31 x i8], [31 x i8]* @.str1, i64 0, i64 0), i64 31, i1 false)
ret void		ret void
}		}

define void @t2(i8* nocapture %C) nounwind {		define void @t2(i8* nocapture %C) nounwind {
entry:		entry:
; CHECK-LABEL: t2:		; CHECK-LABEL: t2:
; CHECK: mov [[REG3:w[0-9]+]]		; CHECK: mov [[REG3:w[0-9]+]]
; CHECK: movk [[REG3]],		; CHECK: movk [[REG3]],
; CHECK: str [[REG3]], [x0, #32]		; CHECK: str [[REG3]], [x0, #32]
; CHECK: ldp [[DEST1:q[0-9]+]], [[DEST2:q[0-9]+]], [x{{[0-9]+}}]		; CHECK: ldp [[DEST1:q[0-9]+]], [[DEST2:q[0-9]+]], [x{{[0-9]+}}]
; CHECK: stp [[DEST1]], [[DEST2]], [x0]		; CHECK: stp [[DEST1]], [[DEST2]], [x0]
tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %C, i8* getelementptr inbounds ([36 x i8], [36 x i8]* @.str2, i64 0, i64 0), i64 36, i1 false)		tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %C, i8* getelementptr inbounds ([36 x i8], [36 x i8]* @.str2, i64 0, i64 0), i64 36, i1 false)
ret void		ret void
}		}

define void @t3(i8* nocapture %C) nounwind {		define void @t3(i8* nocapture %C) nounwind {
entry:		entry:
; CHECK-LABEL: t3:		; CHECK-LABEL: t3:
; CHECK: ldr [[DEST:q[0-9]+]], [x[[BASEREG]]]		; CHECK: ldr [[DEST:q[0-9]+]], [x[[BASEREG]]]
		; CHECK: str [[DEST]], [x0]
; CHECK: ldr [[REG4:x[0-9]+]], [x[[BASEREG:[0-9]+]], #16]		; CHECK: ldr [[REG4:x[0-9]+]], [x[[BASEREG:[0-9]+]], #16]
; CHECK: str [[REG4]], [x0, #16]		; CHECK: str [[REG4]], [x0, #16]
; CHECK: str [[DEST]], [x0]
tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %C, i8* getelementptr inbounds ([24 x i8], [24 x i8]* @.str3, i64 0, i64 0), i64 24, i1 false)		tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %C, i8* getelementptr inbounds ([24 x i8], [24 x i8]* @.str3, i64 0, i64 0), i64 24, i1 false)
ret void		ret void
}		}

define void @t4(i8* nocapture %C) nounwind {		define void @t4(i8* nocapture %C) nounwind {
entry:		entry:
; CHECK-LABEL: t4:		; CHECK-LABEL: t4:
; CHECK: mov [[REG5:w[0-9]+]], #32		; CHECK: mov [[REG5:w[0-9]+]], #32

llvm/test/CodeGen/AMDGPU/GlobalISel/function-returns.ll

Show First 20 Lines • Show All 433 Lines • ▼ Show 20 Lines	define <5 x i32> @v5i32_func_void() #0 {
%val = load volatile <5 x i32>, <5 x i32> addrspace(1)* undef		%val = load volatile <5 x i32>, <5 x i32> addrspace(1)* undef
ret <5 x i32> %val		ret <5 x i32> %val
}		}

define <8 x i32> @v8i32_func_void() #0 {		define <8 x i32> @v8i32_func_void() #0 {
; CHECK-LABEL: name: v8i32_func_void		; CHECK-LABEL: name: v8i32_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<8 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<8 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[LOAD]](p1) :: (load (<8 x s32>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[LOAD]](p1) :: (load (<8 x s32>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<8 x s32>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<8 x s32>)
; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)
; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)		; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)
; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)		; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)
; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)		; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)
; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)		; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)
; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)		; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)
; CHECK-NEXT: $vgpr6 = COPY [[UV6]](s32)		; CHECK-NEXT: $vgpr6 = COPY [[UV6]](s32)
; CHECK-NEXT: $vgpr7 = COPY [[UV7]](s32)		; CHECK-NEXT: $vgpr7 = COPY [[UV7]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3, implicit $vgpr4, implicit $vgpr5, implicit $vgpr6, implicit $vgpr7		; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3, implicit $vgpr4, implicit $vgpr5, implicit $vgpr6, implicit $vgpr7
%ptr = load volatile <8 x i32> addrspace(1)*, <8 x i32> addrspace(1)* addrspace(4)* undef		%ptr = load volatile <8 x i32> addrspace(1)*, <8 x i32> addrspace(1)* addrspace(4)* undef
%val = load <8 x i32>, <8 x i32> addrspace(1)* %ptr		%val = load <8 x i32>, <8 x i32> addrspace(1)* %ptr
ret <8 x i32> %val		ret <8 x i32> %val
}		}

define <16 x i32> @v16i32_func_void() #0 {		define <16 x i32> @v16i32_func_void() #0 {
; CHECK-LABEL: name: v16i32_func_void		; CHECK-LABEL: name: v16i32_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<16 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<16 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<16 x s32>) = G_LOAD [[LOAD]](p1) :: (load (<16 x s32>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<16 x s32>) = G_LOAD [[LOAD]](p1) :: (load (<16 x s32>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), [[UV9:%[0-9]+]]:_(s32), [[UV10:%[0-9]+]]:_(s32), [[UV11:%[0-9]+]]:_(s32), [[UV12:%[0-9]+]]:_(s32), [[UV13:%[0-9]+]]:_(s32), [[UV14:%[0-9]+]]:_(s32), [[UV15:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<16 x s32>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), [[UV9:%[0-9]+]]:_(s32), [[UV10:%[0-9]+]]:_(s32), [[UV11:%[0-9]+]]:_(s32), [[UV12:%[0-9]+]]:_(s32), [[UV13:%[0-9]+]]:_(s32), [[UV14:%[0-9]+]]:_(s32), [[UV15:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<16 x s32>)
; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)
; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)		; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)
; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)		; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)
; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)		; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)
; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)		; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)
; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)		; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)
Show All 12 Lines	define <16 x i32> @v16i32_func_void() #0 {
%val = load <16 x i32>, <16 x i32> addrspace(1)* %ptr		%val = load <16 x i32>, <16 x i32> addrspace(1)* %ptr
ret <16 x i32> %val		ret <16 x i32> %val
}		}

define <32 x i32> @v32i32_func_void() #0 {		define <32 x i32> @v32i32_func_void() #0 {
; CHECK-LABEL: name: v32i32_func_void		; CHECK-LABEL: name: v32i32_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<32 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<32 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[LOAD]](p1) :: (load (<32 x s32>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[LOAD]](p1) :: (load (<32 x s32>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), [[UV9:%[0-9]+]]:_(s32), [[UV10:%[0-9]+]]:_(s32), [[UV11:%[0-9]+]]:_(s32), [[UV12:%[0-9]+]]:_(s32), [[UV13:%[0-9]+]]:_(s32), [[UV14:%[0-9]+]]:_(s32), [[UV15:%[0-9]+]]:_(s32), [[UV16:%[0-9]+]]:_(s32), [[UV17:%[0-9]+]]:_(s32), [[UV18:%[0-9]+]]:_(s32), [[UV19:%[0-9]+]]:_(s32), [[UV20:%[0-9]+]]:_(s32), [[UV21:%[0-9]+]]:_(s32), [[UV22:%[0-9]+]]:_(s32), [[UV23:%[0-9]+]]:_(s32), [[UV24:%[0-9]+]]:_(s32), [[UV25:%[0-9]+]]:_(s32), [[UV26:%[0-9]+]]:_(s32), [[UV27:%[0-9]+]]:_(s32), [[UV28:%[0-9]+]]:_(s32), [[UV29:%[0-9]+]]:_(s32), [[UV30:%[0-9]+]]:_(s32), [[UV31:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<32 x s32>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), [[UV9:%[0-9]+]]:_(s32), [[UV10:%[0-9]+]]:_(s32), [[UV11:%[0-9]+]]:_(s32), [[UV12:%[0-9]+]]:_(s32), [[UV13:%[0-9]+]]:_(s32), [[UV14:%[0-9]+]]:_(s32), [[UV15:%[0-9]+]]:_(s32), [[UV16:%[0-9]+]]:_(s32), [[UV17:%[0-9]+]]:_(s32), [[UV18:%[0-9]+]]:_(s32), [[UV19:%[0-9]+]]:_(s32), [[UV20:%[0-9]+]]:_(s32), [[UV21:%[0-9]+]]:_(s32), [[UV22:%[0-9]+]]:_(s32), [[UV23:%[0-9]+]]:_(s32), [[UV24:%[0-9]+]]:_(s32), [[UV25:%[0-9]+]]:_(s32), [[UV26:%[0-9]+]]:_(s32), [[UV27:%[0-9]+]]:_(s32), [[UV28:%[0-9]+]]:_(s32), [[UV29:%[0-9]+]]:_(s32), [[UV30:%[0-9]+]]:_(s32), [[UV31:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<32 x s32>)
; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)
; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)		; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)
; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)		; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)
; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)		; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)
; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)		; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)
; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)		; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)
▲ Show 20 Lines • Show All 43 Lines • ▼ Show 20 Lines	define <2 x i64> @v2i64_func_void() #0 {
%val = load <2 x i64>, <2 x i64> addrspace(1)* undef		%val = load <2 x i64>, <2 x i64> addrspace(1)* undef
ret <2 x i64> %val		ret <2 x i64> %val
}		}

define <3 x i64> @v3i64_func_void() #0 {		define <3 x i64> @v3i64_func_void() #0 {
; CHECK-LABEL: name: v3i64_func_void		; CHECK-LABEL: name: v3i64_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<3 x i64> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<3 x i64> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<3 x s64>) = G_LOAD [[LOAD]](p1) :: (load (<3 x s64>) from %ir.ptr, align 32, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<3 x s64>) = G_LOAD [[LOAD]](p1) :: (load (<3 x s64>) from %ir.ptr, align 32, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<3 x s64>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<3 x s64>)
; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)
; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)		; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)
; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)		; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)
; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)		; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)
; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)		; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)
; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)		; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3, implicit $vgpr4, implicit $vgpr5		; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3, implicit $vgpr4, implicit $vgpr5
%ptr = load volatile <3 x i64> addrspace(1)*, <3 x i64> addrspace(1)* addrspace(4)* undef		%ptr = load volatile <3 x i64> addrspace(1)*, <3 x i64> addrspace(1)* addrspace(4)* undef
%val = load <3 x i64>, <3 x i64> addrspace(1)* %ptr		%val = load <3 x i64>, <3 x i64> addrspace(1)* %ptr
ret <3 x i64> %val		ret <3 x i64> %val
}		}

define <4 x i64> @v4i64_func_void() #0 {		define <4 x i64> @v4i64_func_void() #0 {
; CHECK-LABEL: name: v4i64_func_void		; CHECK-LABEL: name: v4i64_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<4 x i64> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<4 x i64> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<4 x s64>) = G_LOAD [[LOAD]](p1) :: (load (<4 x s64>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<4 x s64>) = G_LOAD [[LOAD]](p1) :: (load (<4 x s64>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<4 x s64>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<4 x s64>)
; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)
; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)		; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)
; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)		; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)
; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)		; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)
; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)		; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)
; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)		; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)
; CHECK-NEXT: $vgpr6 = COPY [[UV6]](s32)		; CHECK-NEXT: $vgpr6 = COPY [[UV6]](s32)
; CHECK-NEXT: $vgpr7 = COPY [[UV7]](s32)		; CHECK-NEXT: $vgpr7 = COPY [[UV7]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3, implicit $vgpr4, implicit $vgpr5, implicit $vgpr6, implicit $vgpr7		; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3, implicit $vgpr4, implicit $vgpr5, implicit $vgpr6, implicit $vgpr7
%ptr = load volatile <4 x i64> addrspace(1)*, <4 x i64> addrspace(1)* addrspace(4)* undef		%ptr = load volatile <4 x i64> addrspace(1)*, <4 x i64> addrspace(1)* addrspace(4)* undef
%val = load <4 x i64>, <4 x i64> addrspace(1)* %ptr		%val = load <4 x i64>, <4 x i64> addrspace(1)* %ptr
ret <4 x i64> %val		ret <4 x i64> %val
}		}

define <5 x i64> @v5i64_func_void() #0 {		define <5 x i64> @v5i64_func_void() #0 {
; CHECK-LABEL: name: v5i64_func_void		; CHECK-LABEL: name: v5i64_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<5 x i64> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<5 x i64> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<5 x s64>) = G_LOAD [[LOAD]](p1) :: (load (<5 x s64>) from %ir.ptr, align 64, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<5 x s64>) = G_LOAD [[LOAD]](p1) :: (load (<5 x s64>) from %ir.ptr, align 64, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), [[UV9:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<5 x s64>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), [[UV9:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<5 x s64>)
; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)
; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)		; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)
; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)		; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)
; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)		; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)
; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)		; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)
; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)		; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)
; CHECK-NEXT: $vgpr6 = COPY [[UV6]](s32)		; CHECK-NEXT: $vgpr6 = COPY [[UV6]](s32)
; CHECK-NEXT: $vgpr7 = COPY [[UV7]](s32)		; CHECK-NEXT: $vgpr7 = COPY [[UV7]](s32)
; CHECK-NEXT: $vgpr8 = COPY [[UV8]](s32)		; CHECK-NEXT: $vgpr8 = COPY [[UV8]](s32)
; CHECK-NEXT: $vgpr9 = COPY [[UV9]](s32)		; CHECK-NEXT: $vgpr9 = COPY [[UV9]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3, implicit $vgpr4, implicit $vgpr5, implicit $vgpr6, implicit $vgpr7, implicit $vgpr8, implicit $vgpr9		; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3, implicit $vgpr4, implicit $vgpr5, implicit $vgpr6, implicit $vgpr7, implicit $vgpr8, implicit $vgpr9
%ptr = load volatile <5 x i64> addrspace(1)*, <5 x i64> addrspace(1)* addrspace(4)* undef		%ptr = load volatile <5 x i64> addrspace(1)*, <5 x i64> addrspace(1)* addrspace(4)* undef
%val = load <5 x i64>, <5 x i64> addrspace(1)* %ptr		%val = load <5 x i64>, <5 x i64> addrspace(1)* %ptr
ret <5 x i64> %val		ret <5 x i64> %val
}		}

define <8 x i64> @v8i64_func_void() #0 {		define <8 x i64> @v8i64_func_void() #0 {
; CHECK-LABEL: name: v8i64_func_void		; CHECK-LABEL: name: v8i64_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<8 x i64> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<8 x i64> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<8 x s64>) = G_LOAD [[LOAD]](p1) :: (load (<8 x s64>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<8 x s64>) = G_LOAD [[LOAD]](p1) :: (load (<8 x s64>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), [[UV9:%[0-9]+]]:_(s32), [[UV10:%[0-9]+]]:_(s32), [[UV11:%[0-9]+]]:_(s32), [[UV12:%[0-9]+]]:_(s32), [[UV13:%[0-9]+]]:_(s32), [[UV14:%[0-9]+]]:_(s32), [[UV15:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<8 x s64>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), [[UV9:%[0-9]+]]:_(s32), [[UV10:%[0-9]+]]:_(s32), [[UV11:%[0-9]+]]:_(s32), [[UV12:%[0-9]+]]:_(s32), [[UV13:%[0-9]+]]:_(s32), [[UV14:%[0-9]+]]:_(s32), [[UV15:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<8 x s64>)
; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)
; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)		; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)
; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)		; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)
; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)		; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)
; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)		; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)
; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)		; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)
Show All 12 Lines	define <8 x i64> @v8i64_func_void() #0 {
%val = load <8 x i64>, <8 x i64> addrspace(1)* %ptr		%val = load <8 x i64>, <8 x i64> addrspace(1)* %ptr
ret <8 x i64> %val		ret <8 x i64> %val
}		}

define <16 x i64> @v16i64_func_void() #0 {		define <16 x i64> @v16i64_func_void() #0 {
; CHECK-LABEL: name: v16i64_func_void		; CHECK-LABEL: name: v16i64_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<16 x i64> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<16 x i64> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<16 x s64>) = G_LOAD [[LOAD]](p1) :: (load (<16 x s64>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<16 x s64>) = G_LOAD [[LOAD]](p1) :: (load (<16 x s64>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), [[UV9:%[0-9]+]]:_(s32), [[UV10:%[0-9]+]]:_(s32), [[UV11:%[0-9]+]]:_(s32), [[UV12:%[0-9]+]]:_(s32), [[UV13:%[0-9]+]]:_(s32), [[UV14:%[0-9]+]]:_(s32), [[UV15:%[0-9]+]]:_(s32), [[UV16:%[0-9]+]]:_(s32), [[UV17:%[0-9]+]]:_(s32), [[UV18:%[0-9]+]]:_(s32), [[UV19:%[0-9]+]]:_(s32), [[UV20:%[0-9]+]]:_(s32), [[UV21:%[0-9]+]]:_(s32), [[UV22:%[0-9]+]]:_(s32), [[UV23:%[0-9]+]]:_(s32), [[UV24:%[0-9]+]]:_(s32), [[UV25:%[0-9]+]]:_(s32), [[UV26:%[0-9]+]]:_(s32), [[UV27:%[0-9]+]]:_(s32), [[UV28:%[0-9]+]]:_(s32), [[UV29:%[0-9]+]]:_(s32), [[UV30:%[0-9]+]]:_(s32), [[UV31:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<16 x s64>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32), [[UV2:%[0-9]+]]:_(s32), [[UV3:%[0-9]+]]:_(s32), [[UV4:%[0-9]+]]:_(s32), [[UV5:%[0-9]+]]:_(s32), [[UV6:%[0-9]+]]:_(s32), [[UV7:%[0-9]+]]:_(s32), [[UV8:%[0-9]+]]:_(s32), [[UV9:%[0-9]+]]:_(s32), [[UV10:%[0-9]+]]:_(s32), [[UV11:%[0-9]+]]:_(s32), [[UV12:%[0-9]+]]:_(s32), [[UV13:%[0-9]+]]:_(s32), [[UV14:%[0-9]+]]:_(s32), [[UV15:%[0-9]+]]:_(s32), [[UV16:%[0-9]+]]:_(s32), [[UV17:%[0-9]+]]:_(s32), [[UV18:%[0-9]+]]:_(s32), [[UV19:%[0-9]+]]:_(s32), [[UV20:%[0-9]+]]:_(s32), [[UV21:%[0-9]+]]:_(s32), [[UV22:%[0-9]+]]:_(s32), [[UV23:%[0-9]+]]:_(s32), [[UV24:%[0-9]+]]:_(s32), [[UV25:%[0-9]+]]:_(s32), [[UV26:%[0-9]+]]:_(s32), [[UV27:%[0-9]+]]:_(s32), [[UV28:%[0-9]+]]:_(s32), [[UV29:%[0-9]+]]:_(s32), [[UV30:%[0-9]+]]:_(s32), [[UV31:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](<16 x s64>)
; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[UV]](s32)
; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)		; CHECK-NEXT: $vgpr1 = COPY [[UV1]](s32)
; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)		; CHECK-NEXT: $vgpr2 = COPY [[UV2]](s32)
; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)		; CHECK-NEXT: $vgpr3 = COPY [[UV3]](s32)
; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)		; CHECK-NEXT: $vgpr4 = COPY [[UV4]](s32)
; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)		; CHECK-NEXT: $vgpr5 = COPY [[UV5]](s32)
▲ Show 20 Lines • Show All 92 Lines • ▼ Show 20 Lines	define <4 x half> @v4f16_func_void() #0 {
%val = load <4 x half>, <4 x half> addrspace(1)* undef		%val = load <4 x half>, <4 x half> addrspace(1)* undef
ret <4 x half> %val		ret <4 x half> %val
}		}

define <5 x i16> @v5i16_func_void() #0 {		define <5 x i16> @v5i16_func_void() #0 {
; CHECK-LABEL: name: v5i16_func_void		; CHECK-LABEL: name: v5i16_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<5 x i16> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<5 x i16> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<5 x s16>) = G_LOAD [[LOAD]](p1) :: (load (<5 x s16>) from %ir.ptr, align 16, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<5 x s16>) = G_LOAD [[LOAD]](p1) :: (load (<5 x s16>) from %ir.ptr, align 16, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s16), [[UV1:%[0-9]+]]:_(s16), [[UV2:%[0-9]+]]:_(s16), [[UV3:%[0-9]+]]:_(s16), [[UV4:%[0-9]+]]:_(s16) = G_UNMERGE_VALUES [[LOAD1]](<5 x s16>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(s16), [[UV1:%[0-9]+]]:_(s16), [[UV2:%[0-9]+]]:_(s16), [[UV3:%[0-9]+]]:_(s16), [[UV4:%[0-9]+]]:_(s16) = G_UNMERGE_VALUES [[LOAD1]](<5 x s16>)
; CHECK-NEXT: [[DEF1:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF1:%[0-9]+]]:_(s16) = G_IMPLICIT_DEF
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<6 x s16>) = G_BUILD_VECTOR [[UV]](s16), [[UV1]](s16), [[UV2]](s16), [[UV3]](s16), [[UV4]](s16), [[DEF1]](s16)		; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<6 x s16>) = G_BUILD_VECTOR [[UV]](s16), [[UV1]](s16), [[UV2]](s16), [[UV3]](s16), [[UV4]](s16), [[DEF1]](s16)
; CHECK-NEXT: [[UV5:%[0-9]+]]:_(<2 x s16>), [[UV6:%[0-9]+]]:_(<2 x s16>), [[UV7:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[BUILD_VECTOR]](<6 x s16>)		; CHECK-NEXT: [[UV5:%[0-9]+]]:_(<2 x s16>), [[UV6:%[0-9]+]]:_(<2 x s16>), [[UV7:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[BUILD_VECTOR]](<6 x s16>)
; CHECK-NEXT: $vgpr0 = COPY [[UV5]](<2 x s16>)		; CHECK-NEXT: $vgpr0 = COPY [[UV5]](<2 x s16>)
; CHECK-NEXT: $vgpr1 = COPY [[UV6]](<2 x s16>)		; CHECK-NEXT: $vgpr1 = COPY [[UV6]](<2 x s16>)
; CHECK-NEXT: $vgpr2 = COPY [[UV7]](<2 x s16>)		; CHECK-NEXT: $vgpr2 = COPY [[UV7]](<2 x s16>)
; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2		; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2
%ptr = load volatile <5 x i16> addrspace(1), <5 x i16> addrspace(1) addrspace(4)* undef		%ptr = load volatile <5 x i16> addrspace(1), <5 x i16> addrspace(1) addrspace(4)* undef
%val = load <5 x i16>, <5 x i16> addrspace(1)* %ptr		%val = load <5 x i16>, <5 x i16> addrspace(1)* %ptr
ret <5 x i16> %val		ret <5 x i16> %val
}		}

define <8 x i16> @v8i16_func_void() #0 {		define <8 x i16> @v8i16_func_void() #0 {
; CHECK-LABEL: name: v8i16_func_void		; CHECK-LABEL: name: v8i16_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<8 x i16> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<8 x i16> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<8 x s16>) = G_LOAD [[LOAD]](p1) :: (load (<8 x s16>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<8 x s16>) = G_LOAD [[LOAD]](p1) :: (load (<8 x s16>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>), [[UV2:%[0-9]+]]:_(<2 x s16>), [[UV3:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[LOAD1]](<8 x s16>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>), [[UV2:%[0-9]+]]:_(<2 x s16>), [[UV3:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[LOAD1]](<8 x s16>)
; CHECK-NEXT: $vgpr0 = COPY [[UV]](<2 x s16>)		; CHECK-NEXT: $vgpr0 = COPY [[UV]](<2 x s16>)
; CHECK-NEXT: $vgpr1 = COPY [[UV1]](<2 x s16>)		; CHECK-NEXT: $vgpr1 = COPY [[UV1]](<2 x s16>)
; CHECK-NEXT: $vgpr2 = COPY [[UV2]](<2 x s16>)		; CHECK-NEXT: $vgpr2 = COPY [[UV2]](<2 x s16>)
; CHECK-NEXT: $vgpr3 = COPY [[UV3]](<2 x s16>)		; CHECK-NEXT: $vgpr3 = COPY [[UV3]](<2 x s16>)
; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3		; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3
%ptr = load volatile <8 x i16> addrspace(1), <8 x i16> addrspace(1) addrspace(4)* undef		%ptr = load volatile <8 x i16> addrspace(1), <8 x i16> addrspace(1) addrspace(4)* undef
%val = load <8 x i16>, <8 x i16> addrspace(1)* %ptr		%val = load <8 x i16>, <8 x i16> addrspace(1)* %ptr
ret <8 x i16> %val		ret <8 x i16> %val
}		}

define <16 x i16> @v16i16_func_void() #0 {		define <16 x i16> @v16i16_func_void() #0 {
; CHECK-LABEL: name: v16i16_func_void		; CHECK-LABEL: name: v16i16_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<16 x i16> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<16 x i16> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<16 x s16>) = G_LOAD [[LOAD]](p1) :: (load (<16 x s16>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<16 x s16>) = G_LOAD [[LOAD]](p1) :: (load (<16 x s16>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>), [[UV2:%[0-9]+]]:_(<2 x s16>), [[UV3:%[0-9]+]]:_(<2 x s16>), [[UV4:%[0-9]+]]:_(<2 x s16>), [[UV5:%[0-9]+]]:_(<2 x s16>), [[UV6:%[0-9]+]]:_(<2 x s16>), [[UV7:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[LOAD1]](<16 x s16>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(<2 x s16>), [[UV1:%[0-9]+]]:_(<2 x s16>), [[UV2:%[0-9]+]]:_(<2 x s16>), [[UV3:%[0-9]+]]:_(<2 x s16>), [[UV4:%[0-9]+]]:_(<2 x s16>), [[UV5:%[0-9]+]]:_(<2 x s16>), [[UV6:%[0-9]+]]:_(<2 x s16>), [[UV7:%[0-9]+]]:_(<2 x s16>) = G_UNMERGE_VALUES [[LOAD1]](<16 x s16>)
; CHECK-NEXT: $vgpr0 = COPY [[UV]](<2 x s16>)		; CHECK-NEXT: $vgpr0 = COPY [[UV]](<2 x s16>)
; CHECK-NEXT: $vgpr1 = COPY [[UV1]](<2 x s16>)		; CHECK-NEXT: $vgpr1 = COPY [[UV1]](<2 x s16>)
; CHECK-NEXT: $vgpr2 = COPY [[UV2]](<2 x s16>)		; CHECK-NEXT: $vgpr2 = COPY [[UV2]](<2 x s16>)
; CHECK-NEXT: $vgpr3 = COPY [[UV3]](<2 x s16>)		; CHECK-NEXT: $vgpr3 = COPY [[UV3]](<2 x s16>)
; CHECK-NEXT: $vgpr4 = COPY [[UV4]](<2 x s16>)		; CHECK-NEXT: $vgpr4 = COPY [[UV4]](<2 x s16>)
; CHECK-NEXT: $vgpr5 = COPY [[UV5]](<2 x s16>)		; CHECK-NEXT: $vgpr5 = COPY [[UV5]](<2 x s16>)
; CHECK-NEXT: $vgpr6 = COPY [[UV6]](<2 x s16>)		; CHECK-NEXT: $vgpr6 = COPY [[UV6]](<2 x s16>)
; CHECK-NEXT: $vgpr7 = COPY [[UV7]](<2 x s16>)		; CHECK-NEXT: $vgpr7 = COPY [[UV7]](<2 x s16>)
; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3, implicit $vgpr4, implicit $vgpr5, implicit $vgpr6, implicit $vgpr7		; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2, implicit $vgpr3, implicit $vgpr4, implicit $vgpr5, implicit $vgpr6, implicit $vgpr7
%ptr = load volatile <16 x i16> addrspace(1), <16 x i16> addrspace(1) addrspace(4)* undef		%ptr = load volatile <16 x i16> addrspace(1), <16 x i16> addrspace(1) addrspace(4)* undef
%val = load <16 x i16>, <16 x i16> addrspace(1)* %ptr		%val = load <16 x i16>, <16 x i16> addrspace(1)* %ptr
ret <16 x i16> %val		ret <16 x i16> %val
}		}

define <16 x i8> @v16i8_func_void() #0 {		define <16 x i8> @v16i8_func_void() #0 {
; CHECK-LABEL: name: v16i8_func_void		; CHECK-LABEL: name: v16i8_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<16 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<16 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[LOAD]](p1) :: (load (<16 x s8>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[LOAD]](p1) :: (load (<16 x s8>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8), [[UV2:%[0-9]+]]:_(s8), [[UV3:%[0-9]+]]:_(s8), [[UV4:%[0-9]+]]:_(s8), [[UV5:%[0-9]+]]:_(s8), [[UV6:%[0-9]+]]:_(s8), [[UV7:%[0-9]+]]:_(s8), [[UV8:%[0-9]+]]:_(s8), [[UV9:%[0-9]+]]:_(s8), [[UV10:%[0-9]+]]:_(s8), [[UV11:%[0-9]+]]:_(s8), [[UV12:%[0-9]+]]:_(s8), [[UV13:%[0-9]+]]:_(s8), [[UV14:%[0-9]+]]:_(s8), [[UV15:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[LOAD1]](<16 x s8>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8), [[UV2:%[0-9]+]]:_(s8), [[UV3:%[0-9]+]]:_(s8), [[UV4:%[0-9]+]]:_(s8), [[UV5:%[0-9]+]]:_(s8), [[UV6:%[0-9]+]]:_(s8), [[UV7:%[0-9]+]]:_(s8), [[UV8:%[0-9]+]]:_(s8), [[UV9:%[0-9]+]]:_(s8), [[UV10:%[0-9]+]]:_(s8), [[UV11:%[0-9]+]]:_(s8), [[UV12:%[0-9]+]]:_(s8), [[UV13:%[0-9]+]]:_(s8), [[UV14:%[0-9]+]]:_(s8), [[UV15:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[LOAD1]](<16 x s8>)
; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[UV]](s8)		; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[UV]](s8)
; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s16) = G_ANYEXT [[UV1]](s8)		; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s16) = G_ANYEXT [[UV1]](s8)
; CHECK-NEXT: [[ANYEXT2:%[0-9]+]]:_(s16) = G_ANYEXT [[UV2]](s8)		; CHECK-NEXT: [[ANYEXT2:%[0-9]+]]:_(s16) = G_ANYEXT [[UV2]](s8)
; CHECK-NEXT: [[ANYEXT3:%[0-9]+]]:_(s16) = G_ANYEXT [[UV3]](s8)		; CHECK-NEXT: [[ANYEXT3:%[0-9]+]]:_(s16) = G_ANYEXT [[UV3]](s8)
; CHECK-NEXT: [[ANYEXT4:%[0-9]+]]:_(s16) = G_ANYEXT [[UV4]](s8)		; CHECK-NEXT: [[ANYEXT4:%[0-9]+]]:_(s16) = G_ANYEXT [[UV4]](s8)
; CHECK-NEXT: [[ANYEXT5:%[0-9]+]]:_(s16) = G_ANYEXT [[UV5]](s8)		; CHECK-NEXT: [[ANYEXT5:%[0-9]+]]:_(s16) = G_ANYEXT [[UV5]](s8)
[81 lines hidden]	define <3 x i8> @v3i8_func_void() #0 {
%val = load <3 x i8>, <3 x i8> addrspace(1)* undef		%val = load <3 x i8>, <3 x i8> addrspace(1)* undef
ret <3 x i8> %val		ret <3 x i8> %val
}		}

define <4 x i8> @v4i8_func_void() #0 {		define <4 x i8> @v4i8_func_void() #0 {
; CHECK-LABEL: name: v4i8_func_void		; CHECK-LABEL: name: v4i8_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<4 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<4 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[LOAD]](p1) :: (load (<4 x s8>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[LOAD]](p1) :: (load (<4 x s8>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8), [[UV2:%[0-9]+]]:_(s8), [[UV3:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[LOAD1]](<4 x s8>)		; CHECK-NEXT: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8), [[UV2:%[0-9]+]]:_(s8), [[UV3:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[LOAD1]](<4 x s8>)
; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[UV]](s8)		; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[UV]](s8)
; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s16) = G_ANYEXT [[UV1]](s8)		; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s16) = G_ANYEXT [[UV1]](s8)
; CHECK-NEXT: [[ANYEXT2:%[0-9]+]]:_(s16) = G_ANYEXT [[UV2]](s8)		; CHECK-NEXT: [[ANYEXT2:%[0-9]+]]:_(s16) = G_ANYEXT [[UV2]](s8)
; CHECK-NEXT: [[ANYEXT3:%[0-9]+]]:_(s16) = G_ANYEXT [[UV3]](s8)		; CHECK-NEXT: [[ANYEXT3:%[0-9]+]]:_(s16) = G_ANYEXT [[UV3]](s8)
; CHECK-NEXT: [[ANYEXT4:%[0-9]+]]:_(s32) = G_ANYEXT [[ANYEXT]](s16)		; CHECK-NEXT: [[ANYEXT4:%[0-9]+]]:_(s32) = G_ANYEXT [[ANYEXT]](s16)
; CHECK-NEXT: $vgpr0 = COPY [[ANYEXT4]](s32)		; CHECK-NEXT: $vgpr0 = COPY [[ANYEXT4]](s32)
[55 lines hidden]

define <33 x i32> @v33i32_func_void() #0 {		define <33 x i32> @v33i32_func_void() #0 {
; CHECK-LABEL: name: v33i32_func_void		; CHECK-LABEL: name: v33i32_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: liveins: $vgpr0		; CHECK-NEXT: liveins: $vgpr0
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `<33 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `<33 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<33 x s32>) = G_LOAD [[LOAD]](p1) :: (load (<33 x s32>) from %ir.ptr, align 256, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<33 x s32>) = G_LOAD [[LOAD]](p1) :: (load (<33 x s32>) from %ir.ptr, align 256, addrspace 1)
; CHECK-NEXT: G_STORE [[LOAD1]](<33 x s32>), [[COPY]](p5) :: (store (<33 x s32>), align 256, addrspace 5)		; CHECK-NEXT: G_STORE [[LOAD1]](<33 x s32>), [[COPY]](p5) :: (store (<33 x s32>), align 256, addrspace 5)
; CHECK-NEXT: SI_RETURN		; CHECK-NEXT: SI_RETURN
%ptr = load volatile <33 x i32> addrspace(1), <33 x i32> addrspace(1) addrspace(4)* undef		%ptr = load volatile <33 x i32> addrspace(1), <33 x i32> addrspace(1) addrspace(4)* undef
%val = load <33 x i32>, <33 x i32> addrspace(1)* %ptr		%val = load <33 x i32>, <33 x i32> addrspace(1)* %ptr
ret <33 x i32> %val		ret <33 x i32> %val
}		}

[22 lines hidden]

define { <32 x i32>, i32 } @struct_v32i32_i32_func_void() #0 {		define { <32 x i32>, i32 } @struct_v32i32_i32_func_void() #0 {
; CHECK-LABEL: name: struct_v32i32_i32_func_void		; CHECK-LABEL: name: struct_v32i32_i32_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: liveins: $vgpr0		; CHECK-NEXT: liveins: $vgpr0
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `{ <32 x i32>, i32 } addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `{ <32 x i32>, i32 } addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[LOAD]](p1) :: (load (<32 x s32>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[LOAD]](p1) :: (load (<32 x s32>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 128		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 128
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)
; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load (s32) from %ir.ptr + 128, align 128, addrspace 1)		; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load (s32) from %ir.ptr + 128, align 128, addrspace 1)
; CHECK-NEXT: G_STORE [[LOAD1]](<32 x s32>), [[COPY]](p5) :: (store (<32 x s32>), addrspace 5)		; CHECK-NEXT: G_STORE [[LOAD1]](<32 x s32>), [[COPY]](p5) :: (store (<32 x s32>), addrspace 5)
; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 128		; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 128
; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C1]](s32)		; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C1]](s32)
; CHECK-NEXT: G_STORE [[LOAD2]](s32), [[PTR_ADD1]](p5) :: (store (s32), align 128, addrspace 5)		; CHECK-NEXT: G_STORE [[LOAD2]](s32), [[PTR_ADD1]](p5) :: (store (s32), align 128, addrspace 5)
; CHECK-NEXT: SI_RETURN		; CHECK-NEXT: SI_RETURN
%ptr = load volatile { <32 x i32>, i32 } addrspace(1), { <32 x i32>, i32 } addrspace(1) addrspace(4)* undef		%ptr = load volatile { <32 x i32>, i32 } addrspace(1), { <32 x i32>, i32 } addrspace(1) addrspace(4)* undef
%val = load { <32 x i32>, i32 }, { <32 x i32>, i32 } addrspace(1)* %ptr		%val = load { <32 x i32>, i32 }, { <32 x i32>, i32 } addrspace(1)* %ptr
ret { <32 x i32>, i32 }%val		ret { <32 x i32>, i32 }%val
}		}

define { i32, <32 x i32> } @struct_i32_v32i32_func_void() #0 {		define { i32, <32 x i32> } @struct_i32_v32i32_func_void() #0 {
; CHECK-LABEL: name: struct_i32_v32i32_func_void		; CHECK-LABEL: name: struct_i32_v32i32_func_void
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: liveins: $vgpr0		; CHECK-NEXT: liveins: $vgpr0
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile load (p1) from `{ i32, <32 x i32> } addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (volatile dereferenceable invariant load (p1) from `{ i32, <32 x i32> } addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[LOAD]](p1) :: (load (s32) from %ir.ptr, align 128, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[LOAD]](p1) :: (load (s32) from %ir.ptr, align 128, addrspace 1)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 128		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 128
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)
; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[PTR_ADD]](p1) :: (load (<32 x s32>) from %ir.ptr + 128, addrspace 1)		; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[PTR_ADD]](p1) :: (load (<32 x s32>) from %ir.ptr + 128, addrspace 1)
; CHECK-NEXT: G_STORE [[LOAD1]](s32), [[COPY]](p5) :: (store (s32), align 128, addrspace 5)		; CHECK-NEXT: G_STORE [[LOAD1]](s32), [[COPY]](p5) :: (store (s32), align 128, addrspace 5)
; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 128		; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 128
; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C1]](s32)		; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C1]](s32)
; CHECK-NEXT: G_STORE [[LOAD2]](<32 x s32>), [[PTR_ADD1]](p5) :: (store (<32 x s32>), addrspace 5)		; CHECK-NEXT: G_STORE [[LOAD2]](<32 x s32>), [[PTR_ADD1]](p5) :: (store (<32 x s32>), addrspace 5)
[275 lines hidden]

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-amdgpu_vs.ll

	[29 lines hidden]
	define amdgpu_vs void @test_ptr2_inreg(i32 addrspace(4)* inreg %arg0) {			define amdgpu_vs void @test_ptr2_inreg(i32 addrspace(4)* inreg %arg0) {
	; CHECK-LABEL: name: test_ptr2_inreg			; CHECK-LABEL: name: test_ptr2_inreg
	; CHECK: bb.1 (%ir-block.0):			; CHECK: bb.1 (%ir-block.0):
	; CHECK-NEXT: liveins: $sgpr2, $sgpr3			; CHECK-NEXT: liveins: $sgpr2, $sgpr3
	; CHECK-NEXT: {{ $}}			; CHECK-NEXT: {{ $}}
	; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $sgpr2			; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $sgpr2
	; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $sgpr3			; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $sgpr3
	; CHECK-NEXT: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[COPY]](s32), [[COPY1]](s32)			; CHECK-NEXT: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[COPY]](s32), [[COPY1]](s32)
	; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[MV]](p4) :: (volatile load (s32) from %ir.arg0, addrspace 4)			; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[MV]](p4) :: (volatile dereferenceable invariant load (s32) from %ir.arg0, addrspace 4)
	; CHECK-NEXT: S_ENDPGM 0			; CHECK-NEXT: S_ENDPGM 0
	%tmp0 = load volatile i32, i32 addrspace(4)* %arg0			%tmp0 = load volatile i32, i32 addrspace(4)* %arg0
	ret void			ret void
	}			}

	define amdgpu_vs void @test_sgpr_alignment0(float inreg %arg0, i32 addrspace(4)* inreg %arg1) {			define amdgpu_vs void @test_sgpr_alignment0(float inreg %arg0, i32 addrspace(4)* inreg %arg1) {
	; CHECK-LABEL: name: test_sgpr_alignment0			; CHECK-LABEL: name: test_sgpr_alignment0
	; CHECK: bb.1 (%ir-block.0):			; CHECK: bb.1 (%ir-block.0):
	; CHECK-NEXT: liveins: $sgpr2, $sgpr3, $sgpr4			; CHECK-NEXT: liveins: $sgpr2, $sgpr3, $sgpr4
	; CHECK-NEXT: {{ $}}			; CHECK-NEXT: {{ $}}
	; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $sgpr2			; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $sgpr2
	; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $sgpr3			; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $sgpr3
	; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $sgpr4			; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $sgpr4
	; CHECK-NEXT: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[COPY1]](s32), [[COPY2]](s32)			; CHECK-NEXT: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[COPY1]](s32), [[COPY2]](s32)
	; CHECK-NEXT: [[DEF:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF			; CHECK-NEXT: [[DEF:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF
	; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[MV]](p4) :: (volatile load (s32) from %ir.arg1, addrspace 4)			; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[MV]](p4) :: (volatile dereferenceable invariant load (s32) from %ir.arg1, addrspace 4)
	; CHECK-NEXT: G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.exp), 32, 15, [[COPY]](s32), [[DEF]](s32), [[DEF]](s32), [[DEF]](s32), 0, 0			; CHECK-NEXT: G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.exp), 32, 15, [[COPY]](s32), [[DEF]](s32), [[DEF]](s32), [[DEF]](s32), 0, 0
	; CHECK-NEXT: S_ENDPGM 0			; CHECK-NEXT: S_ENDPGM 0
	%tmp0 = load volatile i32, i32 addrspace(4)* %arg1			%tmp0 = load volatile i32, i32 addrspace(4)* %arg1
	call void @llvm.amdgcn.exp.f32(i32 32, i32 15, float %arg0, float undef, float undef, float undef, i1 false, i1 false) #0			call void @llvm.amdgcn.exp.f32(i32 32, i32 15, float %arg0, float undef, float undef, float undef, i1 false, i1 false) #0
	ret void			ret void
	}			}

	define amdgpu_vs void @test_order(float inreg %arg0, float inreg %arg1, float %arg2, float %arg3) {			define amdgpu_vs void @test_order(float inreg %arg0, float inreg %arg1, float %arg2, float %arg3) {
	[46 lines hidden]

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call-non-fixed.ll

[58 lines hidden]	define amdgpu_gfx void @test_gfx_call_external_void_func_i32_imm_inreg(i32 inreg) #0 {
call amdgpu_gfx void @external_gfx_void_func_i32_inreg(i32 inreg 42)		call amdgpu_gfx void @external_gfx_void_func_i32_inreg(i32 inreg 42)
ret void		ret void
}		}

define amdgpu_gfx void @test_gfx_call_external_void_func_struct_i8_i32() #0 {		define amdgpu_gfx void @test_gfx_call_external_void_func_struct_i8_i32() #0 {
; CHECK-LABEL: name: test_gfx_call_external_void_func_struct_i8_i32		; CHECK-LABEL: name: test_gfx_call_external_void_func_struct_i8_i32
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `{ i8, i32 } addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `{ i8, i32 } addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s8) = G_LOAD [[LOAD]](p1) :: (load (s8) from %ir.ptr0, align 4, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s8) = G_LOAD [[LOAD]](p1) :: (load (s8) from %ir.ptr0, align 4, addrspace 1)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)
; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load (s32) from %ir.ptr0 + 4, addrspace 1)		; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load (s32) from %ir.ptr0 + 4, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_gfx_void_func_struct_i8_i32		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_gfx_void_func_struct_i8_i32
; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[LOAD1]](s8)		; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[LOAD1]](s8)
; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[ANYEXT]](s16)		; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[ANYEXT]](s16)
[9 lines hidden]	define amdgpu_gfx void @test_gfx_call_external_void_func_struct_i8_i32() #0 {
call amdgpu_gfx void @external_gfx_void_func_struct_i8_i32({ i8, i32 } %val)		call amdgpu_gfx void @external_gfx_void_func_struct_i8_i32({ i8, i32 } %val)
ret void		ret void
}		}

define amdgpu_gfx void @test_gfx_call_external_void_func_struct_i8_i32_inreg() #0 {		define amdgpu_gfx void @test_gfx_call_external_void_func_struct_i8_i32_inreg() #0 {
; CHECK-LABEL: name: test_gfx_call_external_void_func_struct_i8_i32_inreg		; CHECK-LABEL: name: test_gfx_call_external_void_func_struct_i8_i32_inreg
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `{ i8, i32 } addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `{ i8, i32 } addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s8) = G_LOAD [[LOAD]](p1) :: (load (s8) from %ir.ptr0, align 4, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s8) = G_LOAD [[LOAD]](p1) :: (load (s8) from %ir.ptr0, align 4, addrspace 1)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)
; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load (s32) from %ir.ptr0 + 4, addrspace 1)		; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load (s32) from %ir.ptr0 + 4, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_gfx_void_func_struct_i8_i32_inreg		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_gfx_void_func_struct_i8_i32_inreg
; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[LOAD1]](s8)		; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[LOAD1]](s8)
; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[ANYEXT]](s16)		; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[ANYEXT]](s16)
[16 lines hidden]

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call.ll

[3,196 lines hidden]	define amdgpu_kernel void @test_call_external_void_func_v8i32() #0 {
; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16		; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16
; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15		; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15
; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14		; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5
; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9		; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `<8 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `<8 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<8 x s32>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<8 x s32>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v8i32		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v8i32
; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]		; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]
; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]		; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]
; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)		; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)
[119 lines hidden]	define amdgpu_kernel void @test_call_external_void_func_v16i32() #0 {
; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16		; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16
; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15		; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15
; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14		; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5
; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9		; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `<16 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `<16 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<16 x s32>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<16 x s32>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<16 x s32>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<16 x s32>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v16i32		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v16i32
; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]		; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]
; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]		; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]
; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)		; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)
[57 lines hidden]	define amdgpu_kernel void @test_call_external_void_func_v32i32() #0 {
; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16		; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16
; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15		; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15
; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14		; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5
; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9		; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `<32 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `<32 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<32 x s32>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<32 x s32>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v32i32		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v32i32
; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]		; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]
; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]		; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]
; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)		; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)
(78 lines collapsed)	define amdgpu_kernel void @test_call_external_void_func_v32i32_i32(i32) #0 {
; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14		; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5
; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9		; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[DEF1:%[0-9]+]]:_(p1) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF1:%[0-9]+]]:_(p1) = G_IMPLICIT_DEF
; CHECK-NEXT: [[INT:%[0-9]+]]:_(p4) = G_INTRINSIC intrinsic(@llvm.amdgcn.kernarg.segment.ptr)		; CHECK-NEXT: [[INT:%[0-9]+]]:_(p4) = G_INTRINSIC intrinsic(@llvm.amdgcn.kernarg.segment.ptr)
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `<32 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `<32 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<32 x s32>) from %ir.ptr0, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<32 x s32>) from %ir.ptr0, addrspace 1)
; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[DEF1]](p1) :: ("amdgpu-noclobber" load (s32) from `i32 addrspace(1)* undef`, addrspace 1)		; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[DEF1]](p1) :: ("amdgpu-noclobber" load (s32) from `i32 addrspace(1)* undef`, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v32i32_i32		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v32i32_i32
; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]		; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]
; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]		; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]
; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)		; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
(83 lines collapsed)	define amdgpu_kernel void @test_call_external_void_func_v32i32_i8_i8_i16() #0 {
; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14		; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5
; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9		; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[DEF1:%[0-9]+]]:_(p1) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF1:%[0-9]+]]:_(p1) = G_IMPLICIT_DEF
; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p1) = COPY [[DEF1]](p1)		; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p1) = COPY [[DEF1]](p1)
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `<32 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `<32 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<32 x s32>) from %ir.ptr0, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<32 x s32>) from %ir.ptr0, addrspace 1)
; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s8) = G_LOAD [[DEF1]](p1) :: ("amdgpu-noclobber" load (s8) from `i8 addrspace(1)* undef`, addrspace 1)		; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s8) = G_LOAD [[DEF1]](p1) :: ("amdgpu-noclobber" load (s8) from `i8 addrspace(1)* undef`, addrspace 1)
; CHECK-NEXT: [[LOAD3:%[0-9]+]]:_(s16) = G_LOAD [[COPY10]](p1) :: ("amdgpu-noclobber" load (s16) from `i16 addrspace(1)* undef`, addrspace 1)		; CHECK-NEXT: [[LOAD3:%[0-9]+]]:_(s16) = G_LOAD [[COPY10]](p1) :: ("amdgpu-noclobber" load (s16) from `i16 addrspace(1)* undef`, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v32i32_i8_i8_i16		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v32i32_i8_i8_i16
; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY8]]		; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY8]]
; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY7]]		; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY7]]
; CHECK-NEXT: [[COPY13:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)		; CHECK-NEXT: [[COPY13:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)
(94 lines collapsed)	define amdgpu_kernel void @test_call_external_void_func_v32i32_p3_p5() #0 {
; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14		; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5
; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9		; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[DEF1:%[0-9]+]]:_(p1) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF1:%[0-9]+]]:_(p1) = G_IMPLICIT_DEF
; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p1) = COPY [[DEF1]](p1)		; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p1) = COPY [[DEF1]](p1)
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `<32 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `<32 x i32> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<32 x s32>) from %ir.ptr0, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<32 x s32>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<32 x s32>) from %ir.ptr0, addrspace 1)
; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(p3) = G_LOAD [[DEF1]](p1) :: ("amdgpu-noclobber" load (p3) from `i8 addrspace(3)* addrspace(1)* undef`, addrspace 1)		; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(p3) = G_LOAD [[DEF1]](p1) :: ("amdgpu-noclobber" load (p3) from `i8 addrspace(3)* addrspace(1)* undef`, addrspace 1)
; CHECK-NEXT: [[LOAD3:%[0-9]+]]:_(p5) = G_LOAD [[COPY10]](p1) :: ("amdgpu-noclobber" load (p5) from `i8 addrspace(5)* addrspace(1)* undef`, addrspace 1)		; CHECK-NEXT: [[LOAD3:%[0-9]+]]:_(p5) = G_LOAD [[COPY10]](p1) :: ("amdgpu-noclobber" load (p5) from `i8 addrspace(5)* addrspace(1)* undef`, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v32i32_p3_p5		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v32i32_p3_p5
; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY8]]		; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY8]]
; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY7]]		; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY7]]
; CHECK-NEXT: [[COPY13:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)		; CHECK-NEXT: [[COPY13:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)
(86 lines collapsed)	define amdgpu_kernel void @test_call_external_void_func_struct_i8_i32() #0 {
; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16		; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16
; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15		; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15
; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14		; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5
; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9		; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `{ i8, i32 } addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `{ i8, i32 } addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s8) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (s8) from %ir.ptr0, align 4, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s8) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (s8) from %ir.ptr0, align 4, addrspace 1)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)
; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: ("amdgpu-noclobber" load (s32) from %ir.ptr0 + 4, addrspace 1)		; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: ("amdgpu-noclobber" load (s32) from %ir.ptr0 + 4, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_struct_i8_i32		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_struct_i8_i32
; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]		; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]
; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]		; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]
(35 lines collapsed)	define amdgpu_kernel void @test_call_external_void_func_struct_i8_i32() #0 {
call void @external_void_func_struct_i8_i32({ i8, i32 } %val)		call void @external_void_func_struct_i8_i32({ i8, i32 } %val)
ret void		ret void
}		}

define amdgpu_gfx void @test_gfx_call_external_void_func_struct_i8_i32() #0 {		define amdgpu_gfx void @test_gfx_call_external_void_func_struct_i8_i32() #0 {
; CHECK-LABEL: name: test_gfx_call_external_void_func_struct_i8_i32		; CHECK-LABEL: name: test_gfx_call_external_void_func_struct_i8_i32
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `{ i8, i32 } addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `{ i8, i32 } addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s8) = G_LOAD [[LOAD]](p1) :: (load (s8) from %ir.ptr0, align 4, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s8) = G_LOAD [[LOAD]](p1) :: (load (s8) from %ir.ptr0, align 4, addrspace 1)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)
; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load (s32) from %ir.ptr0 + 4, addrspace 1)		; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load (s32) from %ir.ptr0 + 4, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_gfx_void_func_struct_i8_i32		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_gfx_void_func_struct_i8_i32
; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[LOAD1]](s8)		; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[LOAD1]](s8)
; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[ANYEXT]](s16)		; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[ANYEXT]](s16)
(9 lines collapsed)	define amdgpu_gfx void @test_gfx_call_external_void_func_struct_i8_i32() #0 {
call amdgpu_gfx void @external_gfx_void_func_struct_i8_i32({ i8, i32 } %val)		call amdgpu_gfx void @external_gfx_void_func_struct_i8_i32({ i8, i32 } %val)
ret void		ret void
}		}

define amdgpu_gfx void @test_gfx_call_external_void_func_struct_i8_i32_inreg() #0 {		define amdgpu_gfx void @test_gfx_call_external_void_func_struct_i8_i32_inreg() #0 {
; CHECK-LABEL: name: test_gfx_call_external_void_func_struct_i8_i32_inreg		; CHECK-LABEL: name: test_gfx_call_external_void_func_struct_i8_i32_inreg
; CHECK: bb.1 (%ir-block.0):		; CHECK: bb.1 (%ir-block.0):
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `{ i8, i32 } addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `{ i8, i32 } addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s8) = G_LOAD [[LOAD]](p1) :: (load (s8) from %ir.ptr0, align 4, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s8) = G_LOAD [[LOAD]](p1) :: (load (s8) from %ir.ptr0, align 4, addrspace 1)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[LOAD]], [[C]](s64)
; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load (s32) from %ir.ptr0 + 4, addrspace 1)		; CHECK-NEXT: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load (s32) from %ir.ptr0 + 4, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_gfx_void_func_struct_i8_i32_inreg		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_gfx_void_func_struct_i8_i32_inreg
; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[LOAD1]](s8)		; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s16) = G_ANYEXT [[LOAD1]](s8)
; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[ANYEXT]](s16)		; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[ANYEXT]](s16)
(195 lines collapsed)	define amdgpu_kernel void @test_call_external_void_func_v2i8() #0 {
; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16		; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16
; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15		; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15
; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14		; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5
; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9		; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `<2 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `<2 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<2 x s8>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<2 x s8>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<2 x s8>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<2 x s8>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v2i8		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v2i8
; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]		; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]
; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]		; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]
; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)		; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)
(47 lines collapsed)	define amdgpu_kernel void @test_call_external_void_func_v3i8() #0 {
; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16		; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16
; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15		; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15
; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14		; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5
; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9		; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `<3 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `<3 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<3 x s8>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<3 x s8>) from %ir.ptr, align 4, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<3 x s8>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<3 x s8>) from %ir.ptr, align 4, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v3i8		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v3i8
; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]		; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]
; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]		; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]
; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)		; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)
(50 lines collapsed)	define amdgpu_kernel void @test_call_external_void_func_v4i8() #0 {
; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16		; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16
; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15		; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15
; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14		; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5
; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9		; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `<4 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `<4 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<4 x s8>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<4 x s8>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v4i8		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v4i8
; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]		; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]
; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]		; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]
; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)		; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)
(53 lines collapsed)	define amdgpu_kernel void @test_call_external_void_func_v8i8() #0 {
; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16		; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16
; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15		; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15
; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14		; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5
; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9		; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `<8 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `<8 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<8 x s8>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<8 x s8>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v8i8		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v8i8
; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]		; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]
; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]		; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]
; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)		; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)
(65 lines collapsed)	define amdgpu_kernel void @test_call_external_void_func_v16i8() #0 {
; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16		; CHECK-NEXT: [[COPY3:%[0-9]+]]:sgpr_32 = COPY $sgpr16
; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15		; CHECK-NEXT: [[COPY4:%[0-9]+]]:sgpr_32 = COPY $sgpr15
; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14		; CHECK-NEXT: [[COPY5:%[0-9]+]]:sgpr_32 = COPY $sgpr14
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_64 = COPY $sgpr10_sgpr11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_64 = COPY $sgpr6_sgpr7
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_64 = COPY $sgpr4_sgpr5
; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9		; CHECK-NEXT: [[COPY9:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF		; CHECK-NEXT: [[DEF:%[0-9]+]]:_(p4) = G_IMPLICIT_DEF
; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (load (p1) from `<16 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(p1) = G_LOAD [[DEF]](p4) :: (dereferenceable invariant load (p1) from `<16 x i8> addrspace(1)* addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<16 x s8>) from %ir.ptr, addrspace 1)		; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[LOAD]](p1) :: ("amdgpu-noclobber" load (<16 x s8>) from %ir.ptr, addrspace 1)
; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc		; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def $scc
; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v16i8		; CHECK-NEXT: [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @external_void_func_v16i8
; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]		; CHECK-NEXT: [[COPY10:%[0-9]+]]:_(p4) = COPY [[COPY8]]
; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]		; CHECK-NEXT: [[COPY11:%[0-9]+]]:_(p4) = COPY [[COPY7]]
; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)		; CHECK-NEXT: [[COPY12:%[0-9]+]]:_(p4) = COPY [[COPY9]](p4)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY12]], [[C]](s64)
(778 lines collapsed)

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-invariant.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
				; RUN: llc -simplify-mir -global-isel -march=amdgcn -stop-after=irtranslator -verify-machineinstrs %s -o - | FileCheck %s

				; Check the flags set on the memory operands for loads determined to
				; be constants by alias analysis.

				@const_gv0 = external addrspace(1) constant i32, align 4
				@const_gv1 = external addrspace(1) constant i32, align 4
				@const_struct_gv = external addrspace(1) constant { i32, i64 }, align 8

				define i32 @load_const_i32_gv() {
				; CHECK-LABEL: name: load_const_i32_gv
				; CHECK: bb.1 (%ir-block.0):
				; CHECK-NEXT: [[GV:%[0-9]+]]:_(p1) = G_GLOBAL_VALUE @const_gv0
				; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[GV]](p1) :: (dereferenceable invariant load (s32) from @const_gv0, addrspace 1)
				; CHECK-NEXT: $vgpr0 = COPY [[LOAD]](s32)
				; CHECK-NEXT: SI_RETURN implicit $vgpr0
				%load = load i32, ptr addrspace(1) @const_gv0, align 4
				ret i32 %load
				}

				define i32 @load_select_const_i32_gv(i1 %cond) {
				; CHECK-LABEL: name: load_select_const_i32_gv
				; CHECK: bb.1 (%ir-block.0):
				; CHECK-NEXT: liveins: $vgpr0
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
				; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s1) = G_TRUNC [[COPY]](s32)
				; CHECK-NEXT: [[GV:%[0-9]+]]:_(p1) = G_GLOBAL_VALUE @const_gv0
				; CHECK-NEXT: [[GV1:%[0-9]+]]:_(p1) = G_GLOBAL_VALUE @const_gv1
				; CHECK-NEXT: [[SELECT:%[0-9]+]]:_(p1) = G_SELECT [[TRUNC]](s1), [[GV]], [[GV1]]
				; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[SELECT]](p1) :: (dereferenceable invariant load (s32) from %ir.select, addrspace 1)
				; CHECK-NEXT: $vgpr0 = COPY [[LOAD]](s32)
				; CHECK-NEXT: SI_RETURN implicit $vgpr0
				%select = select i1 %cond, ptr addrspace(1) @const_gv0, ptr addrspace(1) @const_gv1
				%load = load i32, ptr addrspace(1) %select, align 4
				ret i32 %load
				}

				define { i32, i64 } @load_const_struct_gv() {
				; CHECK-LABEL: name: load_const_struct_gv
				; CHECK: bb.1 (%ir-block.0):
				; CHECK-NEXT: [[GV:%[0-9]+]]:_(p1) = G_GLOBAL_VALUE @const_struct_gv
				; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[GV]](p1) :: (dereferenceable invariant load (s32) from @const_struct_gv, align 8, addrspace 1)
				; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
				; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[GV]], [[C]](s64)
				; CHECK-NEXT: [[LOAD1:%[0-9]+]]:_(s64) = G_LOAD [[PTR_ADD]](p1) :: (dereferenceable invariant load (s64) from @const_struct_gv + 8, addrspace 1)
				; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[LOAD1]](s64)
				; CHECK-NEXT: $vgpr0 = COPY [[LOAD]](s32)
				; CHECK-NEXT: $vgpr1 = COPY [[UV]](s32)
				; CHECK-NEXT: $vgpr2 = COPY [[UV1]](s32)
				; CHECK-NEXT: SI_RETURN implicit $vgpr0, implicit $vgpr1, implicit $vgpr2
				%load = load { i32, i64 }, ptr addrspace(1) @const_struct_gv, align 8
				ret { i32, i64 } %load
				}

				define void @test_memcpy_p1_constaddr_i64(i8 addrspace(1)* %dst, i8 addrspace(4)* %src) {
				; CHECK-LABEL: name: test_memcpy_p1_constaddr_i64
				; CHECK: bb.1 (%ir-block.0):
				; CHECK-NEXT: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
				; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1
				; CHECK-NEXT: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[COPY]](s32), [[COPY1]](s32)
				; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2
				; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $vgpr3
				; CHECK-NEXT: [[MV1:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[COPY2]](s32), [[COPY3]](s32)
				; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 32
				; CHECK-NEXT: G_MEMCPY [[MV]](p1), [[MV1]](p4), [[C]](s64), 0 :: (store (s8) into %ir.dst, addrspace 1), (dereferenceable invariant load (s8) from %ir.src, addrspace 4)
				; CHECK-NEXT: SI_RETURN
				call void @llvm.memcpy.p1.p4.i64(i8 addrspace(1)* %dst, i8 addrspace(4)* %src, i64 32, i1 false)
				ret void
				}

				define void @test_memcpy_inline_p1_constaddr_i64(i8 addrspace(1)* %dst, i8 addrspace(4)* %src) {
				; CHECK-LABEL: name: test_memcpy_inline_p1_constaddr_i64
				; CHECK: bb.1 (%ir-block.0):
				; CHECK-NEXT: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
				; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1
				; CHECK-NEXT: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[COPY]](s32), [[COPY1]](s32)
				; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2
				; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $vgpr3
				; CHECK-NEXT: [[MV1:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[COPY2]](s32), [[COPY3]](s32)
				; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 32
				; CHECK-NEXT: G_MEMCPY_INLINE [[MV]](p1), [[MV1]](p4), [[C]](s64) :: (store (s8) into %ir.dst, addrspace 1), (dereferenceable invariant load (s8) from %ir.src, addrspace 4)
				; CHECK-NEXT: SI_RETURN
				call void @llvm.memcpy.inline.p1.p4.i64(i8 addrspace(1)* %dst, i8 addrspace(4)* %src, i64 32, i1 false)
				ret void
				}

				define void @test_memmove_p1_constaddr_i64(i8 addrspace(1)* %dst, i8 addrspace(4)* %src) {
				; CHECK-LABEL: name: test_memmove_p1_constaddr_i64
				; CHECK: bb.1 (%ir-block.0):
				; CHECK-NEXT: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
				; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1
				; CHECK-NEXT: [[MV:%[0-9]+]]:_(p1) = G_MERGE_VALUES [[COPY]](s32), [[COPY1]](s32)
				; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2
				; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY $vgpr3
				; CHECK-NEXT: [[MV1:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[COPY2]](s32), [[COPY3]](s32)
				; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 32
				; CHECK-NEXT: G_MEMMOVE [[MV]](p1), [[MV1]](p4), [[C]](s64), 0 :: (store (s8) into %ir.dst, addrspace 1), (dereferenceable invariant load (s8) from %ir.src, addrspace 4)
				; CHECK-NEXT: SI_RETURN
				call void @llvm.memmove.p1.p4.i64(i8 addrspace(1)* %dst, i8 addrspace(4)* %src, i64 32, i1 false)
				ret void
				}

				declare void @llvm.memcpy.p1.p4.i64(ptr addrspace(1) noalias nocapture writeonly, ptr addrspace(4) noalias nocapture readonly, i64, i1 immarg) #0
				declare void @llvm.memcpy.inline.p1.p4.i64(ptr addrspace(1) noalias nocapture writeonly, ptr addrspace(4) noalias nocapture readonly, i64, i1 immarg) #0
				declare void @llvm.memmove.p1.p4.i64(ptr addrspace(1) nocapture writeonly, ptr addrspace(4) nocapture readonly, i64, i1 immarg) #0

				attributes #0 = { argmemonly nofree nounwind willreturn }

llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll

	; Test that DAG->DAG ISel is able to pick up the S_LOAD_DWORDX4_SGPR instruction that fetches the offset			; Test that DAG->DAG ISel is able to pick up the S_LOAD_DWORDX4_SGPR instruction that fetches the offset
	; from a register.			; from a register.

	; RUN: llc -march=amdgcn -verify-machineinstrs -stop-after=amdgpu-isel -o - %s | FileCheck -check-prefix=GCN %s			; RUN: llc -march=amdgcn -verify-machineinstrs -stop-after=amdgpu-isel -o - %s | FileCheck -check-prefix=GCN %s

	; GCN: %[[OFFSET:[0-9]+]]:sreg_32 = S_MOV_B32 target-flags(amdgpu-abs32-lo) @DescriptorBuffer			; GCN: %[[OFFSET:[0-9]+]]:sreg_32 = S_MOV_B32 target-flags(amdgpu-abs32-lo) @DescriptorBuffer
	; GCN: %{{[0-9]+}}:sgpr_128 = S_LOAD_DWORDX4_SGPR killed %{{[0-9]+}}, killed %[[OFFSET]], 0 :: (invariant load (s128) from %ir.13, addrspace 4)			; GCN: %{{[0-9]+}}:sgpr_128 = S_LOAD_DWORDX4_SGPR killed %{{[0-9]+}}, killed %[[OFFSET]], 0 :: (dereferenceable invariant load (s128) from %ir.13, addrspace 4)

	define amdgpu_cs void @test_load_zext(i32 inreg %0, i32 inreg %1, i32 inreg %resNode0, i32 inreg %resNode1, <3 x i32> inreg %2, i32 inreg %3, <3 x i32> %4) local_unnamed_addr #2 {			define amdgpu_cs void @test_load_zext(i32 inreg %0, i32 inreg %1, i32 inreg %resNode0, i32 inreg %resNode1, <3 x i32> inreg %2, i32 inreg %3, <3 x i32> %4) local_unnamed_addr #2 {
	.entry:			.entry:
	%5 = call i64 @llvm.amdgcn.s.getpc() #3			%5 = call i64 @llvm.amdgcn.s.getpc() #3
	%6 = bitcast i64 %5 to <2 x i32>			%6 = bitcast i64 %5 to <2 x i32>
	%7 = insertelement <2 x i32> %6, i32 %resNode0, i32 0			%7 = insertelement <2 x i32> %6, i32 %resNode0, i32 0
	%8 = bitcast <2 x i32> %7 to i64			%8 = bitcast <2 x i32> %7 to i64
	%9 = inttoptr i64 %8 to [4294967295 x i8] addrspace(4)*			%9 = inttoptr i64 %8 to [4294967295 x i8] addrspace(4)*

llvm/test/CodeGen/AMDGPU/llc-pipeline.ll

	; GCN-O0-NEXT: MachinePostDominator Tree Construction			; GCN-O0-NEXT: MachinePostDominator Tree Construction
	; GCN-O0-NEXT: SI Lower i1 Copies			; GCN-O0-NEXT: SI Lower i1 Copies
	; GCN-O0-NEXT: Finalize ISel and expand pseudo-instructions			; GCN-O0-NEXT: Finalize ISel and expand pseudo-instructions
	; GCN-O0-NEXT: Local Stack Slot Allocation			; GCN-O0-NEXT: Local Stack Slot Allocation
	; GCN-O0-NEXT: Register Usage Information Propagation			; GCN-O0-NEXT: Register Usage Information Propagation
	; GCN-O0-NEXT: Eliminate PHI nodes for register allocation			; GCN-O0-NEXT: Eliminate PHI nodes for register allocation
	; GCN-O0-NEXT: SI Lower control flow pseudo instructions			; GCN-O0-NEXT: SI Lower control flow pseudo instructions
	; GCN-O0-NEXT: Two-Address instruction pass			; GCN-O0-NEXT: Two-Address instruction pass
	; GCN-O0-NEXT: Basic Alias Analysis (stateless AA impl)
	; GCN-O0-NEXT: Function Alias Analysis Results
	; GCN-O0-NEXT: MachineDominator Tree Construction			; GCN-O0-NEXT: MachineDominator Tree Construction
	; GCN-O0-NEXT: Slot index numbering			; GCN-O0-NEXT: Slot index numbering
	; GCN-O0-NEXT: Live Interval Analysis			; GCN-O0-NEXT: Live Interval Analysis
	; GCN-O0-NEXT: MachinePostDominator Tree Construction			; GCN-O0-NEXT: MachinePostDominator Tree Construction
	; GCN-O0-NEXT: SI Whole Quad Mode			; GCN-O0-NEXT: SI Whole Quad Mode
	; GCN-O0-NEXT: Virtual Register Map			; GCN-O0-NEXT: Virtual Register Map
	; GCN-O0-NEXT: Live Register Matrix			; GCN-O0-NEXT: Live Register Matrix
	; GCN-O0-NEXT: SI Pre-allocate WWM Registers			; GCN-O0-NEXT: SI Pre-allocate WWM Registers

llvm/test/CodeGen/AMDGPU/splitkit-getsubrangeformask.ll

define amdgpu_gs void @_amdgpu_gs_main(i32 inreg %primShaderTableAddrLow, <31 x i32> inreg %userData) {
; CHECK-NEXT: undef %50.sub0:sgpr_64 = COPY $sgpr19		; CHECK-NEXT: undef %50.sub0:sgpr_64 = COPY $sgpr19
; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_32 = COPY $sgpr20		; CHECK-NEXT: [[COPY6:%[0-9]+]]:sgpr_32 = COPY $sgpr20
; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_32 = COPY $sgpr21		; CHECK-NEXT: [[COPY7:%[0-9]+]]:sgpr_32 = COPY $sgpr21
; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_32 = COPY $sgpr22		; CHECK-NEXT: [[COPY8:%[0-9]+]]:sgpr_32 = COPY $sgpr22
; CHECK-NEXT: [[COPY9:%[0-9]+]]:sgpr_32 = COPY $sgpr23		; CHECK-NEXT: [[COPY9:%[0-9]+]]:sgpr_32 = COPY $sgpr23
; CHECK-NEXT: [[COPY10:%[0-9]+]]:sgpr_32 = COPY $sgpr9		; CHECK-NEXT: [[COPY10:%[0-9]+]]:sgpr_32 = COPY $sgpr9
; CHECK-NEXT: [[COPY11:%[0-9]+]]:sgpr_32 = COPY $sgpr10		; CHECK-NEXT: [[COPY11:%[0-9]+]]:sgpr_32 = COPY $sgpr10
; CHECK-NEXT: [[COPY12:%[0-9]+]]:sgpr_32 = COPY $sgpr8		; CHECK-NEXT: [[COPY12:%[0-9]+]]:sgpr_32 = COPY $sgpr8
; CHECK-NEXT: undef %71.sub0_sub1:sgpr_128 = S_LOAD_DWORDX2_IMM %56, 232, 0 :: (load (s64) from %ir.40, addrspace 4)		; CHECK-NEXT: undef %71.sub0_sub1:sgpr_128 = S_LOAD_DWORDX2_IMM %56, 232, 0 :: (dereferenceable invariant load (s64) from %ir.40, addrspace 4)
; CHECK-NEXT: [[S_LSHL_B32_:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY4]], 4, implicit-def dead $scc		; CHECK-NEXT: [[S_LSHL_B32_:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY4]], 4, implicit-def dead $scc
; CHECK-NEXT: [[S_LSHL_B32_1:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY3]], 4, implicit-def dead $scc		; CHECK-NEXT: [[S_LSHL_B32_1:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY3]], 4, implicit-def dead $scc
; CHECK-NEXT: [[S_LSHL_B32_2:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY2]], 4, implicit-def dead $scc		; CHECK-NEXT: [[S_LSHL_B32_2:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY2]], 4, implicit-def dead $scc
; CHECK-NEXT: [[S_ASHR_I32_:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_]], 31, implicit-def dead $scc		; CHECK-NEXT: [[S_ASHR_I32_:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_]], 31, implicit-def dead $scc
; CHECK-NEXT: [[S_ASHR_I32_1:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_1]], 31, implicit-def dead $scc		; CHECK-NEXT: [[S_ASHR_I32_1:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_1]], 31, implicit-def dead $scc
; CHECK-NEXT: [[S_ASHR_I32_2:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_2]], 31, implicit-def dead $scc		; CHECK-NEXT: [[S_ASHR_I32_2:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_2]], 31, implicit-def dead $scc
; CHECK-NEXT: %71.sub1:sgpr_128 = S_AND_B32 %71.sub1, 65535, implicit-def dead $scc		; CHECK-NEXT: %71.sub1:sgpr_128 = S_AND_B32 %71.sub1, 65535, implicit-def dead $scc
; CHECK-NEXT: undef %130.sub0:sreg_64 = S_ADD_U32 [[COPY5]], [[S_LSHL_B32_2]], implicit-def $scc		; CHECK-NEXT: undef %130.sub0:sreg_64 = S_ADD_U32 [[COPY5]], [[S_LSHL_B32_2]], implicit-def $scc
; CHECK-NEXT: %130.sub1:sreg_64 = S_ADDC_U32 undef %54:sreg_32, [[S_ASHR_I32_2]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %130.sub1:sreg_64 = S_ADDC_U32 undef %54:sreg_32, [[S_ASHR_I32_2]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %130, 16, 0 :: (load (s128) from %ir.84, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %130, 16, 0 :: (dereferenceable invariant load (s128) from %ir.84, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM1:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM undef %74:sreg_64, 0, 0 :: (load (s128) from `<4 x i32> addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM1:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM undef %74:sreg_64, 0, 0 :: (dereferenceable invariant load (s128) from `<4 x i32> addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM undef %132:sgpr_128, 0, 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM undef %132:sgpr_128, 0, 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: KILL undef %74:sreg_64		; CHECK-NEXT: KILL undef %74:sreg_64
; CHECK-NEXT: KILL undef %132:sgpr_128		; CHECK-NEXT: KILL undef %132:sgpr_128
; CHECK-NEXT: KILL %130.sub0, %130.sub1		; CHECK-NEXT: KILL %130.sub0, %130.sub1
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM1:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM [[S_LOAD_DWORDX4_IMM]], 0, 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM1:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM [[S_LOAD_DWORDX4_IMM]], 0, 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec		; CHECK-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; CHECK-NEXT: undef %302.sub1:sgpr_128 = S_MOV_B32 0		; CHECK-NEXT: undef %302.sub1:sgpr_128 = S_MOV_B32 0
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], undef %89:sgpr_128, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], undef %89:sgpr_128, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN1:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM1]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN1:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM1]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: KILL undef %89:sgpr_128		; CHECK-NEXT: KILL undef %89:sgpr_128
; CHECK-NEXT: [[S_SUB_I32_:%[0-9]+]]:sreg_32 = S_SUB_I32 [[S_BUFFER_LOAD_DWORD_IMM]], 29, implicit-def dead $scc		; CHECK-NEXT: [[S_SUB_I32_:%[0-9]+]]:sreg_32 = S_SUB_I32 [[S_BUFFER_LOAD_DWORD_IMM]], 29, implicit-def dead $scc
; CHECK-NEXT: [[S_SUB_I32_1:%[0-9]+]]:sreg_32 = S_SUB_I32 [[S_BUFFER_LOAD_DWORD_IMM]], 30, implicit-def dead $scc		; CHECK-NEXT: [[S_SUB_I32_1:%[0-9]+]]:sreg_32 = S_SUB_I32 [[S_BUFFER_LOAD_DWORD_IMM]], 30, implicit-def dead $scc
; CHECK-NEXT: [[S_SUB_I32_2:%[0-9]+]]:sreg_32 = S_SUB_I32 [[S_BUFFER_LOAD_DWORD_IMM1]], 31, implicit-def dead $scc		; CHECK-NEXT: [[S_SUB_I32_2:%[0-9]+]]:sreg_32 = S_SUB_I32 [[S_BUFFER_LOAD_DWORD_IMM1]], 31, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY5]], 64, implicit-def $scc		; CHECK-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY5]], 64, implicit-def $scc
; CHECK-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 undef %54:sreg_32, 0, implicit-def dead $scc, implicit $scc		; CHECK-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 undef %54:sreg_32, 0, implicit-def dead $scc, implicit $scc
; CHECK-NEXT: undef %149.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_]], [[S_LSHL_B32_]], implicit-def $scc		; CHECK-NEXT: undef %149.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_]], [[S_LSHL_B32_]], implicit-def $scc
; CHECK-NEXT: %149.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_]], [[S_ASHR_I32_]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %149.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_]], [[S_ASHR_I32_]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: undef %156.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_]], [[S_LSHL_B32_1]], implicit-def $scc		; CHECK-NEXT: undef %156.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_]], [[S_LSHL_B32_1]], implicit-def $scc
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM2:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %149, 0, 0 :: (load (s128) from %ir.91, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM2:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %149, 0, 0 :: (dereferenceable invariant load (s128) from %ir.91, addrspace 4)
; CHECK-NEXT: %156.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_]], [[S_ASHR_I32_1]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %156.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_]], [[S_ASHR_I32_1]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: undef %163.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_]], [[S_LSHL_B32_2]], implicit-def $scc		; CHECK-NEXT: undef %163.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_]], [[S_LSHL_B32_2]], implicit-def $scc
; CHECK-NEXT: %163.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_]], [[S_ASHR_I32_2]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %163.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_]], [[S_ASHR_I32_2]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[S_ASHR_I32_3:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 undef %171:sreg_32, 31, implicit-def dead $scc		; CHECK-NEXT: [[S_ASHR_I32_3:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 undef %171:sreg_32, 31, implicit-def dead $scc
; CHECK-NEXT: undef %176.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_]], undef %171:sreg_32, implicit-def $scc		; CHECK-NEXT: undef %176.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_]], undef %171:sreg_32, implicit-def $scc
; CHECK-NEXT: %176.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_]], [[S_ASHR_I32_3]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %176.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_]], [[S_ASHR_I32_3]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: undef %183.sub0:sreg_64 = S_ADD_U32 %50.sub0, [[S_LSHL_B32_]], implicit-def $scc		; CHECK-NEXT: undef %183.sub0:sreg_64 = S_ADD_U32 %50.sub0, [[S_LSHL_B32_]], implicit-def $scc
; CHECK-NEXT: %183.sub1:sreg_64 = S_ADDC_U32 undef %51:sreg_32, [[S_ASHR_I32_]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %183.sub1:sreg_64 = S_ADDC_U32 undef %51:sreg_32, [[S_ASHR_I32_]], implicit-def dead $scc, implicit $scc
define amdgpu_gs void @_amdgpu_gs_main(i32 inreg %primShaderTableAddrLow, <31 x i32> inreg %userData) {
; CHECK-NEXT: [[S_ADD_I32_1:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_LSHL_B32_2]], 16, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_1:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_LSHL_B32_2]], 16, implicit-def dead $scc
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR %302, [[S_ADD_I32_]], 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR %302, [[S_ADD_I32_]], 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR1:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR %302, undef %314:sreg_32, 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR1:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR %302, undef %314:sreg_32, 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR2:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR %302, [[S_ADD_I32_1]], 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR2:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR %302, [[S_ADD_I32_1]], 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM2:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM %302, 16, 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM2:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM %302, 16, 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[BUFFER_LOAD_DWORD_OFFSET:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_OFFSET undef %118:sgpr_128, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_DWORD_OFFSET:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_OFFSET undef %118:sgpr_128, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR3:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR undef %369:sgpr_128, undef %370:sreg_32, 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR3:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR undef %369:sgpr_128, undef %370:sreg_32, 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM3:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM undef %380:sgpr_128, 16, 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM3:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM undef %380:sgpr_128, 16, 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM3:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %156, 0, 0 :: (load (s128) from %ir.97, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM3:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %156, 0, 0 :: (dereferenceable invariant load (s128) from %ir.97, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM4:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %163, 0, 0 :: (load (s128) from %ir.103, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM4:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %163, 0, 0 :: (dereferenceable invariant load (s128) from %ir.103, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM5:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %176, 0, 0 :: (load (s128) from %ir.111, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM5:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %176, 0, 0 :: (dereferenceable invariant load (s128) from %ir.111, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM6:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %183, 0, 0 :: (load (s128) from %ir.117, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM6:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %183, 0, 0 :: (dereferenceable invariant load (s128) from %ir.117, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM7:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %190, 0, 0 :: (load (s128) from %ir.123, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM7:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %190, 0, 0 :: (dereferenceable invariant load (s128) from %ir.123, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN2:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM2]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN2:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM2]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR4:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR undef %364:sgpr_128, [[S_ADD_I32_]], 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR4:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR undef %364:sgpr_128, [[S_ADD_I32_]], 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR5:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR undef %375:sgpr_128, [[S_ADD_I32_1]], 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR5:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR undef %375:sgpr_128, [[S_ADD_I32_1]], 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[S_ADD_I32_2:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR]], -98, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_2:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR]], -98, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_3:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR1]], -114, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_3:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR1]], -114, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_4:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR2]], -130, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_4:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR2]], -130, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_5:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM2]], -178, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_5:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM2]], -178, implicit-def dead $scc
; CHECK-NEXT: undef %327.sub0:sreg_64 = S_ADD_U32 [[COPY8]], [[S_LSHL_B32_]], implicit-def $scc		; CHECK-NEXT: undef %327.sub0:sreg_64 = S_ADD_U32 [[COPY8]], [[S_LSHL_B32_]], implicit-def $scc
; CHECK-NEXT: %327.sub1:sreg_64 = S_ADDC_U32 undef %42:sreg_32, [[S_ASHR_I32_]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %327.sub1:sreg_64 = S_ADDC_U32 undef %42:sreg_32, [[S_ASHR_I32_]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: undef %335.sub0:sreg_64 = S_ADD_U32 [[COPY9]], [[S_LSHL_B32_]], implicit-def $scc		; CHECK-NEXT: undef %335.sub0:sreg_64 = S_ADD_U32 [[COPY9]], [[S_LSHL_B32_]], implicit-def $scc
; CHECK-NEXT: %335.sub1:sreg_64 = S_ADDC_U32 undef %39:sreg_32, [[S_ASHR_I32_]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %335.sub1:sreg_64 = S_ADDC_U32 undef %39:sreg_32, [[S_ASHR_I32_]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: undef %343.sub0:sreg_64 = S_ADD_U32 [[COPY9]], [[S_LSHL_B32_1]], implicit-def $scc		; CHECK-NEXT: undef %343.sub0:sreg_64 = S_ADD_U32 [[COPY9]], [[S_LSHL_B32_1]], implicit-def $scc
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM8:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %200, 0, 0 :: (load (s128) from %ir.131, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM8:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %200, 0, 0 :: (dereferenceable invariant load (s128) from %ir.131, addrspace 4)
; CHECK-NEXT: %343.sub1:sreg_64 = S_ADDC_U32 undef %39:sreg_32, [[S_ASHR_I32_1]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %343.sub1:sreg_64 = S_ADDC_U32 undef %39:sreg_32, [[S_ASHR_I32_1]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: undef %351.sub0:sreg_64 = S_ADD_U32 [[COPY9]], [[S_LSHL_B32_2]], implicit-def $scc		; CHECK-NEXT: undef %351.sub0:sreg_64 = S_ADD_U32 [[COPY9]], [[S_LSHL_B32_2]], implicit-def $scc
; CHECK-NEXT: %351.sub1:sreg_64 = S_ADDC_U32 undef %39:sreg_32, [[S_ASHR_I32_2]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %351.sub1:sreg_64 = S_ADDC_U32 undef %39:sreg_32, [[S_ASHR_I32_2]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[S_LSHL_B32_3:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY10]], 4, implicit-def dead $scc		; CHECK-NEXT: [[S_LSHL_B32_3:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY10]], 4, implicit-def dead $scc
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN3:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM3]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN3:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM3]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_ADD_I32_6:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_LSHL_B32_3]], 16, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_6:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_LSHL_B32_3]], 16, implicit-def dead $scc
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR6:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR undef %396:sgpr_128, [[S_ADD_I32_6]], 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_SGPR6:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_SGPR undef %396:sgpr_128, [[S_ADD_I32_6]], 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN4:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM4]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN4:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM4]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM9:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %50, 224, 0 :: (load (s128) from %ir.155, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM9:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %50, 224, 0 :: (dereferenceable invariant load (s128) from %ir.155, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM10:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %210, 0, 0 :: (load (s128) from %ir.138, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM10:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %210, 0, 0 :: (dereferenceable invariant load (s128) from %ir.138, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN5:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM5]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN5:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM5]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM11:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %217, 0, 0 :: (load (s128) from %ir.144, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM11:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %217, 0, 0 :: (dereferenceable invariant load (s128) from %ir.144, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM12:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %224, 0, 0 :: (load (s128) from %ir.150, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM12:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %224, 0, 0 :: (dereferenceable invariant load (s128) from %ir.150, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN6:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM6]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN6:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM6]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN7:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM7]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN7:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM7]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN8:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM8]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN8:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM8]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_ADD_I32_7:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR4]], -217, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_7:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR4]], -217, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_8:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -233, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_8:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -233, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_9:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR5]], -249, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_9:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR5]], -249, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_10:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM3]], -297, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_10:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM3]], -297, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_11:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -313, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_11:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -313, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_12:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -329, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_12:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -329, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_13:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -345, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_13:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -345, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_14:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR6]], -441, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_14:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR6]], -441, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_U32_3:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], 160, implicit-def $scc		; CHECK-NEXT: [[S_ADD_U32_3:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], 160, implicit-def $scc
; CHECK-NEXT: [[S_ADDC_U32_3:%[0-9]+]]:sreg_32 = S_ADDC_U32 undef %36:sreg_32, 0, implicit-def dead $scc, implicit $scc		; CHECK-NEXT: [[S_ADDC_U32_3:%[0-9]+]]:sreg_32 = S_ADDC_U32 undef %36:sreg_32, 0, implicit-def dead $scc, implicit $scc
; CHECK-NEXT: undef %411.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_3]], [[S_LSHL_B32_2]], implicit-def $scc		; CHECK-NEXT: undef %411.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_3]], [[S_LSHL_B32_2]], implicit-def $scc
; CHECK-NEXT: %411.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_3]], [[S_ASHR_I32_2]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %411.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_3]], [[S_ASHR_I32_2]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[S_LSHL_B32_4:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY11]], 4, implicit-def dead $scc		; CHECK-NEXT: [[S_LSHL_B32_4:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY11]], 4, implicit-def dead $scc
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN9:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM10]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN9:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM10]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_ASHR_I32_4:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_4]], 31, implicit-def dead $scc		; CHECK-NEXT: [[S_ASHR_I32_4:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_4]], 31, implicit-def dead $scc
; CHECK-NEXT: undef %425.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_3]], [[S_LSHL_B32_4]], implicit-def $scc		; CHECK-NEXT: undef %425.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_3]], [[S_LSHL_B32_4]], implicit-def $scc
; CHECK-NEXT: %425.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_3]], [[S_ASHR_I32_4]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %425.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_3]], [[S_ASHR_I32_4]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[S_ADD_U32_4:%[0-9]+]]:sreg_32 = S_ADD_U32 %56.sub0, 168, implicit-def $scc		; CHECK-NEXT: [[S_ADD_U32_4:%[0-9]+]]:sreg_32 = S_ADD_U32 %56.sub0, 168, implicit-def $scc
; CHECK-NEXT: [[S_ADDC_U32_4:%[0-9]+]]:sreg_32 = S_ADDC_U32 undef %57:sreg_32, 0, implicit-def dead $scc, implicit $scc		; CHECK-NEXT: [[S_ADDC_U32_4:%[0-9]+]]:sreg_32 = S_ADDC_U32 undef %57:sreg_32, 0, implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM13:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %241, 0, 0 :: (load (s128) from %ir.162, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM13:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %241, 0, 0 :: (dereferenceable invariant load (s128) from %ir.162, addrspace 4)
; CHECK-NEXT: [[S_LSHL_B32_5:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY4]], 3, implicit-def dead $scc		; CHECK-NEXT: [[S_LSHL_B32_5:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY4]], 3, implicit-def dead $scc
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN10:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM11]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN10:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM11]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_ASHR_I32_5:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_5]], 31, implicit-def dead $scc		; CHECK-NEXT: [[S_ASHR_I32_5:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_5]], 31, implicit-def dead $scc
; CHECK-NEXT: undef %441.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_4]], [[S_LSHL_B32_5]], implicit-def $scc		; CHECK-NEXT: undef %441.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_4]], [[S_LSHL_B32_5]], implicit-def $scc
; CHECK-NEXT: %441.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_4]], [[S_ASHR_I32_5]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %441.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_4]], [[S_ASHR_I32_5]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %441, 0, 0 :: (load (s32) from %ir..i085.i, align 8, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %441, 0, 0 :: (dereferenceable invariant load (s32) from %ir..i085.i, align 8, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM14:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %253, 0, 0 :: (load (s128) from %ir.170, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM14:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %253, 0, 0 :: (dereferenceable invariant load (s128) from %ir.170, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN11:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM12]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN11:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM12]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM15:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %261, 0, 0 :: (load (s128) from %ir.176, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM15:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %261, 0, 0 :: (dereferenceable invariant load (s128) from %ir.176, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN12:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM9]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN12:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM9]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN13:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM13]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN13:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM13]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: %71.sub3:sgpr_128 = S_MOV_B32 553734060		; CHECK-NEXT: %71.sub3:sgpr_128 = S_MOV_B32 553734060
; CHECK-NEXT: %71.sub2:sgpr_128 = S_MOV_B32 -1		; CHECK-NEXT: %71.sub2:sgpr_128 = S_MOV_B32 -1
; CHECK-NEXT: [[COPY13:%[0-9]+]]:sgpr_128 = COPY %71		; CHECK-NEXT: [[COPY13:%[0-9]+]]:sgpr_128 = COPY %71
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM16:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %273, 0, 0 :: (load (s128) from %ir.185, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM16:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %273, 0, 0 :: (dereferenceable invariant load (s128) from %ir.185, addrspace 4)
; CHECK-NEXT: [[COPY13]].sub1:sgpr_128 = COPY %302.sub1		; CHECK-NEXT: [[COPY13]].sub1:sgpr_128 = COPY %302.sub1
; CHECK-NEXT: [[COPY13]].sub0:sgpr_128 = COPY [[S_LOAD_DWORD_IMM]]		; CHECK-NEXT: [[COPY13]].sub0:sgpr_128 = COPY [[S_LOAD_DWORD_IMM]]
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM4:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM [[COPY13]], 0, 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM4:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM [[COPY13]], 0, 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN14:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM14]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN14:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM14]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN15:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM15]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN15:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM15]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM17:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %286, 0, 0 :: (load (s128) from %ir.194, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM17:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %286, 0, 0 :: (dereferenceable invariant load (s128) from %ir.194, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM18:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %293, 0, 0 :: (load (s128) from %ir.200, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM18:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %293, 0, 0 :: (dereferenceable invariant load (s128) from %ir.200, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN16:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM16]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN16:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM16]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_LSHL_B32_6:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY3]], 3, implicit-def dead $scc		; CHECK-NEXT: [[S_LSHL_B32_6:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY3]], 3, implicit-def dead $scc
; CHECK-NEXT: [[BUFFER_LOAD_DWORD_OFFSET1:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_OFFSET [[S_LOAD_DWORDX4_IMM1]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_DWORD_OFFSET1:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_OFFSET [[S_LOAD_DWORDX4_IMM1]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_ASHR_I32_6:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_6]], 31, implicit-def dead $scc		; CHECK-NEXT: [[S_ASHR_I32_6:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_6]], 31, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_15:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM4]], -467, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_15:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM4]], -467, implicit-def dead $scc
; CHECK-NEXT: undef %453.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_4]], [[S_LSHL_B32_6]], implicit-def $scc		; CHECK-NEXT: undef %453.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_4]], [[S_LSHL_B32_6]], implicit-def $scc
; CHECK-NEXT: %453.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_4]], [[S_ASHR_I32_6]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %453.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_4]], [[S_ASHR_I32_6]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[S_LOAD_DWORDX2_IMM:%[0-9]+]]:sreg_64_xexec = S_LOAD_DWORDX2_IMM %453, 0, 0 :: (load (s64) from %ir.308, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX2_IMM:%[0-9]+]]:sreg_64_xexec = S_LOAD_DWORDX2_IMM %453, 0, 0 :: (dereferenceable invariant load (s64) from %ir.308, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_DWORD_OFFSET2:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_OFFSET [[S_LOAD_DWORDX4_IMM17]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_DWORD_OFFSET2:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_OFFSET [[S_LOAD_DWORDX4_IMM17]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_DWORD_OFFSET3:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_OFFSET [[S_LOAD_DWORDX4_IMM18]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_DWORD_OFFSET3:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_OFFSET [[S_LOAD_DWORDX4_IMM18]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM19:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %327, 0, 0 :: (load (s128) from %ir.223, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM19:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %327, 0, 0 :: (dereferenceable invariant load (s128) from %ir.223, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM20:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %335, 0, 0 :: (load (s128) from %ir.230, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM20:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %335, 0, 0 :: (dereferenceable invariant load (s128) from %ir.230, addrspace 4)
; CHECK-NEXT: [[COPY14:%[0-9]+]]:sgpr_128 = COPY %71		; CHECK-NEXT: [[COPY14:%[0-9]+]]:sgpr_128 = COPY %71
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM21:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %343, 0, 0 :: (load (s128) from %ir.236, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM21:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %343, 0, 0 :: (dereferenceable invariant load (s128) from %ir.236, addrspace 4)
; CHECK-NEXT: [[S_AND_B32_:%[0-9]+]]:sreg_32 = S_AND_B32 [[S_LOAD_DWORDX2_IMM]].sub1, 65535, implicit-def dead $scc		; CHECK-NEXT: [[S_AND_B32_:%[0-9]+]]:sreg_32 = S_AND_B32 [[S_LOAD_DWORDX2_IMM]].sub1, 65535, implicit-def dead $scc
; CHECK-NEXT: [[COPY14]].sub0:sgpr_128 = COPY [[S_LOAD_DWORDX2_IMM]].sub0		; CHECK-NEXT: [[COPY14]].sub0:sgpr_128 = COPY [[S_LOAD_DWORDX2_IMM]].sub0
; CHECK-NEXT: [[COPY14]].sub1:sgpr_128 = COPY [[S_AND_B32_]]		; CHECK-NEXT: [[COPY14]].sub1:sgpr_128 = COPY [[S_AND_B32_]]
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM5:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM [[COPY14]], 0, 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM5:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM [[COPY14]], 0, 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM22:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %351, 0, 0 :: (load (s128) from %ir.242, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM22:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %351, 0, 0 :: (dereferenceable invariant load (s128) from %ir.242, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN17:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM19]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN17:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM19]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN18:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM20]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN18:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM20]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_LSHL_B32_7:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY2]], 3, implicit-def dead $scc		; CHECK-NEXT: [[S_LSHL_B32_7:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY2]], 3, implicit-def dead $scc
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN19:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM21]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN19:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM21]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_ASHR_I32_7:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_7]], 31, implicit-def dead $scc		; CHECK-NEXT: [[S_ASHR_I32_7:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_7]], 31, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_16:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM5]], -468, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_16:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM5]], -468, implicit-def dead $scc
; CHECK-NEXT: undef %468.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_4]], [[S_LSHL_B32_7]], implicit-def $scc		; CHECK-NEXT: undef %468.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_4]], [[S_LSHL_B32_7]], implicit-def $scc
; CHECK-NEXT: %468.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_4]], [[S_ASHR_I32_7]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %468.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_4]], [[S_ASHR_I32_7]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN20:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM22]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN20:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM22]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORDX2_IMM1:%[0-9]+]]:sreg_64_xexec = S_LOAD_DWORDX2_IMM %468, 0, 0 :: (load (s64) from %ir.320, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX2_IMM1:%[0-9]+]]:sreg_64_xexec = S_LOAD_DWORDX2_IMM %468, 0, 0 :: (dereferenceable invariant load (s64) from %ir.320, addrspace 4)
; CHECK-NEXT: [[COPY15:%[0-9]+]]:sgpr_128 = COPY %71		; CHECK-NEXT: [[COPY15:%[0-9]+]]:sgpr_128 = COPY %71
; CHECK-NEXT: [[S_AND_B32_1:%[0-9]+]]:sreg_32 = S_AND_B32 [[S_LOAD_DWORDX2_IMM1]].sub1, 65535, implicit-def dead $scc		; CHECK-NEXT: [[S_AND_B32_1:%[0-9]+]]:sreg_32 = S_AND_B32 [[S_LOAD_DWORDX2_IMM1]].sub1, 65535, implicit-def dead $scc
; CHECK-NEXT: [[COPY15]].sub0:sgpr_128 = COPY [[S_LOAD_DWORDX2_IMM1]].sub0		; CHECK-NEXT: [[COPY15]].sub0:sgpr_128 = COPY [[S_LOAD_DWORDX2_IMM1]].sub0
; CHECK-NEXT: [[COPY15]].sub1:sgpr_128 = COPY [[S_AND_B32_1]]		; CHECK-NEXT: [[COPY15]].sub1:sgpr_128 = COPY [[S_AND_B32_1]]
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM6:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM [[COPY15]], 0, 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM6:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM [[COPY15]], 0, 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM23:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %411, 0, 0 :: (load (s128) from %ir.282, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM23:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %411, 0, 0 :: (dereferenceable invariant load (s128) from %ir.282, addrspace 4)
; CHECK-NEXT: [[S_LOAD_DWORD_IMM1:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %488:sreg_64, 0, 0 :: (load (s32) from `i32 addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORD_IMM1:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM undef %488:sreg_64, 0, 0 :: (dereferenceable invariant load (s32) from `i32 addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: KILL %411.sub0, %411.sub1		; CHECK-NEXT: KILL %411.sub0, %411.sub1
; CHECK-NEXT: KILL undef %488:sreg_64		; CHECK-NEXT: KILL undef %488:sreg_64
; CHECK-NEXT: KILL [[COPY15]].sub0_sub1, [[COPY15]].sub2_sub3		; CHECK-NEXT: KILL [[COPY15]].sub0_sub1, [[COPY15]].sub2_sub3
; CHECK-NEXT: [[S_LSHL_B32_8:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY12]], 3, implicit-def dead $scc		; CHECK-NEXT: [[S_LSHL_B32_8:%[0-9]+]]:sreg_32 = S_LSHL_B32 [[COPY12]], 3, implicit-def dead $scc
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM24:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %425, 0, 0 :: (load (s128) from %ir.291, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM24:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %425, 0, 0 :: (dereferenceable invariant load (s128) from %ir.291, addrspace 4)
; CHECK-NEXT: [[S_ASHR_I32_8:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_8]], 31, implicit-def dead $scc		; CHECK-NEXT: [[S_ASHR_I32_8:%[0-9]+]]:sreg_32_xm0 = S_ASHR_I32 [[S_LSHL_B32_8]], 31, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_17:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM6]], -469, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_17:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM6]], -469, implicit-def dead $scc
; CHECK-NEXT: undef %485.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_4]], [[S_LSHL_B32_8]], implicit-def $scc		; CHECK-NEXT: undef %485.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_4]], [[S_LSHL_B32_8]], implicit-def $scc
; CHECK-NEXT: %485.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_4]], [[S_ASHR_I32_8]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %485.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_4]], [[S_ASHR_I32_8]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[S_LOAD_DWORD_IMM2:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %485, 0, 0 :: (load (s32) from %ir..i0100.i, align 8, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORD_IMM2:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %485, 0, 0 :: (dereferenceable invariant load (s32) from %ir..i0100.i, align 8, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN21:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM23]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN21:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM23]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN22:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM24]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN22:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM24]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: KILL [[S_LOAD_DWORDX4_IMM24]]		; CHECK-NEXT: KILL [[S_LOAD_DWORDX4_IMM24]]
; CHECK-NEXT: KILL [[S_LOAD_DWORDX4_IMM23]]		; CHECK-NEXT: KILL [[S_LOAD_DWORDX4_IMM23]]
; CHECK-NEXT: [[S_AND_B32_2:%[0-9]+]]:sreg_32 = S_AND_B32 [[S_LOAD_DWORD_IMM1]], 65535, implicit-def dead $scc		; CHECK-NEXT: [[S_AND_B32_2:%[0-9]+]]:sreg_32 = S_AND_B32 [[S_LOAD_DWORD_IMM1]], 65535, implicit-def dead $scc
; CHECK-NEXT: [[COPY16:%[0-9]+]]:sgpr_128 = COPY %71		; CHECK-NEXT: [[COPY16:%[0-9]+]]:sgpr_128 = COPY %71
; CHECK-NEXT: [[COPY16]].sub1:sgpr_128 = COPY [[S_AND_B32_2]]		; CHECK-NEXT: [[COPY16]].sub1:sgpr_128 = COPY [[S_AND_B32_2]]
; CHECK-NEXT: [[COPY16]].sub0:sgpr_128 = COPY [[S_LOAD_DWORD_IMM2]]		; CHECK-NEXT: [[COPY16]].sub0:sgpr_128 = COPY [[S_LOAD_DWORD_IMM2]]
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM7:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM [[COPY16]], 0, 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM7:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM [[COPY16]], 0, 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[S_ADD_I32_18:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM]], -474, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_18:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM]], -474, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_19:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -475, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_19:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -475, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_20:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -491, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_20:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -491, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_21:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -507, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_21:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -507, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_22:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -539, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_22:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_SGPR3]], -539, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_I32_23:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM7]], -473, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_23:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM7]], -473, implicit-def dead $scc
; CHECK-NEXT: [[S_ADD_U32_5:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], 96, implicit-def $scc		; CHECK-NEXT: [[S_ADD_U32_5:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], 96, implicit-def $scc
; CHECK-NEXT: [[S_ADDC_U32_5:%[0-9]+]]:sreg_32 = S_ADDC_U32 undef %33:sreg_32, 0, implicit-def dead $scc, implicit $scc		; CHECK-NEXT: [[S_ADDC_U32_5:%[0-9]+]]:sreg_32 = S_ADDC_U32 undef %33:sreg_32, 0, implicit-def dead $scc, implicit $scc
; CHECK-NEXT: undef %514.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_5]], [[S_LSHL_B32_]], implicit-def $scc		; CHECK-NEXT: undef %514.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_5]], [[S_LSHL_B32_]], implicit-def $scc
; CHECK-NEXT: %514.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_5]], [[S_ASHR_I32_]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %514.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_5]], [[S_ASHR_I32_]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM25:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %514, 0, 0 :: (load (s128) from %ir.351, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM25:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %514, 0, 0 :: (dereferenceable invariant load (s128) from %ir.351, addrspace 4)
; CHECK-NEXT: undef %522.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_5]], [[S_LSHL_B32_1]], implicit-def $scc		; CHECK-NEXT: undef %522.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_5]], [[S_LSHL_B32_1]], implicit-def $scc
; CHECK-NEXT: %522.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_5]], [[S_ASHR_I32_1]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %522.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_5]], [[S_ASHR_I32_1]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM26:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %522, 0, 0 :: (load (s128) from %ir.357, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM26:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %522, 0, 0 :: (dereferenceable invariant load (s128) from %ir.357, addrspace 4)
; CHECK-NEXT: undef %530.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_5]], [[S_LSHL_B32_2]], implicit-def $scc		; CHECK-NEXT: undef %530.sub0:sreg_64 = S_ADD_U32 [[S_ADD_U32_5]], [[S_LSHL_B32_2]], implicit-def $scc
; CHECK-NEXT: %530.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_5]], [[S_ASHR_I32_2]], implicit-def dead $scc, implicit $scc		; CHECK-NEXT: %530.sub1:sreg_64 = S_ADDC_U32 [[S_ADDC_U32_5]], [[S_ASHR_I32_2]], implicit-def dead $scc, implicit $scc
; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM27:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %530, 0, 0 :: (load (s128) from %ir.363, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX4_IMM27:%[0-9]+]]:sgpr_128 = S_LOAD_DWORDX4_IMM %530, 0, 0 :: (dereferenceable invariant load (s128) from %ir.363, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN23:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM25]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN23:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM25]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN24:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM26]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN24:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM26]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN25:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM27]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)		; CHECK-NEXT: [[BUFFER_LOAD_FORMAT_X_IDXEN25:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_FORMAT_X_IDXEN [[V_MOV_B32_e32_]], [[S_LOAD_DWORDX4_IMM27]], 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s32) from custom "BufferResource", align 1, addrspace 4)
; CHECK-NEXT: KILL [[S_LOAD_DWORDX4_IMM27]]		; CHECK-NEXT: KILL [[S_LOAD_DWORDX4_IMM27]]
; CHECK-NEXT: KILL [[S_LOAD_DWORDX4_IMM25]]		; CHECK-NEXT: KILL [[S_LOAD_DWORDX4_IMM25]]
; CHECK-NEXT: KILL [[V_MOV_B32_e32_]]		; CHECK-NEXT: KILL [[V_MOV_B32_e32_]]
; CHECK-NEXT: KILL [[S_LOAD_DWORDX4_IMM26]]		; CHECK-NEXT: KILL [[S_LOAD_DWORDX4_IMM26]]
; CHECK-NEXT: [[V_ADD_U32_e32_:%[0-9]+]]:vgpr_32 = V_ADD_U32_e32 -2, [[BUFFER_LOAD_FORMAT_X_IDXEN]], implicit $exec		; CHECK-NEXT: [[V_ADD_U32_e32_:%[0-9]+]]:vgpr_32 = V_ADD_U32_e32 -2, [[BUFFER_LOAD_FORMAT_X_IDXEN]], implicit $exec
[102 unchanged lines not shown]	define amdgpu_gs void @_amdgpu_gs_main(i32 inreg %primShaderTableAddrLow, <31 x i32> inreg %userData) {
; CHECK-NEXT: [[V_OR_B32_e32_62:%[0-9]+]]:vgpr_32 = V_OR_B32_e32 [[V_OR_B32_e32_61]], [[V_ADD_U32_e32_26]], implicit $exec		; CHECK-NEXT: [[V_OR_B32_e32_62:%[0-9]+]]:vgpr_32 = V_OR_B32_e32 [[V_OR_B32_e32_61]], [[V_ADD_U32_e32_26]], implicit $exec
; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM8:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM %71, 0, 0 :: (dereferenceable invariant load (s32))		; CHECK-NEXT: [[S_BUFFER_LOAD_DWORD_IMM8:%[0-9]+]]:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM %71, 0, 0 :: (dereferenceable invariant load (s32))
; CHECK-NEXT: [[V_ADD_U32_e32_28:%[0-9]+]]:vgpr_32 = V_ADD_U32_e32 -576, [[BUFFER_LOAD_FORMAT_X_IDXEN]], implicit $exec		; CHECK-NEXT: [[V_ADD_U32_e32_28:%[0-9]+]]:vgpr_32 = V_ADD_U32_e32 -576, [[BUFFER_LOAD_FORMAT_X_IDXEN]], implicit $exec
; CHECK-NEXT: [[V_OR_B32_e32_63:%[0-9]+]]:vgpr_32 = V_OR_B32_e32 [[V_OR_B32_e32_62]], [[V_ADD_U32_e32_27]], implicit $exec		; CHECK-NEXT: [[V_OR_B32_e32_63:%[0-9]+]]:vgpr_32 = V_OR_B32_e32 [[V_OR_B32_e32_62]], [[V_ADD_U32_e32_27]], implicit $exec
; CHECK-NEXT: [[V_ADD_U32_e32_29:%[0-9]+]]:vgpr_32 = V_ADD_U32_e32 -577, [[BUFFER_LOAD_FORMAT_X_IDXEN]], implicit $exec		; CHECK-NEXT: [[V_ADD_U32_e32_29:%[0-9]+]]:vgpr_32 = V_ADD_U32_e32 -577, [[BUFFER_LOAD_FORMAT_X_IDXEN]], implicit $exec
; CHECK-NEXT: [[V_OR_B32_e32_64:%[0-9]+]]:vgpr_32 = V_OR_B32_e32 [[V_OR_B32_e32_63]], [[V_ADD_U32_e32_28]], implicit $exec		; CHECK-NEXT: [[V_OR_B32_e32_64:%[0-9]+]]:vgpr_32 = V_OR_B32_e32 [[V_OR_B32_e32_63]], [[V_ADD_U32_e32_28]], implicit $exec
; CHECK-NEXT: [[V_ADD_U32_e32_30:%[0-9]+]]:vgpr_32 = V_ADD_U32_e32 -593, [[BUFFER_LOAD_FORMAT_X_IDXEN]], implicit $exec		; CHECK-NEXT: [[V_ADD_U32_e32_30:%[0-9]+]]:vgpr_32 = V_ADD_U32_e32 -593, [[BUFFER_LOAD_FORMAT_X_IDXEN]], implicit $exec
; CHECK-NEXT: [[V_OR_B32_e32_65:%[0-9]+]]:vgpr_32 = V_OR_B32_e32 [[V_OR_B32_e32_64]], [[V_ADD_U32_e32_29]], implicit $exec		; CHECK-NEXT: [[V_OR_B32_e32_65:%[0-9]+]]:vgpr_32 = V_OR_B32_e32 [[V_OR_B32_e32_64]], [[V_ADD_U32_e32_29]], implicit $exec
; CHECK-NEXT: [[S_LOAD_DWORDX8_IMM:%[0-9]+]]:sgpr_256 = S_LOAD_DWORDX8_IMM undef %564:sreg_64, 0, 0 :: (load (s256) from `<8 x i32> addrspace(4)* undef`, addrspace 4)		; CHECK-NEXT: [[S_LOAD_DWORDX8_IMM:%[0-9]+]]:sgpr_256 = S_LOAD_DWORDX8_IMM undef %564:sreg_64, 0, 0 :: (dereferenceable invariant load (s256) from `<8 x i32> addrspace(4)* undef`, addrspace 4)
; CHECK-NEXT: [[V_OR_B32_e32_66:%[0-9]+]]:vgpr_32 = V_OR_B32_e32 [[V_OR_B32_e32_65]], [[V_ADD_U32_e32_30]], implicit $exec		; CHECK-NEXT: [[V_OR_B32_e32_66:%[0-9]+]]:vgpr_32 = V_OR_B32_e32 [[V_OR_B32_e32_65]], [[V_ADD_U32_e32_30]], implicit $exec
; CHECK-NEXT: [[S_ADD_I32_24:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM8]], -594, implicit-def dead $scc		; CHECK-NEXT: [[S_ADD_I32_24:%[0-9]+]]:sreg_32 = S_ADD_I32 [[S_BUFFER_LOAD_DWORD_IMM8]], -594, implicit-def dead $scc
; CHECK-NEXT: [[V_OR_B32_e32_67:%[0-9]+]]:vgpr_32 = V_OR_B32_e32 [[S_ADD_I32_24]], [[V_OR_B32_e32_66]], implicit $exec		; CHECK-NEXT: [[V_OR_B32_e32_67:%[0-9]+]]:vgpr_32 = V_OR_B32_e32 [[S_ADD_I32_24]], [[V_OR_B32_e32_66]], implicit $exec
; CHECK-NEXT: [[V_CMP_EQ_U32_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U32_e64 0, [[V_OR_B32_e32_67]], implicit $exec		; CHECK-NEXT: [[V_CMP_EQ_U32_e64_:%[0-9]+]]:sreg_32_xm0_xexec = V_CMP_EQ_U32_e64 0, [[V_OR_B32_e32_67]], implicit $exec
; CHECK-NEXT: undef %692.sub3:vreg_128 = V_CNDMASK_B32_e64 0, 0, 0, 1, [[V_CMP_EQ_U32_e64_]], implicit $exec		; CHECK-NEXT: undef %692.sub3:vreg_128 = V_CNDMASK_B32_e64 0, 0, 0, 1, [[V_CMP_EQ_U32_e64_]], implicit $exec
; CHECK-NEXT: IMAGE_STORE_V4_V2_gfx10 %692, undef %578:vreg_64, [[S_LOAD_DWORDX8_IMM]], 15, 1, -1, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store (s128) into custom "ImageResource")		; CHECK-NEXT: IMAGE_STORE_V4_V2_gfx10 %692, undef %578:vreg_64, [[S_LOAD_DWORDX8_IMM]], 15, 1, -1, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store (s128) into custom "ImageResource")
; CHECK-NEXT: S_ENDPGM 0		; CHECK-NEXT: S_ENDPGM 0
.expVert:		.expVert:

llvm/test/CodeGen/AMDGPU/twoaddr-constrain.ll

	; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
	; RUN: llc -global-isel -march=amdgcn -mcpu=gfx900 -verify-machineinstrs -stop-after twoaddressinstruction < %s | FileCheck %s			; RUN: llc -global-isel -march=amdgcn -mcpu=gfx900 -verify-machineinstrs -stop-after twoaddressinstruction < %s | FileCheck %s

	; Check that %16 gets constrained to register class sgpr_96_with_sub0_sub1.			; Check that %16 gets constrained to register class sgpr_96_with_sub0_sub1.
	define amdgpu_ps <3 x i32> @s_load_constant_v3i32_align4(<3 x i32> addrspace(4)* inreg %ptr) {			define amdgpu_ps <3 x i32> @s_load_constant_v3i32_align4(<3 x i32> addrspace(4)* inreg %ptr) {
	; CHECK-LABEL: name: s_load_constant_v3i32_align4			; CHECK-LABEL: name: s_load_constant_v3i32_align4
	; CHECK: bb.0 (%ir-block.0):			; CHECK: bb.0 (%ir-block.0):
	; CHECK-NEXT: liveins: $sgpr0, $sgpr1			; CHECK-NEXT: liveins: $sgpr0, $sgpr1
	; CHECK-NEXT: {{ $}}			; CHECK-NEXT: {{ $}}
	; CHECK-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY killed $sgpr0			; CHECK-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY killed $sgpr0
	; CHECK-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY killed $sgpr1			; CHECK-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY killed $sgpr1
	; CHECK-NEXT: undef %0.sub0:sreg_64 = COPY killed [[COPY]]			; CHECK-NEXT: undef %0.sub0:sreg_64 = COPY killed [[COPY]]
	; CHECK-NEXT: %0.sub1:sreg_64 = COPY killed [[COPY1]]			; CHECK-NEXT: %0.sub1:sreg_64 = COPY killed [[COPY1]]
	; CHECK-NEXT: [[S_LOAD_DWORDX2_IMM:%[0-9]+]]:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 0, 0 :: (load (<2 x s32>) from %ir.ptr, align 4, addrspace 4)			; CHECK-NEXT: [[S_LOAD_DWORDX2_IMM:%[0-9]+]]:sreg_64_xexec = S_LOAD_DWORDX2_IMM %0, 0, 0 :: (dereferenceable invariant load (<2 x s32>) from %ir.ptr, align 4, addrspace 4)
	; CHECK-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM killed %0, 8, 0 :: (load (s32) from %ir.ptr + 8, addrspace 4)			; CHECK-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM killed %0, 8, 0 :: (dereferenceable invariant load (s32) from %ir.ptr + 8, addrspace 4)
	; CHECK-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_LOAD_DWORDX2_IMM]].sub0			; CHECK-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_LOAD_DWORDX2_IMM]].sub0
	; CHECK-NEXT: $sgpr0 = COPY killed [[COPY2]]			; CHECK-NEXT: $sgpr0 = COPY killed [[COPY2]]
	; CHECK-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY killed [[S_LOAD_DWORDX2_IMM]].sub1			; CHECK-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY killed [[S_LOAD_DWORDX2_IMM]].sub1
	; CHECK-NEXT: $sgpr1 = COPY killed [[COPY3]]			; CHECK-NEXT: $sgpr1 = COPY killed [[COPY3]]
	; CHECK-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY killed [[S_LOAD_DWORD_IMM]]			; CHECK-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY killed [[S_LOAD_DWORD_IMM]]
	; CHECK-NEXT: $sgpr2 = COPY killed [[COPY4]]			; CHECK-NEXT: $sgpr2 = COPY killed [[COPY4]]
	; CHECK-NEXT: SI_RETURN_TO_EPILOG implicit killed $sgpr0, implicit killed $sgpr1, implicit killed $sgpr2			; CHECK-NEXT: SI_RETURN_TO_EPILOG implicit killed $sgpr0, implicit killed $sgpr1, implicit killed $sgpr2
	%load = load <3 x i32>, <3 x i32> addrspace(4)* %ptr, align 4			%load = load <3 x i32>, <3 x i32> addrspace(4)* %ptr, align 4
	ret <3 x i32> %load			ret <3 x i32> %load
	}			}

llvm/test/CodeGen/AMDGPU/vgpr-liverange-ir.ll

define protected amdgpu_kernel void @nested_waterfalls(%tex* addrspace(1)* %tex.coerce) local_unnamed_addr {
; SI-NEXT: bb.1.if.then:		; SI-NEXT: bb.1.if.then:
; SI-NEXT: successors: %bb.2(0x80000000)		; SI-NEXT: successors: %bb.2(0x80000000)
; SI-NEXT: {{ $}}		; SI-NEXT: {{ $}}
; SI-NEXT: [[V_LSHLREV_B64_e64_:%[0-9]+]]:vreg_64 = V_LSHLREV_B64_e64 3, killed [[REG_SEQUENCE]], implicit $exec		; SI-NEXT: [[V_LSHLREV_B64_e64_:%[0-9]+]]:vreg_64 = V_LSHLREV_B64_e64 3, killed [[REG_SEQUENCE]], implicit $exec
; SI-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[S_LOAD_DWORDX2_IMM]].sub0, [[V_LSHLREV_B64_e64_]].sub0, 0, implicit $exec		; SI-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[S_LOAD_DWORDX2_IMM]].sub0, [[V_LSHLREV_B64_e64_]].sub0, 0, implicit $exec
; SI-NEXT: %69:vgpr_32, dead %71:sreg_32_xm0_xexec = V_ADDC_U32_e64 killed [[S_LOAD_DWORDX2_IMM]].sub1, killed [[V_LSHLREV_B64_e64_]].sub1, killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec		; SI-NEXT: %69:vgpr_32, dead %71:sreg_32_xm0_xexec = V_ADDC_U32_e64 killed [[S_LOAD_DWORDX2_IMM]].sub1, killed [[V_LSHLREV_B64_e64_]].sub1, killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
; SI-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE killed [[V_ADD_CO_U32_e64_]], %subreg.sub0, killed %69, %subreg.sub1		; SI-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE killed [[V_ADD_CO_U32_e64_]], %subreg.sub0, killed %69, %subreg.sub1
; SI-NEXT: [[GLOBAL_LOAD_DWORDX2_:%[0-9]+]]:vreg_64 = GLOBAL_LOAD_DWORDX2 killed [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s64) from %ir.idx, addrspace 1)		; SI-NEXT: [[GLOBAL_LOAD_DWORDX2_:%[0-9]+]]:vreg_64 = GLOBAL_LOAD_DWORDX2 killed [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s64) from %ir.idx, addrspace 1)
; SI-NEXT: [[GLOBAL_LOAD_DWORDX4_:%[0-9]+]]:vreg_128 = GLOBAL_LOAD_DWORDX4 [[GLOBAL_LOAD_DWORDX2_]], 16, 0, implicit $exec :: (load (s128) from %ir.6 + 16, addrspace 4)		; SI-NEXT: [[GLOBAL_LOAD_DWORDX4_:%[0-9]+]]:vreg_128 = GLOBAL_LOAD_DWORDX4 [[GLOBAL_LOAD_DWORDX2_]], 16, 0, implicit $exec :: (dereferenceable invariant load (s128) from %ir.6 + 16, addrspace 4)
; SI-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX4_]].sub3		; SI-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX4_]].sub3
; SI-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX4_]].sub2		; SI-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX4_]].sub2
; SI-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX4_]].sub1		; SI-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX4_]].sub1
; SI-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY killed [[GLOBAL_LOAD_DWORDX4_]].sub0		; SI-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY killed [[GLOBAL_LOAD_DWORDX4_]].sub0
; SI-NEXT: [[GLOBAL_LOAD_DWORDX4_1:%[0-9]+]]:vreg_128 = GLOBAL_LOAD_DWORDX4 [[GLOBAL_LOAD_DWORDX2_]], 0, 0, implicit $exec :: (load (s128) from %ir.6, align 32, addrspace 4)		; SI-NEXT: [[GLOBAL_LOAD_DWORDX4_1:%[0-9]+]]:vreg_128 = GLOBAL_LOAD_DWORDX4 [[GLOBAL_LOAD_DWORDX2_]], 0, 0, implicit $exec :: (dereferenceable invariant load (s128) from %ir.6, align 32, addrspace 4)
; SI-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX4_1]].sub3		; SI-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX4_1]].sub3
; SI-NEXT: [[COPY7:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX4_1]].sub2		; SI-NEXT: [[COPY7:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX4_1]].sub2
; SI-NEXT: [[COPY8:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX4_1]].sub1		; SI-NEXT: [[COPY8:%[0-9]+]]:vgpr_32 = COPY [[GLOBAL_LOAD_DWORDX4_1]].sub1
; SI-NEXT: [[COPY9:%[0-9]+]]:vgpr_32 = COPY killed [[GLOBAL_LOAD_DWORDX4_1]].sub0		; SI-NEXT: [[COPY9:%[0-9]+]]:vgpr_32 = COPY killed [[GLOBAL_LOAD_DWORDX4_1]].sub0
; SI-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:vreg_256 = REG_SEQUENCE killed [[COPY9]], %subreg.sub0, killed [[COPY8]], %subreg.sub1, killed [[COPY7]], %subreg.sub2, killed [[COPY6]], %subreg.sub3, killed [[COPY5]], %subreg.sub4, killed [[COPY4]], %subreg.sub5, killed [[COPY3]], %subreg.sub6, killed [[COPY2]], %subreg.sub7		; SI-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:vreg_256 = REG_SEQUENCE killed [[COPY9]], %subreg.sub0, killed [[COPY8]], %subreg.sub1, killed [[COPY7]], %subreg.sub2, killed [[COPY6]], %subreg.sub3, killed [[COPY5]], %subreg.sub4, killed [[COPY4]], %subreg.sub5, killed [[COPY3]], %subreg.sub6, killed [[COPY2]], %subreg.sub7
; SI-NEXT: [[GLOBAL_LOAD_DWORDX4_2:%[0-9]+]]:vreg_128 = GLOBAL_LOAD_DWORDX4 killed [[GLOBAL_LOAD_DWORDX2_]], 48, 0, implicit $exec :: (load (s128) from %ir.8, addrspace 4)		; SI-NEXT: [[GLOBAL_LOAD_DWORDX4_2:%[0-9]+]]:vreg_128 = GLOBAL_LOAD_DWORDX4 killed [[GLOBAL_LOAD_DWORDX2_]], 48, 0, implicit $exec :: (dereferenceable invariant load (s128) from %ir.8, addrspace 4)
; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32_xm0_xexec = S_MOV_B32 $exec_lo		; SI-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32_xm0_xexec = S_MOV_B32 $exec_lo
; SI-NEXT: {{ $}}		; SI-NEXT: {{ $}}
; SI-NEXT: bb.2:		; SI-NEXT: bb.2:
; SI-NEXT: successors: %bb.3(0x80000000)		; SI-NEXT: successors: %bb.3(0x80000000)
; SI-NEXT: {{ $}}		; SI-NEXT: {{ $}}
; SI-NEXT: [[V_READFIRSTLANE_B32_:%[0-9]+]]:sgpr_32 = V_READFIRSTLANE_B32 [[REG_SEQUENCE2]].sub0, implicit $exec		; SI-NEXT: [[V_READFIRSTLANE_B32_:%[0-9]+]]:sgpr_32 = V_READFIRSTLANE_B32 [[REG_SEQUENCE2]].sub0, implicit $exec
; SI-NEXT: [[V_READFIRSTLANE_B32_1:%[0-9]+]]:sgpr_32 = V_READFIRSTLANE_B32 [[REG_SEQUENCE2]].sub1, implicit $exec		; SI-NEXT: [[V_READFIRSTLANE_B32_1:%[0-9]+]]:sgpr_32 = V_READFIRSTLANE_B32 [[REG_SEQUENCE2]].sub1, implicit $exec
; SI-NEXT: [[REG_SEQUENCE3:%[0-9]+]]:sgpr_64 = REG_SEQUENCE [[V_READFIRSTLANE_B32_]], %subreg.sub0, [[V_READFIRSTLANE_B32_1]], %subreg.sub1		; SI-NEXT: [[REG_SEQUENCE3:%[0-9]+]]:sgpr_64 = REG_SEQUENCE [[V_READFIRSTLANE_B32_]], %subreg.sub0, [[V_READFIRSTLANE_B32_1]], %subreg.sub1

llvm/test/CodeGen/X86/unfoldMemoryOperand.mir

body: |		body: |
; CHECK-LABEL: name: _Z3foov		; CHECK-LABEL: name: _Z3foov
; CHECK: bb.0 (%ir-block.0):		; CHECK: bb.0 (%ir-block.0):
; CHECK-NEXT: successors: %bb.2(0x80000000)		; CHECK-NEXT: successors: %bb.2(0x80000000)
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: renamable $eax = MOV32r0 implicit-def dead $eflags		; CHECK-NEXT: renamable $eax = MOV32r0 implicit-def dead $eflags
; CHECK-NEXT: renamable $rcx = MOV64ri32 -4096		; CHECK-NEXT: renamable $rcx = MOV64ri32 -4096
; CHECK-NEXT: [[MOV64ri32_:%[0-9]+]]:gr64 = MOV64ri32 -4096		; CHECK-NEXT: [[MOV64ri32_:%[0-9]+]]:gr64 = MOV64ri32 -4096
; CHECK-NEXT: [[MOV64rm:%[0-9]+]]:gr64 = MOV64rm $rip, 1, $noreg, @y, $noreg :: (dereferenceable load (s64) from @y, !tbaa !3)		; CHECK-NEXT: [[MOV64rm:%[0-9]+]]:gr64 = MOV64rm $rip, 1, $noreg, @y, $noreg :: (dereferenceable invariant load (s64) from @y, !tbaa !3)
; CHECK-NEXT: JMP_1 %bb.2		; CHECK-NEXT: JMP_1 %bb.2
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.1 (%ir-block.4):		; CHECK-NEXT: bb.1 (%ir-block.4):
; CHECK-NEXT: RET 0		; CHECK-NEXT: RET 0
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.2 (%ir-block.5):		; CHECK-NEXT: bb.2 (%ir-block.5):
; CHECK-NEXT: successors: %bb.1(0x04000000), %bb.2(0x7c000000)		; CHECK-NEXT: successors: %bb.1(0x04000000), %bb.2(0x7c000000)
; CHECK-NEXT: liveins: $eax, $rcx		; CHECK-NEXT: liveins: $eax, $rcx
bb.0 (%ir-block.0):
renamable $rcx = MOV64ri32 -4096		renamable $rcx = MOV64ri32 -4096
JMP_1 %bb.2		JMP_1 %bb.2
bb.1 (%ir-block.4):		bb.1 (%ir-block.4):
RET 0		RET 0
bb.2 (%ir-block.5):		bb.2 (%ir-block.5):
successors: %bb.1(0x04000000), %bb.2(0x7c000000)		successors: %bb.1(0x04000000), %bb.2(0x7c000000)
liveins: $eax, $rcx		liveins: $eax, $rcx
%2:gr64 = MOV64ri32 -4096		%2:gr64 = MOV64ri32 -4096
CMP64mi32 $rip, 1, $noreg, @y, $noreg, @x, implicit-def $eflags :: (dereferenceable load (s64) from @y, !tbaa !3)		CMP64mi32 $rip, 1, $noreg, @y, $noreg, @x, implicit-def $eflags :: (dereferenceable invariant load (s64) from @y, !tbaa !3)
renamable $al = SETCCr 4, implicit killed $eflags, implicit killed $eax, implicit-def $eax		renamable $al = SETCCr 4, implicit killed $eflags, implicit killed $eax, implicit-def $eax
MOV32mr renamable $rcx, 1, $noreg, @z + 4096, $noreg, renamable $eax :: (store (s32) into %ir.scevgep, !tbaa !7)		MOV32mr renamable $rcx, 1, $noreg, @z + 4096, $noreg, renamable $eax :: (store (s32) into %ir.scevgep, !tbaa !7)
renamable $rcx = ADD64ri8 killed renamable $rcx, 4, implicit-def $eflags		renamable $rcx = ADD64ri8 killed renamable $rcx, 4, implicit-def $eflags
JCC_1 %bb.1, 4, implicit killed $eflags		JCC_1 %bb.1, 4, implicit killed $eflags
JMP_1 %bb.2		JMP_1 %bb.2

...		...

Revision Contents

Diff 440198

llvm/include/llvm/CodeGen/GlobalISel/IRTranslator.h

llvm/include/llvm/CodeGen/LiveIntervals.h

llvm/include/llvm/CodeGen/LiveRangeEdit.h

llvm/include/llvm/CodeGen/MachineInstr.h

llvm/include/llvm/CodeGen/SelectionDAG.h

llvm/include/llvm/CodeGen/TargetInstrInfo.h

llvm/lib/CodeGen/CalcSpillWeights.cpp

llvm/lib/CodeGen/EarlyIfConversion.cpp

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp

llvm/lib/CodeGen/InlineSpiller.cpp

llvm/lib/CodeGen/LiveIntervals.cpp

llvm/lib/CodeGen/LiveRangeEdit.cpp

llvm/lib/CodeGen/MLRegallocEvictAdvisor.cpp

llvm/lib/CodeGen/MachineCSE.cpp

llvm/lib/CodeGen/MachineInstr.cpp

llvm/lib/CodeGen/MachineLICM.cpp

llvm/lib/CodeGen/MachinePipeliner.cpp

llvm/lib/CodeGen/RegAllocBasic.cpp

llvm/lib/CodeGen/RegAllocGreedy.h

llvm/lib/CodeGen/RegAllocGreedy.cpp

llvm/lib/CodeGen/RegAllocScore.h

llvm/lib/CodeGen/RegAllocScore.cpp

llvm/lib/CodeGen/RegisterCoalescer.cpp

llvm/lib/CodeGen/ScheduleDAGInstrs.cpp

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

llvm/lib/CodeGen/SplitKit.h

llvm/lib/CodeGen/SplitKit.cpp

llvm/lib/CodeGen/TargetInstrInfo.cpp

llvm/lib/Target/AMDGPU/GCNSchedStrategy.h

llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp

llvm/lib/Target/AMDGPU/SIInstrInfo.h

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp

llvm/lib/Target/ARM/ARMBaseInstrInfo.h

llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp

llvm/lib/Target/PowerPC/PPCInstrInfo.h

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp

llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.h

llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.cpp

llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp

llvm/lib/Target/X86/X86InstrInfo.h

llvm/lib/Target/X86/X86InstrInfo.cpp

llvm/test/CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll

llvm/test/CodeGen/AArch64/arm64-memcpy-inline.ll

llvm/test/CodeGen/AMDGPU/GlobalISel/function-returns.ll

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-amdgpu_vs.ll

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call-non-fixed.ll

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call.ll

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-invariant.ll

llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll

llvm/test/CodeGen/AMDGPU/llc-pipeline.ll

llvm/test/CodeGen/AMDGPU/splitkit-getsubrangeformask.ll

llvm/test/CodeGen/AMDGPU/twoaddr-constrain.ll

llvm/test/CodeGen/AMDGPU/vgpr-liverange-ir.ll

llvm/test/CodeGen/X86/unfoldMemoryOperand.mir
