This is an archive of the discontinued LLVM Phabricator instance.

Differential D136042

[DAG] Enable combineShiftOfShiftedLogic folds after type legalization
Closed · Public

Authored by RKSimon on Oct 16 2022, 10:56 AM.

Details

Reviewers

foad
craig.topper
pengfei
spatel
sdardis
dmgreen
yonghong-song
uweigand

Commits

rG78739fdb4d84: [DAG] Enable combineShiftOfShiftedLogic folds after type legalization

Summary

Running combineShiftOfShiftedLogic after type legalization was disabled to prevent regressions, which now appear to occur only on AMDGPU (at least in our current lit tests).

Fixes #57872
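
For readers unfamiliar with the fold, the comment added to DAGCombiner.cpp in this revision states it as shift(bitop(shift(x,c1),y), c2) -> bitop(shift(x,c1+c2),shift(y,c2)). The following is a minimal standalone sketch, not LLVM code: it checks that identity for left shifts and `or` on plain 32-bit integers, assuming c1 + c2 < 32. The function names `before`, `after` and `foldHolds` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// shl(or(shl(x, c1), y), c2): the expression shape before the fold.
uint32_t before(uint32_t x, uint32_t y, unsigned c1, unsigned c2) {
  return ((x << c1) | y) << c2;
}

// or(shl(x, c1 + c2), shl(y, c2)): the expression shape after the fold.
// Assumes c1 + c2 < 32 so the combined shift amount stays in range.
uint32_t after(uint32_t x, uint32_t y, unsigned c1, unsigned c2) {
  return (x << (c1 + c2)) | (y << c2);
}

// Compares both shapes for every valid (c1, c2) pair over a few sample
// operands. Returns true if they agree everywhere.
bool foldHolds() {
  const uint32_t samples[] = {0u,          1u,          0xFFu,
                              0x12345678u, 0x80000001u, 0xFFFFFFFFu};
  for (uint32_t x : samples)
    for (uint32_t y : samples)
      for (unsigned c1 = 0; c1 < 32; ++c1)
        for (unsigned c2 = 0; c1 + c2 < 32; ++c2)
          if (before(x, y, c1, c2) != after(x, y, c1, c2))
            return false;
  return true;
}
```

The two shapes compute the same value. What changes is the structure of the expression, which is why a target hook such as isDesirableToCommuteWithShift gets a say in whether the rewrite helps or hurts later pattern matching.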

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

RKSimon created this revision. · Oct 16 2022, 10:56 AM

Herald added a project: Restricted Project. · Oct 16 2022, 10:56 AM

Herald added subscribers: kosarev, StephenFan, frasercrmck and 27 others.

RKSimon requested review of this revision. · Oct 16 2022, 10:56 AM

Herald added subscribers: pcwang-thead, MaskRay.

RKSimon added a reviewer: uweigand. · Oct 16 2022, 10:57 AM

Harbormaster completed remote builds in B192405: Diff 468086. · Oct 16 2022, 12:11 PM

The SystemZ change is a clear improvement, so that part LGTM. Thanks!

RKSimon mentioned this in rGefd0d6626943: [AMDGPU] Add regression test cases reported on D136042. · Oct 17 2022, 6:54 AM

RKSimon mentioned this in D136081: [DAG] Fold (sra (or (shl x, c1), (shl y, c2)), c1) -> (sext_inreg (or x, (shl y,c2-c1)) iff c2 >= c1. · Oct 17 2022, 7:22 AM

Thanks. The change looks good to me from BPF perspective as it fixed the regression in https://github.com/llvm/llvm-project/issues/57872.

@foad What do you think about the AMDGPU diffs? Do you want me to focus on getting D136081 done first?

In D136042#3864932, @RKSimon wrote:

@foad What do you think about the AMDGPU diffs? Do you want me to focus on getting D136081 done first?

The diffs look OK to me, modulo one pretty minor regression noted inline.

As for the general approach of putting the smarts in AMDGPUTargetLowering::isDesirableToCommuteWithShift: does AMDGPU have special code to match or(shl(load_zext(),c), load_zext()) and merge it into a wider load? If so, could it be improved to match the case where both loads are shifted? On the other hand, since you've already coded and tested this approach, I think it is fine.
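
To make the pattern in this comment concrete, here is a small sketch on plain integers. It is not the AMDGPU matching code, and the function names are hypothetical. `loadPair` has the shape or(shl(load_zext, 8), load_zext), `combineUnfolded` shifts a whole pair into the high half, and `combineFolded` is what results once the outer shift is pushed into the inner `or`.

```cpp
#include <cassert>
#include <cstdint>

// or(shl(zext(p[1]), 8), zext(p[0])): two zero-extended byte loads that
// together form one little-endian 16-bit value.
uint32_t loadPair(const uint8_t *p) {
  uint32_t lo = p[0];
  uint32_t hi = p[1];
  return (hi << 8) | lo;
}

// shl(loadPair(p + 2), 16) | loadPair(p): the high pair still has the
// or(shl(load, 8), load) shape before it is shifted.
uint32_t combineUnfolded(const uint8_t *p) {
  return (loadPair(p + 2) << 16) | loadPair(p);
}

// After the fold, both bytes of the high pair carry their own shift, so
// the high half no longer contains an or(shl(load, 8), load) node.
uint32_t combineFolded(const uint8_t *p) {
  uint32_t b0 = p[0], b1 = p[1], b2 = p[2], b3 = p[3];
  return ((b3 << 24) | (b2 << 16)) | ((b1 << 8) | b0);
}
```

Both functions return the same 32-bit value. The difference that matters for the discussion is structural: a matcher keyed on the unshifted-load form would no longer recognise the high pair in the folded version, which is the "both loads are shifted" case the comment asks about.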

llvm/test/CodeGen/AMDGPU/sdiv.ll, line 1652 (On Diff #468086): Seems like we are losing a couple of bfe formations here, and D136081 does not fix it.

RKSimon mentioned this in rG42230efccf8f: [DAG] Fold (sra (or (shl x, c1), (shl y, c2)), c1) -> (sext_inreg (or x, (shl y…. · Oct 19 2022, 3:19 AM

RKSimon mentioned this in rG9708d88017d0: Revert rG42230efccf8fe1185be5fa6c23dce0a8183d6ec9 "[DAG] Fold (sra (or (shl x…. · Oct 19 2022, 4:07 AM

Extend AMDGPUTargetLowering::isDesirableToCommuteWithShift to avoid losing SR*(SHL(X,C1),C2) patterns that can fold to BFE instructions

@foad I gave up on D136081 and instead just tried harder not to lose patterns that the existing BFE match code can handle.
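
As background for the BFE remark, the sketch below shows why a right shift of a left shift is a bitfield extract. It is a standalone illustration, not LLVM code. It models an unsigned bitfield extract as (x >> offset) & mask, which is an assumption about the instruction's behaviour for in-range operands, and the names `bfeU32`, `srlOfShl` and `bfeMatchesShiftPair` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned bitfield extract: 'width' bits of x starting at bit 'offset'.
// Assumes 1 <= width and offset + width <= 32.
uint32_t bfeU32(uint32_t x, unsigned offset, unsigned width) {
  uint32_t mask = width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
  return (x >> offset) & mask;
}

// srl(shl(x, c1), c2) with c1 <= c2 < 32.
uint32_t srlOfShl(uint32_t x, unsigned c1, unsigned c2) {
  return (x << c1) >> c2;
}

// Checks that the shift pair equals an extract of width (32 - c2) at
// offset (c2 - c1) for every valid (c1, c2) and a few sample values.
bool bfeMatchesShiftPair() {
  const uint32_t samples[] = {0u, 1u, 0x12345678u, 0x80000001u, 0xFFFFFFFFu};
  for (uint32_t x : samples)
    for (unsigned c2 = 0; c2 < 32; ++c2)
      for (unsigned c1 = 0; c1 <= c2; ++c1)
        if (srlOfShl(x, c1, c2) != bfeU32(x, c2 - c1, 32 - c2))
          return false;
  return true;
}
```

If the inner shl is commuted away by the generic fold before the right shift is examined, the shift pair disappears and so does the opportunity to match it as a single extract, which is the loss the updated hook tries to avoid.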

Harbormaster completed remote builds in B192993: Diff 468896. · Oct 19 2022, 7:28 AM

minor cleanup (remove unnecessary check for ISD::SHL)

Harbormaster completed remote builds in B193009: Diff 468922. · Oct 19 2022, 9:14 AM

ping? I think everything is covered now

yonghong-song accepted this revision. · Oct 27 2022, 10:48 PM

This revision is now accepted and ready to land. · Oct 27 2022, 10:48 PM

This revision was landed with ongoing or failed builds. · Oct 29 2022, 4:30 AM

Closed by commit rG78739fdb4d84: [DAG] Enable combineShiftOfShiftedLogic folds after type legalization (authored by RKSimon).

This revision was automatically updated to reflect the committed changes.

RKSimon added a commit: rG78739fdb4d84: [DAG] Enable combineShiftOfShiftedLogic folds after type legalization.

RKSimon mentioned this in rGadf3daae1c10: [ARC] Regenerate ldst.ll. · Oct 29 2022, 6:10 AM

AMDGPU parts look fine, thanks, and sorry for the late review.

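Revision Contents

Path (Size)

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (10 lines)
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.h (3 lines)
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp (33 lines)
llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll (19 lines)
llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.global.ll (24 lines)
llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.private.ll (12 lines)
llvm/test/CodeGen/AMDGPU/idot8s.ll (44 lines)
llvm/test/CodeGen/AMDGPU/idot8u.ll (54 lines)
llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll (349 lines)
llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll (24 lines)
llvm/test/CodeGen/BPF/pr57872.ll (177 lines)
llvm/test/CodeGen/Mips/cconv/return-struct.ll (12 lines)
llvm/test/CodeGen/Mips/cconv/vector.ll (494 lines)
llvm/test/CodeGen/Mips/load-store-left-right.ll (70 lines)
llvm/test/CodeGen/Mips/unalignedload.ll (12 lines)
llvm/test/CodeGen/RISCV/bswap-bitreverse.ll (28 lines)
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-unaligned.ll (40 lines)
llvm/test/CodeGen/RISCV/srem-seteq-illegal-types.ll (163 lines)
llvm/test/CodeGen/RISCV/unaligned-load-store.ll (60 lines)
llvm/test/CodeGen/SystemZ/store_nonbytesized_vecs.ll (57 lines)
llvm/test/CodeGen/Thumb/urem-seteq-illegal-types.ll (6 lines)
llvm/test/CodeGen/X86/bool-vector.ll (8 lines)
llvm/test/CodeGen/X86/combine-bitreverse.ll (52 lines)
llvm/test/CodeGen/X86/is_fpclass.ll (8 lines)
llvm/test/CodeGen/X86/vector-sext.ll (16 lines)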

Diff 468922

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

  SDValue DAGCombiner::visitShiftByConstant(SDNode *N) {
  ...
    if (isBitwiseNot(N->getOperand(0)))
      return SDValue();

    // The inner binop must be one-use, since we want to replace it.
    SDValue LHS = N->getOperand(0);
    if (!LHS.hasOneUse() || !TLI.isDesirableToCommuteWithShift(N, Level))
      return SDValue();

-   // TODO: This is limited to early combining because it may reveal regressions
-   // otherwise. But since we just checked a target hook to see if this is
-   // desirable, that should have filtered out cases where this interferes
-   // with some other pattern matching.
-   if (!LegalTypes)
-     if (SDValue R = combineShiftOfShiftedLogic(N, DAG))
-       return R;
+   // Fold shift(bitop(shift(x,c1),y), c2) -> bitop(shift(x,c1+c2),shift(y,c2)).
+   if (SDValue R = combineShiftOfShiftedLogic(N, DAG))
+     return R;

    // We want to pull some binops through shifts, so that we have (and (shift))
    // instead of (shift (and)), likewise for add, or, xor, etc. This sort of
    // thing happens with address calculations, so it's important to canonicalize
    // it.
    switch (LHS.getOpcode()) {
    default:
      return SDValue();
  ...

llvm/lib/Target/AMDGPU/AMDGPUISelLowering.h

  public:
  ...

    SDValue getNegatedExpression(SDValue Op, SelectionDAG &DAG,
                                 bool LegalOperations, bool ForCodeSize,
                                 NegatibleCost &Cost,
                                 unsigned Depth) const override;

    bool isNarrowingProfitable(EVT VT1, EVT VT2) const override;

+   bool isDesirableToCommuteWithShift(const SDNode *N,
+                                      CombineLevel Level) const override;
+
    EVT getTypeForExtReturn(LLVMContext &Context, EVT VT,
                            ISD::NodeType ExtendKind) const override;

    MVT getVectorIdxTy(const DataLayout &) const override;
    bool isSelectSupported(SelectSupportKind) const override;

    bool isFPImmLegal(const APFloat &Imm, EVT VT,
                      bool ForCodeSize) const override;
  ...

llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp

  bool AMDGPUTargetLowering::isNarrowingProfitable(EVT SrcVT, EVT DestVT) const {
  ...
    // limited number of native 64-bit operations. Shrinking an operation to fit
    // in a single 32-bit register should always be helpful. As currently used,
    // this is much less general than the name suggests, and is only used in
    // places trying to reduce the sizes of loads. Shrinking loads to < 32-bits is
    // not profitable, and may actually be harmful.
    return SrcVT.getSizeInBits() > 32 && DestVT.getSizeInBits() == 32;
  }

+ bool AMDGPUTargetLowering::isDesirableToCommuteWithShift(
+     const SDNode* N, CombineLevel Level) const {
+   assert((N->getOpcode() == ISD::SHL || N->getOpcode() == ISD::SRA ||
+           N->getOpcode() == ISD::SRL) &&
+          "Expected shift op");
+   // Always commute pre-type legalization and right shifts.
+   // We're looking for shl(or(x,y),z) patterns.
+   if (Level < CombineLevel::AfterLegalizeTypes ||
+       N->getOpcode() != ISD::SHL || N->getOperand(0).getOpcode() != ISD::OR)
+     return true;
+
+   // If only user is a i32 right-shift, then don't destroy a BFE pattern.
+   if (N->getValueType(0) == MVT::i32 && N->use_size() == 1 &&
+       (N->use_begin()->getOpcode() == ISD::SRA ||
+        N->use_begin()->getOpcode() == ISD::SRL))
+     return false;
+
+   // Don't destroy or(shl(load_zext(),c), load_zext()) patterns.
+   auto IsShiftAndLoad = [](SDValue LHS, SDValue RHS) {
+     if (LHS.getOpcode() != ISD::SHL)
+       return false;
+     auto *RHSLd = dyn_cast<LoadSDNode>(RHS);
+     auto *LHS0 = dyn_cast<LoadSDNode>(LHS.getOperand(0));
+     auto *LHS1 = dyn_cast<ConstantSDNode>(LHS.getOperand(1));
+     return LHS0 && LHS1 && RHSLd && LHS0->getExtensionType() == ISD::ZEXTLOAD &&
+            LHS1->getAPIntValue() == LHS0->getMemoryVT().getScalarSizeInBits() &&
+            RHSLd->getExtensionType() == ISD::ZEXTLOAD;
+   };
+   SDValue LHS = N->getOperand(0).getOperand(0);
+   SDValue RHS = N->getOperand(0).getOperand(1);
+   return !(IsShiftAndLoad(LHS, RHS) || IsShiftAndLoad(RHS, LHS));
+ }
+
  //===---------------------------------------------------------------------===//
  // TargetLowering Callbacks
  //===---------------------------------------------------------------------===//

  CCAssignFn *AMDGPUCallLowering::CCAssignFnForCall(CallingConv::ID CC,
                                                    bool IsVarArg) {
    switch (CC) {
    case CallingConv::AMDGPU_VS:
  ...

llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll

	; SI-NEXT: s_mov_b32 s2, -1			; SI-NEXT: s_mov_b32 s2, -1
	; SI-NEXT: s_waitcnt lgkmcnt(0)			; SI-NEXT: s_waitcnt lgkmcnt(0)
	; SI-NEXT: s_mov_b32 s0, s6			; SI-NEXT: s_mov_b32 s0, s6
	; SI-NEXT: s_mov_b32 s1, s7			; SI-NEXT: s_mov_b32 s1, s7
	; SI-NEXT: s_mov_b32 s6, s2			; SI-NEXT: s_mov_b32 s6, s2
	; SI-NEXT: s_mov_b32 s7, s3			; SI-NEXT: s_mov_b32 s7, s3
	; SI-NEXT: s_waitcnt vmcnt(0)			; SI-NEXT: s_waitcnt vmcnt(0)
	; SI-NEXT: v_lshrrev_b32_e32 v5, 16, v4			; SI-NEXT: v_lshrrev_b32_e32 v5, 16, v4
	; SI-NEXT: v_lshrrev_b32_e32 v6, 24, v4
	; SI-NEXT: v_and_b32_e32 v7, 0xff00, v4
	; SI-NEXT: v_cvt_f32_ubyte3_e32 v3, v4			; SI-NEXT: v_cvt_f32_ubyte3_e32 v3, v4
	; SI-NEXT: v_cvt_f32_ubyte2_e32 v2, v4			; SI-NEXT: v_cvt_f32_ubyte2_e32 v2, v4
	; SI-NEXT: v_cvt_f32_ubyte1_e32 v1, v4			; SI-NEXT: v_cvt_f32_ubyte1_e32 v1, v4
	; SI-NEXT: v_cvt_f32_ubyte0_e32 v0, v4			; SI-NEXT: v_cvt_f32_ubyte0_e32 v0, v4
	; SI-NEXT: v_add_i32_e32 v4, vcc, 9, v4			; SI-NEXT: v_add_i32_e32 v7, vcc, 9, v4
				; SI-NEXT: v_and_b32_e32 v6, 0xff00, v4
	; SI-NEXT: buffer_store_dwordx4 v[0:3], off, s[4:7], 0			; SI-NEXT: buffer_store_dwordx4 v[0:3], off, s[4:7], 0
	; SI-NEXT: s_waitcnt expcnt(0)			; SI-NEXT: s_waitcnt expcnt(0)
	; SI-NEXT: v_and_b32_e32 v0, 0xff, v4			; SI-NEXT: v_and_b32_e32 v0, 0xff, v7
	; SI-NEXT: v_add_i32_e32 v2, vcc, 9, v5			; SI-NEXT: v_add_i32_e32 v1, vcc, 9, v5
	; SI-NEXT: v_lshlrev_b32_e32 v1, 8, v6			; SI-NEXT: v_or_b32_e32 v0, v6, v0
	; SI-NEXT: v_or_b32_e32 v0, v7, v0			; SI-NEXT: v_and_b32_e32 v1, 0xff, v1
	; SI-NEXT: v_and_b32_e32 v2, 0xff, v2			; SI-NEXT: v_and_b32_e32 v4, 0xff000000, v4
	; SI-NEXT: v_add_i32_e32 v0, vcc, 0x900, v0			; SI-NEXT: v_add_i32_e32 v0, vcc, 0x900, v0
	; SI-NEXT: v_or_b32_e32 v1, v1, v2
	; SI-NEXT: v_and_b32_e32 v0, 0xffff, v0
	; SI-NEXT: v_lshlrev_b32_e32 v1, 16, v1			; SI-NEXT: v_lshlrev_b32_e32 v1, 16, v1
				; SI-NEXT: v_and_b32_e32 v0, 0xffff, v0
				; SI-NEXT: v_or_b32_e32 v1, v4, v1
	; SI-NEXT: v_or_b32_e32 v0, v1, v0			; SI-NEXT: v_or_b32_e32 v0, v1, v0
	; SI-NEXT: v_add_i32_e32 v0, vcc, 0x9000000, v0			; SI-NEXT: v_add_i32_e32 v0, vcc, 0x9000000, v0
	; SI-NEXT: buffer_store_dword v0, off, s[0:3], 0			; SI-NEXT: buffer_store_dword v0, off, s[0:3], 0
	; SI-NEXT: s_endpgm			; SI-NEXT: s_endpgm
	;			;
	; VI-LABEL: load_v4i8_to_v4f32_2_uses:			; VI-LABEL: load_v4i8_to_v4f32_2_uses:
	; VI: ; %bb.0:			; VI: ; %bb.0:
	; VI-NEXT: s_load_dwordx2 s[2:3], s[0:1], 0x34			; VI-NEXT: s_load_dwordx2 s[2:3], s[0:1], 0x34

llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.global.ll

; GFX11-NEXT: s_endpgm
ret void		ret void
}		}

; Should produce align 1 dword when legal		; Should produce align 1 dword when legal
define i32 @global_load_2xi16_align1(i16 addrspace(1)* %p) #0 {		define i32 @global_load_2xi16_align1(i16 addrspace(1)* %p) #0 {
; GFX7-ALIGNED-LABEL: global_load_2xi16_align1:		; GFX7-ALIGNED-LABEL: global_load_2xi16_align1:
; GFX7-ALIGNED: ; %bb.0:		; GFX7-ALIGNED: ; %bb.0:
; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)		; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; GFX7-ALIGNED-NEXT: v_add_i32_e32 v2, vcc, 1, v0		; GFX7-ALIGNED-NEXT: v_add_i32_e32 v2, vcc, 2, v0
; GFX7-ALIGNED-NEXT: v_addc_u32_e32 v3, vcc, 0, v1, vcc
; GFX7-ALIGNED-NEXT: flat_load_ubyte v4, v[2:3]
; GFX7-ALIGNED-NEXT: v_add_i32_e32 v2, vcc, 3, v0
; GFX7-ALIGNED-NEXT: v_addc_u32_e32 v3, vcc, 0, v1, vcc		; GFX7-ALIGNED-NEXT: v_addc_u32_e32 v3, vcc, 0, v1, vcc
; GFX7-ALIGNED-NEXT: flat_load_ubyte v5, v[0:1]		; GFX7-ALIGNED-NEXT: v_add_i32_e32 v4, vcc, 1, v0
		; GFX7-ALIGNED-NEXT: v_addc_u32_e32 v5, vcc, 0, v1, vcc
		; GFX7-ALIGNED-NEXT: v_add_i32_e32 v6, vcc, 3, v0
		; GFX7-ALIGNED-NEXT: v_addc_u32_e32 v7, vcc, 0, v1, vcc
		; GFX7-ALIGNED-NEXT: flat_load_ubyte v6, v[6:7]
		; GFX7-ALIGNED-NEXT: flat_load_ubyte v4, v[4:5]
; GFX7-ALIGNED-NEXT: flat_load_ubyte v2, v[2:3]		; GFX7-ALIGNED-NEXT: flat_load_ubyte v2, v[2:3]
; GFX7-ALIGNED-NEXT: v_add_i32_e32 v0, vcc, 2, v0
; GFX7-ALIGNED-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc
; GFX7-ALIGNED-NEXT: flat_load_ubyte v0, v[0:1]		; GFX7-ALIGNED-NEXT: flat_load_ubyte v0, v[0:1]
; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(3)		; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(3)
; GFX7-ALIGNED-NEXT: v_lshlrev_b32_e32 v1, 8, v4		; GFX7-ALIGNED-NEXT: v_lshlrev_b32_e32 v3, 24, v6
; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(2)		; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(2)
; GFX7-ALIGNED-NEXT: v_or_b32_e32 v1, v1, v5		; GFX7-ALIGNED-NEXT: v_lshlrev_b32_e32 v1, 8, v4
; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(1)		; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(1)
; GFX7-ALIGNED-NEXT: v_lshlrev_b32_e32 v2, 8, v2		; GFX7-ALIGNED-NEXT: v_lshlrev_b32_e32 v2, 16, v2
; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(0)		; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(0)
; GFX7-ALIGNED-NEXT: v_or_b32_e32 v0, v2, v0
; GFX7-ALIGNED-NEXT: v_lshlrev_b32_e32 v0, 16, v0
; GFX7-ALIGNED-NEXT: v_or_b32_e32 v0, v1, v0		; GFX7-ALIGNED-NEXT: v_or_b32_e32 v0, v1, v0
		; GFX7-ALIGNED-NEXT: v_or_b32_e32 v1, v3, v2
		; GFX7-ALIGNED-NEXT: v_or_b32_e32 v0, v0, v1
; GFX7-ALIGNED-NEXT: s_setpc_b64 s[30:31]		; GFX7-ALIGNED-NEXT: s_setpc_b64 s[30:31]
;		;
; GFX7-UNALIGNED-LABEL: global_load_2xi16_align1:		; GFX7-UNALIGNED-LABEL: global_load_2xi16_align1:
; GFX7-UNALIGNED: ; %bb.0:		; GFX7-UNALIGNED: ; %bb.0:
; GFX7-UNALIGNED-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)		; GFX7-UNALIGNED-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; GFX7-UNALIGNED-NEXT: flat_load_dword v0, v[0:1]		; GFX7-UNALIGNED-NEXT: flat_load_dword v0, v[0:1]
; GFX7-UNALIGNED-NEXT: s_waitcnt vmcnt(0)		; GFX7-UNALIGNED-NEXT: s_waitcnt vmcnt(0)
; GFX7-UNALIGNED-NEXT: s_setpc_b64 s[30:31]		; GFX7-UNALIGNED-NEXT: s_setpc_b64 s[30:31]

llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.private.ll

	; Should produce align 1 dword when legal			; Should produce align 1 dword when legal
	define i32 @private_load_2xi16_align1(i16 addrspace(5)* %p) #0 {			define i32 @private_load_2xi16_align1(i16 addrspace(5)* %p) #0 {
	; GFX7-ALIGNED-LABEL: private_load_2xi16_align1:			; GFX7-ALIGNED-LABEL: private_load_2xi16_align1:
	; GFX7-ALIGNED: ; %bb.0:			; GFX7-ALIGNED: ; %bb.0:
	; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)			; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	; GFX7-ALIGNED-NEXT: v_add_i32_e32 v1, vcc, 2, v0			; GFX7-ALIGNED-NEXT: v_add_i32_e32 v1, vcc, 2, v0
	; GFX7-ALIGNED-NEXT: v_add_i32_e32 v2, vcc, 1, v0			; GFX7-ALIGNED-NEXT: v_add_i32_e32 v2, vcc, 1, v0
	; GFX7-ALIGNED-NEXT: v_add_i32_e32 v3, vcc, 3, v0			; GFX7-ALIGNED-NEXT: v_add_i32_e32 v3, vcc, 3, v0
	; GFX7-ALIGNED-NEXT: buffer_load_ubyte v2, v2, s[0:3], 0 offen
	; GFX7-ALIGNED-NEXT: buffer_load_ubyte v3, v3, s[0:3], 0 offen			; GFX7-ALIGNED-NEXT: buffer_load_ubyte v3, v3, s[0:3], 0 offen
	; GFX7-ALIGNED-NEXT: buffer_load_ubyte v0, v0, s[0:3], 0 offen			; GFX7-ALIGNED-NEXT: buffer_load_ubyte v2, v2, s[0:3], 0 offen
	; GFX7-ALIGNED-NEXT: buffer_load_ubyte v1, v1, s[0:3], 0 offen			; GFX7-ALIGNED-NEXT: buffer_load_ubyte v1, v1, s[0:3], 0 offen
				; GFX7-ALIGNED-NEXT: buffer_load_ubyte v0, v0, s[0:3], 0 offen
	; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(3)			; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(3)
	; GFX7-ALIGNED-NEXT: v_lshlrev_b32_e32 v2, 8, v2			; GFX7-ALIGNED-NEXT: v_lshlrev_b32_e32 v3, 24, v3
	; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(2)			; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(2)
	; GFX7-ALIGNED-NEXT: v_lshlrev_b32_e32 v3, 8, v3			; GFX7-ALIGNED-NEXT: v_lshlrev_b32_e32 v2, 8, v2
	; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(1)			; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(1)
	; GFX7-ALIGNED-NEXT: v_or_b32_e32 v0, v2, v0			; GFX7-ALIGNED-NEXT: v_lshlrev_b32_e32 v1, 16, v1
	; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(0)			; GFX7-ALIGNED-NEXT: s_waitcnt vmcnt(0)
				; GFX7-ALIGNED-NEXT: v_or_b32_e32 v0, v2, v0
	; GFX7-ALIGNED-NEXT: v_or_b32_e32 v1, v3, v1			; GFX7-ALIGNED-NEXT: v_or_b32_e32 v1, v3, v1
	; GFX7-ALIGNED-NEXT: v_lshlrev_b32_e32 v1, 16, v1
	; GFX7-ALIGNED-NEXT: v_or_b32_e32 v0, v0, v1			; GFX7-ALIGNED-NEXT: v_or_b32_e32 v0, v0, v1
	; GFX7-ALIGNED-NEXT: s_setpc_b64 s[30:31]			; GFX7-ALIGNED-NEXT: s_setpc_b64 s[30:31]
	;			;
	; GFX7-UNALIGNED-LABEL: private_load_2xi16_align1:			; GFX7-UNALIGNED-LABEL: private_load_2xi16_align1:
	; GFX7-UNALIGNED: ; %bb.0:			; GFX7-UNALIGNED: ; %bb.0:
	; GFX7-UNALIGNED-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)			; GFX7-UNALIGNED-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	; GFX7-UNALIGNED-NEXT: buffer_load_dword v0, v0, s[0:3], 0 offen			; GFX7-UNALIGNED-NEXT: buffer_load_dword v0, v0, s[0:3], 0 offen
	; GFX7-UNALIGNED-NEXT: s_waitcnt vmcnt(0)			; GFX7-UNALIGNED-NEXT: s_waitcnt vmcnt(0)

llvm/test/CodeGen/AMDGPU/idot8s.ll

	; GFX7-NEXT: v_mov_b32_e32 v1, 0			; GFX7-NEXT: v_mov_b32_e32 v1, 0
	; GFX7-NEXT: buffer_load_dword v2, v[0:1], s[8:11], 0 addr64			; GFX7-NEXT: buffer_load_dword v2, v[0:1], s[8:11], 0 addr64
	; GFX7-NEXT: s_mov_b64 s[8:9], s[6:7]			; GFX7-NEXT: s_mov_b64 s[8:9], s[6:7]
	; GFX7-NEXT: buffer_load_dword v0, v[0:1], s[8:11], 0 addr64			; GFX7-NEXT: buffer_load_dword v0, v[0:1], s[8:11], 0 addr64
	; GFX7-NEXT: s_mov_b32 s2, -1			; GFX7-NEXT: s_mov_b32 s2, -1
	; GFX7-NEXT: buffer_load_ubyte v1, off, s[0:3], 0			; GFX7-NEXT: buffer_load_ubyte v1, off, s[0:3], 0
	; GFX7-NEXT: s_addc_u32 s13, s13, 0			; GFX7-NEXT: s_addc_u32 s13, s13, 0
	; GFX7-NEXT: s_waitcnt vmcnt(2)			; GFX7-NEXT: s_waitcnt vmcnt(2)
	; GFX7-NEXT: v_bfe_i32 v7, v2, 0, 4			; GFX7-NEXT: v_bfe_i32 v8, v2, 0, 4
	; GFX7-NEXT: v_bfe_i32 v3, v2, 24, 4			; GFX7-NEXT: v_ashrrev_i32_e32 v3, 28, v2
	; GFX7-NEXT: s_waitcnt vmcnt(1)			; GFX7-NEXT: s_waitcnt vmcnt(1)
	; GFX7-NEXT: v_bfe_i32 v14, v0, 0, 4			; GFX7-NEXT: v_bfe_i32 v15, v0, 0, 4
	; GFX7-NEXT: v_bfe_i32 v4, v2, 20, 4			; GFX7-NEXT: v_bfe_i32 v4, v2, 24, 4
	; GFX7-NEXT: v_bfe_i32 v5, v2, 16, 4			; GFX7-NEXT: v_bfe_i32 v5, v2, 20, 4
	; GFX7-NEXT: v_bfe_i32 v6, v2, 8, 4			; GFX7-NEXT: v_bfe_i32 v6, v2, 16, 4
	; GFX7-NEXT: v_ashrrev_i32_e32 v8, 28, v2			; GFX7-NEXT: v_bfe_i32 v7, v2, 8, 4
	; GFX7-NEXT: v_bfe_i32 v9, v2, 12, 4			; GFX7-NEXT: v_bfe_i32 v9, v2, 12, 4
	; GFX7-NEXT: v_bfe_i32 v2, v2, 4, 4			; GFX7-NEXT: v_bfe_i32 v2, v2, 4, 4
	; GFX7-NEXT: v_and_b32_e32 v7, 0xff, v7			; GFX7-NEXT: v_and_b32_e32 v8, 0xff, v8
	; GFX7-NEXT: v_bfe_i32 v10, v0, 24, 4			; GFX7-NEXT: v_ashrrev_i32_e32 v10, 28, v0
	; GFX7-NEXT: v_bfe_i32 v11, v0, 20, 4			; GFX7-NEXT: v_bfe_i32 v11, v0, 24, 4
	; GFX7-NEXT: v_bfe_i32 v12, v0, 16, 4			; GFX7-NEXT: v_bfe_i32 v12, v0, 20, 4
	; GFX7-NEXT: v_bfe_i32 v13, v0, 8, 4			; GFX7-NEXT: v_bfe_i32 v13, v0, 16, 4
	; GFX7-NEXT: v_ashrrev_i32_e32 v15, 28, v0			; GFX7-NEXT: v_bfe_i32 v14, v0, 8, 4
	; GFX7-NEXT: v_bfe_i32 v16, v0, 12, 4			; GFX7-NEXT: v_bfe_i32 v16, v0, 12, 4
	; GFX7-NEXT: v_bfe_i32 v0, v0, 4, 4			; GFX7-NEXT: v_bfe_i32 v0, v0, 4, 4
	; GFX7-NEXT: v_and_b32_e32 v14, 0xff, v14			; GFX7-NEXT: v_and_b32_e32 v15, 0xff, v15
	; GFX7-NEXT: v_and_b32_e32 v2, 0xff, v2			; GFX7-NEXT: v_and_b32_e32 v2, 0xff, v2
	; GFX7-NEXT: v_and_b32_e32 v0, 0xff, v0			; GFX7-NEXT: v_and_b32_e32 v0, 0xff, v0
	; GFX7-NEXT: s_waitcnt vmcnt(0)			; GFX7-NEXT: s_waitcnt vmcnt(0)
	; GFX7-NEXT: v_mad_u32_u24 v1, v7, v14, v1			; GFX7-NEXT: v_mad_u32_u24 v1, v8, v15, v1
	; GFX7-NEXT: v_and_b32_e32 v6, 0xff, v6			; GFX7-NEXT: v_and_b32_e32 v7, 0xff, v7
	; GFX7-NEXT: v_lshlrev_b32_e32 v9, 24, v9			; GFX7-NEXT: v_lshlrev_b32_e32 v9, 24, v9
	; GFX7-NEXT: v_and_b32_e32 v13, 0xff, v13			; GFX7-NEXT: v_and_b32_e32 v14, 0xff, v14
	; GFX7-NEXT: v_lshlrev_b32_e32 v16, 24, v16			; GFX7-NEXT: v_lshlrev_b32_e32 v16, 24, v16
	; GFX7-NEXT: v_mad_u32_u24 v0, v2, v0, v1			; GFX7-NEXT: v_mad_u32_u24 v0, v2, v0, v1
	; GFX7-NEXT: v_alignbit_b32 v9, 0, v9, 24			; GFX7-NEXT: v_alignbit_b32 v9, 0, v9, 24
	; GFX7-NEXT: v_alignbit_b32 v16, 0, v16, 24			; GFX7-NEXT: v_alignbit_b32 v16, 0, v16, 24
	; GFX7-NEXT: v_mad_u32_u24 v0, v6, v13, v0			; GFX7-NEXT: v_mad_u32_u24 v0, v7, v14, v0
				; GFX7-NEXT: v_and_b32_e32 v6, 0xff, v6
				; GFX7-NEXT: v_and_b32_e32 v13, 0xff, v13
				; GFX7-NEXT: v_mad_u32_u24 v0, v9, v16, v0
	; GFX7-NEXT: v_and_b32_e32 v5, 0xff, v5			; GFX7-NEXT: v_and_b32_e32 v5, 0xff, v5
	; GFX7-NEXT: v_and_b32_e32 v12, 0xff, v12			; GFX7-NEXT: v_and_b32_e32 v12, 0xff, v12
	; GFX7-NEXT: v_mad_u32_u24 v0, v9, v16, v0			; GFX7-NEXT: v_mad_u32_u24 v0, v6, v13, v0
	; GFX7-NEXT: v_and_b32_e32 v4, 0xff, v4			; GFX7-NEXT: v_and_b32_e32 v4, 0xff, v4
	; GFX7-NEXT: v_and_b32_e32 v11, 0xff, v11			; GFX7-NEXT: v_and_b32_e32 v11, 0xff, v11
	; GFX7-NEXT: v_mad_u32_u24 v0, v5, v12, v0			; GFX7-NEXT: v_mad_u32_u24 v0, v5, v12, v0
	; GFX7-NEXT: v_and_b32_e32 v3, 0xff, v3			; GFX7-NEXT: v_and_b32_e32 v3, 0xff, v3
	; GFX7-NEXT: v_and_b32_e32 v10, 0xff, v10			; GFX7-NEXT: v_and_b32_e32 v10, 0xff, v10
	; GFX7-NEXT: v_mad_u32_u24 v0, v4, v11, v0			; GFX7-NEXT: v_mad_u32_u24 v0, v4, v11, v0
	; GFX7-NEXT: v_and_b32_e32 v8, 0xff, v8
	; GFX7-NEXT: v_and_b32_e32 v15, 0xff, v15
	; GFX7-NEXT: v_mad_u32_u24 v0, v3, v10, v0			; GFX7-NEXT: v_mad_u32_u24 v0, v3, v10, v0
	; GFX7-NEXT: v_mad_u32_u24 v0, v8, v15, v0
	; GFX7-NEXT: buffer_store_byte v0, off, s[0:3], 0			; GFX7-NEXT: buffer_store_byte v0, off, s[0:3], 0
	; GFX7-NEXT: s_endpgm			; GFX7-NEXT: s_endpgm
	;			;
	; GFX8-LABEL: idot8_acc8_vecMul:			; GFX8-LABEL: idot8_acc8_vecMul:
	; GFX8: ; %bb.0: ; %entry			; GFX8: ; %bb.0: ; %entry
	; GFX8-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24			; GFX8-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24
	; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x34			; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x34
	; GFX8-NEXT: v_lshlrev_b32_e32 v2, 2, v0			; GFX8-NEXT: v_lshlrev_b32_e32 v2, 2, v0

llvm/test/CodeGen/AMDGPU/idot8u.ll

	; GFX7-NEXT: v_mov_b32_e32 v1, 0			; GFX7-NEXT: v_mov_b32_e32 v1, 0
	; GFX7-NEXT: buffer_load_dword v2, v[0:1], s[8:11], 0 addr64			; GFX7-NEXT: buffer_load_dword v2, v[0:1], s[8:11], 0 addr64
	; GFX7-NEXT: s_mov_b64 s[8:9], s[6:7]			; GFX7-NEXT: s_mov_b64 s[8:9], s[6:7]
	; GFX7-NEXT: buffer_load_dword v0, v[0:1], s[8:11], 0 addr64			; GFX7-NEXT: buffer_load_dword v0, v[0:1], s[8:11], 0 addr64
	; GFX7-NEXT: s_mov_b32 s2, -1			; GFX7-NEXT: s_mov_b32 s2, -1
	; GFX7-NEXT: buffer_load_ubyte v1, off, s[0:3], 0			; GFX7-NEXT: buffer_load_ubyte v1, off, s[0:3], 0
	; GFX7-NEXT: s_addc_u32 s13, s13, 0			; GFX7-NEXT: s_addc_u32 s13, s13, 0
	; GFX7-NEXT: s_waitcnt vmcnt(2)			; GFX7-NEXT: s_waitcnt vmcnt(2)
	; GFX7-NEXT: v_and_b32_e32 v7, 15, v2			; GFX7-NEXT: v_and_b32_e32 v8, 15, v2
	; GFX7-NEXT: v_bfe_u32 v6, v2, 4, 4			; GFX7-NEXT: v_bfe_u32 v3, v2, 24, 4
	; GFX7-NEXT: s_waitcnt vmcnt(1)			; GFX7-NEXT: s_waitcnt vmcnt(1)
	; GFX7-NEXT: v_and_b32_e32 v14, 15, v0			; GFX7-NEXT: v_and_b32_e32 v15, 15, v0
	; GFX7-NEXT: v_bfe_u32 v8, v2, 12, 4			; GFX7-NEXT: v_bfe_u32 v4, v2, 20, 4
	; GFX7-NEXT: v_bfe_u32 v13, v0, 4, 4			; GFX7-NEXT: v_bfe_u32 v5, v2, 16, 4
	; GFX7-NEXT: v_bfe_u32 v15, v0, 12, 4			; GFX7-NEXT: v_bfe_u32 v6, v2, 8, 4
				; GFX7-NEXT: v_bfe_u32 v7, v2, 4, 4
				; GFX7-NEXT: v_lshrrev_b32_e32 v9, 28, v2
				; GFX7-NEXT: v_lshlrev_b32_e32 v2, 12, v2
				; GFX7-NEXT: v_bfe_u32 v10, v0, 24, 4
				; GFX7-NEXT: v_bfe_u32 v11, v0, 20, 4
				; GFX7-NEXT: v_bfe_u32 v12, v0, 16, 4
				; GFX7-NEXT: v_bfe_u32 v13, v0, 8, 4
				; GFX7-NEXT: v_bfe_u32 v14, v0, 4, 4
				; GFX7-NEXT: v_lshrrev_b32_e32 v16, 28, v0
				; GFX7-NEXT: v_lshlrev_b32_e32 v0, 12, v0
	; GFX7-NEXT: s_waitcnt vmcnt(0)			; GFX7-NEXT: s_waitcnt vmcnt(0)
				; GFX7-NEXT: v_mad_u32_u24 v1, v8, v15, v1
				; GFX7-NEXT: v_and_b32_e32 v2, 0xf000000, v2
				; GFX7-NEXT: v_and_b32_e32 v0, 0xf000000, v0
	; GFX7-NEXT: v_mad_u32_u24 v1, v7, v14, v1			; GFX7-NEXT: v_mad_u32_u24 v1, v7, v14, v1
	; GFX7-NEXT: v_bfe_u32 v5, v2, 8, 4			; GFX7-NEXT: v_alignbit_b32 v2, s10, v2, 24
	; GFX7-NEXT: v_bfe_u32 v12, v0, 8, 4			; GFX7-NEXT: v_alignbit_b32 v0, 0, v0, 24
	; GFX7-NEXT: v_lshlrev_b32_e32 v8, 24, v8
	; GFX7-NEXT: v_lshlrev_b32_e32 v15, 24, v15
	; GFX7-NEXT: v_mad_u32_u24 v1, v6, v13, v1			; GFX7-NEXT: v_mad_u32_u24 v1, v6, v13, v1
	; GFX7-NEXT: v_alignbit_b32 v8, 0, v8, 24
	; GFX7-NEXT: v_alignbit_b32 v14, 0, v15, 24
	; GFX7-NEXT: v_mad_u32_u24 v1, v5, v12, v1
	; GFX7-NEXT: v_bfe_u32 v4, v2, 16, 4
	; GFX7-NEXT: v_lshrrev_b32_e32 v9, 28, v2
	; GFX7-NEXT: v_bfe_u32 v11, v0, 16, 4
	; GFX7-NEXT: v_lshrrev_b32_e32 v16, 28, v0
	; GFX7-NEXT: v_mad_u32_u24 v1, v8, v14, v1
	; GFX7-NEXT: v_bfe_u32 v3, v2, 20, 4
	; GFX7-NEXT: v_bfe_u32 v10, v0, 20, 4
	; GFX7-NEXT: v_alignbit_b32 v2, v9, v2, 24
	; GFX7-NEXT: v_alignbit_b32 v0, v16, v0, 24
	; GFX7-NEXT: v_mad_u32_u24 v1, v4, v11, v1
	; GFX7-NEXT: v_lshrrev_b32_e32 v9, 8, v2
	; GFX7-NEXT: v_and_b32_e32 v2, 15, v2
	; GFX7-NEXT: v_lshrrev_b32_e32 v7, 8, v0
	; GFX7-NEXT: v_and_b32_e32 v0, 15, v0
	; GFX7-NEXT: v_mad_u32_u24 v1, v3, v10, v1
	; GFX7-NEXT: v_mad_u32_u24 v0, v2, v0, v1			; GFX7-NEXT: v_mad_u32_u24 v0, v2, v0, v1
	; GFX7-NEXT: v_mad_u32_u24 v0, v9, v7, v0			; GFX7-NEXT: v_mad_u32_u24 v0, v5, v12, v0
				; GFX7-NEXT: v_mad_u32_u24 v0, v4, v11, v0
				; GFX7-NEXT: v_mad_u32_u24 v0, v3, v10, v0
				; GFX7-NEXT: v_mad_u32_u24 v0, v9, v16, v0
	; GFX7-NEXT: buffer_store_byte v0, off, s[0:3], 0			; GFX7-NEXT: buffer_store_byte v0, off, s[0:3], 0
	; GFX7-NEXT: s_endpgm			; GFX7-NEXT: s_endpgm
	;			;
	; GFX8-LABEL: udot8_acc8_vecMul:			; GFX8-LABEL: udot8_acc8_vecMul:
	; GFX8: ; %bb.0: ; %entry			; GFX8: ; %bb.0: ; %entry
	; GFX8-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24			; GFX8-NEXT: s_load_dwordx4 s[4:7], s[0:1], 0x24
	; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x34			; GFX8-NEXT: s_load_dwordx2 s[0:1], s[0:1], 0x34
	; GFX8-NEXT: v_lshlrev_b32_e32 v2, 2, v0			; GFX8-NEXT: v_lshlrev_b32_e32 v2, 2, v0

llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll

	; GCN-NEXT: s_lshr_b32 s37, s7, 17			; GCN-NEXT: s_lshr_b32 s37, s7, 17
	; GCN-NEXT: s_lshr_b32 s38, s7, 18			; GCN-NEXT: s_lshr_b32 s38, s7, 18
	; GCN-NEXT: s_lshr_b32 s39, s7, 19			; GCN-NEXT: s_lshr_b32 s39, s7, 19
	; GCN-NEXT: s_lshr_b32 s40, s7, 20			; GCN-NEXT: s_lshr_b32 s40, s7, 20
	; GCN-NEXT: s_lshr_b32 s41, s7, 21			; GCN-NEXT: s_lshr_b32 s41, s7, 21
	; GCN-NEXT: s_lshr_b32 s42, s7, 22			; GCN-NEXT: s_lshr_b32 s42, s7, 22
	; GCN-NEXT: s_lshr_b32 s43, s7, 23			; GCN-NEXT: s_lshr_b32 s43, s7, 23
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x77			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x77
	; GCN-NEXT: v_mov_b32_e32 v16, s43			; GCN-NEXT: v_mov_b32_e32 v14, s43
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x76			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x76
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v14, 1, v14, vcc
	; GCN-NEXT: v_mov_b32_e32 v17, s42			; GCN-NEXT: v_mov_b32_e32 v17, s42
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16
	; GCN-NEXT: v_and_b32_e32 v17, 1, v17			; GCN-NEXT: v_and_b32_e32 v17, 1, v17
				; GCN-NEXT: v_lshlrev_b16_e32 v14, 3, v14
				; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x75			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x75
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v14, v14, v17
	; GCN-NEXT: v_mov_b32_e32 v17, s41			; GCN-NEXT: v_mov_b32_e32 v17, s41
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x74			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x74
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_mov_b32_e32 v18, s40			; GCN-NEXT: v_mov_b32_e32 v18, s40
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17			; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v18, v17
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 2, v16
	; GCN-NEXT: v_and_b32_e32 v17, 3, v17			; GCN-NEXT: v_and_b32_e32 v17, 3, v17
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x73			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x73
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v14, v17, v14
	; GCN-NEXT: v_mov_b32_e32 v17, s39			; GCN-NEXT: v_mov_b32_e32 v17, s39
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x72			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x72
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_mov_b32_e32 v18, s38			; GCN-NEXT: v_mov_b32_e32 v18, s38
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
				; GCN-NEXT: v_lshlrev_b16_e32 v17, 3, v17
				; GCN-NEXT: v_lshlrev_b16_e32 v18, 2, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x71			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x71
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v17, v18
	; GCN-NEXT: v_mov_b32_e32 v18, s37			; GCN-NEXT: v_mov_b32_e32 v18, s37
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x70			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x70
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_mov_b32_e32 v19, s36			; GCN-NEXT: v_mov_b32_e32 v19, s36
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18			; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18
	; GCN-NEXT: v_and_b32_e32 v19, 1, v19			; GCN-NEXT: v_and_b32_e32 v19, 1, v19
	; GCN-NEXT: v_or_b32_e32 v18, v19, v18			; GCN-NEXT: v_or_b32_e32 v18, v19, v18
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: v_and_b32_e32 v18, 3, v18			; GCN-NEXT: v_and_b32_e32 v18, 3, v18
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v18, v17
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 4, v16			; GCN-NEXT: v_lshlrev_b16_e32 v14, 4, v14
	; GCN-NEXT: v_and_b32_e32 v17, 15, v17			; GCN-NEXT: v_and_b32_e32 v17, 15, v17
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x7f			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x7f
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v14, v17, v14
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 7, s35			; GCN-NEXT: v_lshrrev_b16_e64 v17, 7, s35
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x7e			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x7e
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 6, s35			; GCN-NEXT: v_lshrrev_b16_e64 v18, 6, s35
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
				; GCN-NEXT: v_lshlrev_b16_e32 v17, 3, v17
				; GCN-NEXT: v_lshlrev_b16_e32 v18, 2, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x7d			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x7d
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v17, v18
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 5, s35			; GCN-NEXT: v_lshrrev_b16_e64 v18, 5, s35
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x7c			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x7c
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 4, s35			; GCN-NEXT: v_lshrrev_b16_e64 v19, 4, s35
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18			; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18
	; GCN-NEXT: v_and_b32_e32 v19, 1, v19			; GCN-NEXT: v_and_b32_e32 v19, 1, v19
	; GCN-NEXT: v_or_b32_e32 v18, v19, v18			; GCN-NEXT: v_or_b32_e32 v18, v19, v18
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: v_and_b32_e32 v18, 3, v18			; GCN-NEXT: v_and_b32_e32 v18, 3, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x7b			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x7b
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v18, v17
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 3, s35			; GCN-NEXT: v_lshrrev_b16_e64 v18, 3, s35
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x7a			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x7a
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 2, s35			; GCN-NEXT: v_lshrrev_b16_e64 v19, 2, s35
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x78
	; GCN-NEXT: v_mov_b32_e32 v14, s35
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18
	; GCN-NEXT: v_and_b32_e32 v19, 1, v19			; GCN-NEXT: v_and_b32_e32 v19, 1, v19
				; GCN-NEXT: s_cmpk_lg_i32 s0, 0x78
				; GCN-NEXT: v_mov_b32_e32 v12, s35
				; GCN-NEXT: v_lshlrev_b16_e32 v18, 3, v18
				; GCN-NEXT: v_lshlrev_b16_e32 v19, 2, v19
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x79			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x79
	; GCN-NEXT: v_or_b32_e32 v18, v19, v18			; GCN-NEXT: v_or_b32_e32 v18, v18, v19
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 1, s35			; GCN-NEXT: v_lshrrev_b16_e64 v19, 1, s35
	; GCN-NEXT: v_cndmask_b32_e32 v14, 1, v14, vcc			; GCN-NEXT: v_cndmask_b32_e32 v12, 1, v12, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: v_and_b32_e32 v14, 1, v14			; GCN-NEXT: v_and_b32_e32 v12, 1, v12
	; GCN-NEXT: v_lshlrev_b16_e32 v19, 1, v19			; GCN-NEXT: v_lshlrev_b16_e32 v19, 1, v19
	; GCN-NEXT: v_or_b32_e32 v14, v14, v19			; GCN-NEXT: v_or_b32_e32 v12, v12, v19
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 2, v18			; GCN-NEXT: v_and_b32_e32 v12, 3, v12
	; GCN-NEXT: v_and_b32_e32 v14, 3, v14			; GCN-NEXT: v_or_b32_e32 v18, v12, v18
	; GCN-NEXT: v_or_b32_e32 v14, v14, v18			; GCN-NEXT: v_mov_b32_e32 v12, 15
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 4, v17			; GCN-NEXT: v_lshlrev_b16_e32 v17, 12, v17
	; GCN-NEXT: v_and_b32_e32 v14, 15, v14			; GCN-NEXT: v_and_b32_sdwa v18, v18, v12 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
	; GCN-NEXT: v_or_b32_sdwa v14, v14, v17 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD			; GCN-NEXT: v_or_b32_e32 v17, v17, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x6f			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x6f
	; GCN-NEXT: v_or_b32_sdwa v14, v16, v14 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD			; GCN-NEXT: v_or_b32_sdwa v14, v14, v17 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 15, s7			; GCN-NEXT: v_lshrrev_b16_e64 v17, 15, s7
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x6e			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x6e
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 14, s7			; GCN-NEXT: v_lshrrev_b16_e64 v18, 14, s7
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16
	; GCN-NEXT: v_and_b32_e32 v17, 1, v17
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x6d
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 13, s7
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x6c
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 12, s7
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 2, v16
	; GCN-NEXT: v_and_b32_e32 v17, 3, v17
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x6b
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 11, s7
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x6a
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 10, s7
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x69			; GCN-NEXT: v_lshlrev_b16_e32 v17, 3, v17
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_lshlrev_b16_e32 v18, 2, v18
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 9, s7			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x6d
				; GCN-NEXT: v_or_b32_e32 v17, v17, v18
				; GCN-NEXT: v_lshrrev_b16_e64 v18, 13, s7
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x68			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x6c
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 8, s7			; GCN-NEXT: v_lshrrev_b16_e64 v19, 12, s7
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18			; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18
	; GCN-NEXT: v_and_b32_e32 v19, 1, v19			; GCN-NEXT: v_and_b32_e32 v19, 1, v19
	; GCN-NEXT: v_or_b32_e32 v18, v19, v18			; GCN-NEXT: v_or_b32_e32 v18, v19, v18
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: v_and_b32_e32 v18, 3, v18			; GCN-NEXT: v_and_b32_e32 v18, 3, v18
				; GCN-NEXT: s_cmpk_lg_i32 s0, 0x6b
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v18, v17
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 4, v16			; GCN-NEXT: v_lshrrev_b16_e64 v18, 11, s7
	; GCN-NEXT: v_and_b32_e32 v17, 15, v17			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
				; GCN-NEXT: s_cmpk_lg_i32 s0, 0x6a
				; GCN-NEXT: v_lshrrev_b16_e64 v19, 10, s7
				; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
				; GCN-NEXT: s_cselect_b64 vcc, -1, 0
				; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
				; GCN-NEXT: v_and_b32_e32 v19, 1, v19
				; GCN-NEXT: v_lshlrev_b16_e32 v18, 3, v18
				; GCN-NEXT: v_lshlrev_b16_e32 v19, 2, v19
				; GCN-NEXT: s_cmpk_lg_i32 s0, 0x69
				; GCN-NEXT: v_or_b32_e32 v18, v18, v19
				; GCN-NEXT: v_lshrrev_b16_e64 v19, 9, s7
				; GCN-NEXT: s_cselect_b64 vcc, -1, 0
				; GCN-NEXT: s_cmpk_lg_i32 s0, 0x68
				; GCN-NEXT: v_lshrrev_b16_e64 v16, 8, s7
				; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
				; GCN-NEXT: s_cselect_b64 vcc, -1, 0
				; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
				; GCN-NEXT: v_lshlrev_b16_e32 v19, 1, v19
				; GCN-NEXT: v_and_b32_e32 v16, 1, v16
				; GCN-NEXT: v_or_b32_e32 v16, v16, v19
				; GCN-NEXT: v_and_b32_e32 v16, 3, v16
				; GCN-NEXT: v_or_b32_e32 v16, v16, v18
				; GCN-NEXT: v_lshlrev_b16_e32 v17, 12, v17
				; GCN-NEXT: v_and_b32_sdwa v16, v16, v12 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x67			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x67
	; GCN-NEXT: v_or_b32_sdwa v16, v17, v16 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD			; GCN-NEXT: v_or_b32_e32 v16, v17, v16
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 7, s7			; GCN-NEXT: v_lshrrev_b16_e64 v17, 7, s7
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x66			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x66
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 6, s7			; GCN-NEXT: v_lshrrev_b16_e64 v18, 6, s7
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
				; GCN-NEXT: v_lshlrev_b16_e32 v17, 3, v17
				; GCN-NEXT: v_lshlrev_b16_e32 v18, 2, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x65			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x65
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v17, v18
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 5, s7			; GCN-NEXT: v_lshrrev_b16_e64 v18, 5, s7
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x64			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x64
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 4, s7			; GCN-NEXT: v_lshrrev_b16_e64 v19, 4, s7
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18			; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18
	; GCN-NEXT: v_and_b32_e32 v19, 1, v19			; GCN-NEXT: v_and_b32_e32 v19, 1, v19
	; GCN-NEXT: v_or_b32_e32 v18, v19, v18			; GCN-NEXT: v_or_b32_e32 v18, v19, v18
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: v_and_b32_e32 v18, 3, v18			; GCN-NEXT: v_and_b32_e32 v18, 3, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x63			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x63
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v18, v17
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 3, s7			; GCN-NEXT: v_lshrrev_b16_e64 v18, 3, s7
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x62			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x62
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 2, s7			; GCN-NEXT: v_lshrrev_b16_e64 v19, 2, s7
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18
	; GCN-NEXT: v_and_b32_e32 v19, 1, v19			; GCN-NEXT: v_and_b32_e32 v19, 1, v19
				; GCN-NEXT: v_lshlrev_b16_e32 v18, 3, v18
				; GCN-NEXT: v_lshlrev_b16_e32 v19, 2, v19
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x61			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x61
	; GCN-NEXT: v_or_b32_e32 v18, v19, v18			; GCN-NEXT: v_or_b32_e32 v18, v18, v19
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 1, s7			; GCN-NEXT: v_lshrrev_b16_e64 v19, 1, s7
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x60			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x60
	; GCN-NEXT: v_mov_b32_e32 v15, s7			; GCN-NEXT: v_mov_b32_e32 v15, s7
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v19, 1, v19			; GCN-NEXT: v_lshlrev_b16_e32 v19, 1, v19
	; GCN-NEXT: v_and_b32_e32 v15, 1, v15			; GCN-NEXT: v_and_b32_e32 v15, 1, v15
	; GCN-NEXT: v_or_b32_e32 v15, v15, v19			; GCN-NEXT: v_or_b32_e32 v15, v15, v19
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 2, v18
	; GCN-NEXT: v_and_b32_e32 v15, 3, v15			; GCN-NEXT: v_and_b32_e32 v15, 3, v15
	; GCN-NEXT: v_or_b32_e32 v15, v15, v18			; GCN-NEXT: v_or_b32_e32 v15, v15, v18
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 4, v17			; GCN-NEXT: v_lshlrev_b16_e32 v17, 4, v17
	; GCN-NEXT: v_and_b32_e32 v15, 15, v15			; GCN-NEXT: v_and_b32_e32 v15, 15, v15
	; GCN-NEXT: v_or_b32_e32 v15, v15, v17			; GCN-NEXT: v_or_b32_e32 v15, v15, v17
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x57			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x57
	; GCN-NEXT: v_or_b32_sdwa v15, v15, v16 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD			; GCN-NEXT: v_or_b32_sdwa v15, v15, v16 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
	; GCN-NEXT: v_mov_b32_e32 v16, s34			; GCN-NEXT: v_mov_b32_e32 v16, s34
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x56			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x56
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_mov_b32_e32 v17, s33			; GCN-NEXT: v_mov_b32_e32 v17, s33
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16
	; GCN-NEXT: v_and_b32_e32 v17, 1, v17			; GCN-NEXT: v_and_b32_e32 v17, 1, v17
				; GCN-NEXT: v_lshlrev_b16_e32 v16, 3, v16
				; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x55			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x55
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v16, v16, v17
	; GCN-NEXT: v_mov_b32_e32 v17, s31			; GCN-NEXT: v_mov_b32_e32 v17, s31
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x54			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x54
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_mov_b32_e32 v18, s30			; GCN-NEXT: v_mov_b32_e32 v18, s30
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17			; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v18, v17
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 2, v16
	; GCN-NEXT: v_and_b32_e32 v17, 3, v17			; GCN-NEXT: v_and_b32_e32 v17, 3, v17
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x53			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x53
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v16, v17, v16
	; GCN-NEXT: v_mov_b32_e32 v17, s29			; GCN-NEXT: v_mov_b32_e32 v17, s29
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x52			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x52
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_mov_b32_e32 v18, s28			; GCN-NEXT: v_mov_b32_e32 v18, s28
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
				; GCN-NEXT: v_lshlrev_b16_e32 v17, 3, v17
				; GCN-NEXT: v_lshlrev_b16_e32 v18, 2, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x51			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x51
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v17, v18
	; GCN-NEXT: v_mov_b32_e32 v18, s27			; GCN-NEXT: v_mov_b32_e32 v18, s27
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x50			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x50
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_mov_b32_e32 v19, s26			; GCN-NEXT: v_mov_b32_e32 v19, s26
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18			; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18
	; GCN-NEXT: v_and_b32_e32 v19, 1, v19			; GCN-NEXT: v_and_b32_e32 v19, 1, v19
	; GCN-NEXT: v_or_b32_e32 v18, v19, v18			; GCN-NEXT: v_or_b32_e32 v18, v19, v18
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: v_and_b32_e32 v18, 3, v18			; GCN-NEXT: v_and_b32_e32 v18, 3, v18
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v18, v17
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 4, v16			; GCN-NEXT: v_lshlrev_b16_e32 v16, 4, v16
	; GCN-NEXT: v_and_b32_e32 v17, 15, v17			; GCN-NEXT: v_and_b32_e32 v17, 15, v17
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x5f			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x5f
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v16, v17, v16
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 7, s25			; GCN-NEXT: v_lshrrev_b16_e64 v17, 7, s25
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x5e			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x5e
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 6, s25			; GCN-NEXT: v_lshrrev_b16_e64 v18, 6, s25
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
				; GCN-NEXT: v_lshlrev_b16_e32 v17, 3, v17
				; GCN-NEXT: v_lshlrev_b16_e32 v18, 2, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x5d			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x5d
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v17, v18
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 5, s25			; GCN-NEXT: v_lshrrev_b16_e64 v18, 5, s25
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x5c			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x5c
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 4, s25			; GCN-NEXT: v_lshrrev_b16_e64 v19, 4, s25
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18			; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18
	; GCN-NEXT: v_and_b32_e32 v19, 1, v19			; GCN-NEXT: v_and_b32_e32 v19, 1, v19
	; GCN-NEXT: v_or_b32_e32 v18, v19, v18			; GCN-NEXT: v_or_b32_e32 v18, v19, v18
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: v_and_b32_e32 v18, 3, v18			; GCN-NEXT: v_and_b32_e32 v18, 3, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x5b			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x5b
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v18, v17
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 3, s25			; GCN-NEXT: v_lshrrev_b16_e64 v18, 3, s25
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x5a			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x5a
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 2, s25			; GCN-NEXT: v_lshrrev_b16_e64 v19, 2, s25
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
				; GCN-NEXT: v_and_b32_e32 v19, 1, v19
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x58			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x58
	; GCN-NEXT: v_mov_b32_e32 v3, s25			; GCN-NEXT: v_mov_b32_e32 v3, s25
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18			; GCN-NEXT: v_lshlrev_b16_e32 v18, 3, v18
	; GCN-NEXT: v_and_b32_e32 v19, 1, v19			; GCN-NEXT: v_lshlrev_b16_e32 v19, 2, v19
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x59			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x59
	; GCN-NEXT: v_or_b32_e32 v18, v19, v18			; GCN-NEXT: v_or_b32_e32 v18, v18, v19
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 1, s25			; GCN-NEXT: v_lshrrev_b16_e64 v19, 1, s25
	; GCN-NEXT: v_cndmask_b32_e32 v3, 1, v3, vcc			; GCN-NEXT: v_cndmask_b32_e32 v3, 1, v3, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: v_and_b32_e32 v3, 1, v3			; GCN-NEXT: v_and_b32_e32 v3, 1, v3
	; GCN-NEXT: v_lshlrev_b16_e32 v19, 1, v19			; GCN-NEXT: v_lshlrev_b16_e32 v19, 1, v19
	; GCN-NEXT: v_or_b32_e32 v3, v3, v19			; GCN-NEXT: v_or_b32_e32 v3, v3, v19
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 2, v18
	; GCN-NEXT: v_and_b32_e32 v3, 3, v3			; GCN-NEXT: v_and_b32_e32 v3, 3, v3
	; GCN-NEXT: v_or_b32_e32 v3, v3, v18			; GCN-NEXT: v_or_b32_e32 v3, v3, v18
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 4, v17			; GCN-NEXT: v_lshlrev_b16_e32 v17, 12, v17
	; GCN-NEXT: v_and_b32_e32 v3, 15, v3			; GCN-NEXT: v_and_b32_sdwa v3, v3, v12 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
	; GCN-NEXT: v_or_b32_sdwa v3, v3, v17 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD			; GCN-NEXT: v_or_b32_e32 v3, v17, v3
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x4f			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x4f
	; GCN-NEXT: v_or_b32_sdwa v16, v16, v3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD			; GCN-NEXT: v_or_b32_sdwa v16, v16, v3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
	; GCN-NEXT: v_lshrrev_b16_e64 v3, 15, s6			; GCN-NEXT: v_lshrrev_b16_e64 v3, 15, s6
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x4e			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x4e
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 14, s6			; GCN-NEXT: v_lshrrev_b16_e64 v17, 14, s6
	; GCN-NEXT: v_cndmask_b32_e32 v3, 1, v3, vcc			; GCN-NEXT: v_cndmask_b32_e32 v3, 1, v3, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v3, 1, v3
	; GCN-NEXT: v_and_b32_e32 v17, 1, v17			; GCN-NEXT: v_and_b32_e32 v17, 1, v17
				; GCN-NEXT: v_lshlrev_b16_e32 v3, 3, v3
				; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x4d			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x4d
	; GCN-NEXT: v_or_b32_e32 v3, v17, v3			; GCN-NEXT: v_or_b32_e32 v3, v3, v17
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 13, s6			; GCN-NEXT: v_lshrrev_b16_e64 v17, 13, s6
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x4c			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x4c
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 12, s6			; GCN-NEXT: v_lshrrev_b16_e64 v18, 12, s6
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17			; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v18, v17
	; GCN-NEXT: v_lshlrev_b16_e32 v3, 2, v3
	; GCN-NEXT: v_and_b32_e32 v17, 3, v17			; GCN-NEXT: v_and_b32_e32 v17, 3, v17
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x4b			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x4b
	; GCN-NEXT: v_or_b32_e32 v3, v17, v3			; GCN-NEXT: v_or_b32_e32 v3, v17, v3
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 11, s6			; GCN-NEXT: v_lshrrev_b16_e64 v17, 11, s6
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x4a			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x4a
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 10, s6			; GCN-NEXT: v_lshrrev_b16_e64 v18, 10, s6
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
				; GCN-NEXT: v_lshlrev_b16_e32 v17, 3, v17
				; GCN-NEXT: v_lshlrev_b16_e32 v18, 2, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x49			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x49
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v17, v18
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 9, s6			; GCN-NEXT: v_lshrrev_b16_e64 v18, 9, s6
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x48			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x48
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 8, s6			; GCN-NEXT: v_lshrrev_b16_e64 v19, 8, s6
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18			; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18
	; GCN-NEXT: v_and_b32_e32 v19, 1, v19			; GCN-NEXT: v_and_b32_e32 v19, 1, v19
	; GCN-NEXT: v_or_b32_e32 v18, v19, v18			; GCN-NEXT: v_or_b32_e32 v18, v19, v18
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: v_and_b32_e32 v18, 3, v18			; GCN-NEXT: v_and_b32_e32 v18, 3, v18
	; GCN-NEXT: v_or_b32_e32 v17, v18, v17			; GCN-NEXT: v_or_b32_e32 v17, v18, v17
	; GCN-NEXT: v_lshlrev_b16_e32 v3, 4, v3			; GCN-NEXT: v_lshlrev_b16_e32 v3, 12, v3
	; GCN-NEXT: v_and_b32_e32 v17, 15, v17			; GCN-NEXT: v_and_b32_sdwa v17, v17, v12 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x47			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x47
	; GCN-NEXT: v_or_b32_sdwa v17, v17, v3 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD			; GCN-NEXT: v_or_b32_e32 v17, v3, v17
	; GCN-NEXT: v_lshrrev_b16_e64 v3, 7, s6			; GCN-NEXT: v_lshrrev_b16_e64 v3, 7, s6
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x46			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x46
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 6, s6			; GCN-NEXT: v_lshrrev_b16_e64 v18, 6, s6
	; GCN-NEXT: v_cndmask_b32_e32 v3, 1, v3, vcc			; GCN-NEXT: v_cndmask_b32_e32 v3, 1, v3, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v3, 1, v3
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
				; GCN-NEXT: v_lshlrev_b16_e32 v3, 3, v3
				; GCN-NEXT: v_lshlrev_b16_e32 v18, 2, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x45			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x45
	; GCN-NEXT: v_or_b32_e32 v3, v18, v3			; GCN-NEXT: v_or_b32_e32 v3, v3, v18
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 5, s6			; GCN-NEXT: v_lshrrev_b16_e64 v18, 5, s6
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x44			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x44
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 4, s6			; GCN-NEXT: v_lshrrev_b16_e64 v19, 4, s6
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18			; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18
	; GCN-NEXT: v_and_b32_e32 v19, 1, v19			; GCN-NEXT: v_and_b32_e32 v19, 1, v19
	; GCN-NEXT: v_or_b32_e32 v18, v19, v18			; GCN-NEXT: v_or_b32_e32 v18, v19, v18
	; GCN-NEXT: v_lshlrev_b16_e32 v3, 2, v3
	; GCN-NEXT: v_and_b32_e32 v18, 3, v18			; GCN-NEXT: v_and_b32_e32 v18, 3, v18
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x43			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x43
	; GCN-NEXT: v_or_b32_e32 v18, v18, v3			; GCN-NEXT: v_or_b32_e32 v18, v18, v3
	; GCN-NEXT: v_lshrrev_b16_e64 v3, 3, s6			; GCN-NEXT: v_lshrrev_b16_e64 v3, 3, s6
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x42			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x42
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 2, s6			; GCN-NEXT: v_lshrrev_b16_e64 v19, 2, s6
	; GCN-NEXT: v_cndmask_b32_e32 v3, 1, v3, vcc			; GCN-NEXT: v_cndmask_b32_e32 v3, 1, v3, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v3, 1, v3
	; GCN-NEXT: v_and_b32_e32 v19, 1, v19			; GCN-NEXT: v_and_b32_e32 v19, 1, v19
				; GCN-NEXT: v_lshlrev_b16_e32 v3, 3, v3
				; GCN-NEXT: v_lshlrev_b16_e32 v19, 2, v19
	; GCN-NEXT: s_cmpk_lg_i32 s0, 0x41			; GCN-NEXT: s_cmpk_lg_i32 s0, 0x41
	; GCN-NEXT: v_or_b32_e32 v3, v19, v3			; GCN-NEXT: v_or_b32_e32 v3, v3, v19
	; GCN-NEXT: v_lshrrev_b16_e64 v19, 1, s6			; GCN-NEXT: v_lshrrev_b16_e64 v19, 1, s6
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 64			; GCN-NEXT: s_cmp_lg_u32 s0, 64
	; GCN-NEXT: v_mov_b32_e32 v2, s6			; GCN-NEXT: v_mov_b32_e32 v2, s6
	; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc			; GCN-NEXT: v_cndmask_b32_e32 v19, 1, v19, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v2, 1, v2, vcc			; GCN-NEXT: v_cndmask_b32_e32 v2, 1, v2, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v19, 1, v19			; GCN-NEXT: v_lshlrev_b16_e32 v19, 1, v19
	; GCN-NEXT: v_and_b32_e32 v2, 1, v2			; GCN-NEXT: v_and_b32_e32 v2, 1, v2
	; GCN-NEXT: v_or_b32_e32 v2, v2, v19			; GCN-NEXT: v_or_b32_e32 v2, v2, v19
	; GCN-NEXT: v_lshlrev_b16_e32 v3, 2, v3
	; GCN-NEXT: v_and_b32_e32 v2, 3, v2			; GCN-NEXT: v_and_b32_e32 v2, 3, v2
	; GCN-NEXT: v_or_b32_e32 v2, v2, v3			; GCN-NEXT: v_or_b32_e32 v2, v2, v3
	; GCN-NEXT: v_or_b32_sdwa v3, v15, v14 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD			; GCN-NEXT: v_or_b32_sdwa v3, v15, v14 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
	; GCN-NEXT: v_lshlrev_b16_e32 v14, 4, v18			; GCN-NEXT: v_lshlrev_b16_e32 v14, 4, v18
	; GCN-NEXT: v_and_b32_e32 v2, 15, v2			; GCN-NEXT: v_and_b32_e32 v2, 15, v2
	; GCN-NEXT: s_cmp_lg_u32 s0, 55			; GCN-NEXT: s_cmp_lg_u32 s0, 55
	; GCN-NEXT: v_or_b32_e32 v2, v2, v14			; GCN-NEXT: v_or_b32_e32 v2, v2, v14
	; GCN-NEXT: v_mov_b32_e32 v14, s24			; GCN-NEXT: v_mov_b32_e32 v14, s24
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 54			; GCN-NEXT: s_cmp_lg_u32 s0, 54
	; GCN-NEXT: v_cndmask_b32_e32 v14, 1, v14, vcc			; GCN-NEXT: v_cndmask_b32_e32 v14, 1, v14, vcc
	; GCN-NEXT: v_mov_b32_e32 v15, s23			; GCN-NEXT: v_mov_b32_e32 v15, s23
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v14, 1, v14
	; GCN-NEXT: v_and_b32_e32 v15, 1, v15			; GCN-NEXT: v_and_b32_e32 v15, 1, v15
				; GCN-NEXT: v_lshlrev_b16_e32 v14, 3, v14
				; GCN-NEXT: v_lshlrev_b16_e32 v15, 2, v15
	; GCN-NEXT: s_cmp_lg_u32 s0, 53			; GCN-NEXT: s_cmp_lg_u32 s0, 53
	; GCN-NEXT: v_or_b32_sdwa v2, v2, v17 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD			; GCN-NEXT: v_or_b32_sdwa v2, v2, v17 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
	; GCN-NEXT: v_or_b32_e32 v14, v15, v14			; GCN-NEXT: v_or_b32_e32 v14, v14, v15
	; GCN-NEXT: v_mov_b32_e32 v15, s22			; GCN-NEXT: v_mov_b32_e32 v15, s22
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 52			; GCN-NEXT: s_cmp_lg_u32 s0, 52
	; GCN-NEXT: v_or_b32_sdwa v2, v2, v16 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD			; GCN-NEXT: v_or_b32_sdwa v2, v2, v16 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: v_mov_b32_e32 v16, s21			; GCN-NEXT: v_mov_b32_e32 v16, s21
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 1, v15			; GCN-NEXT: v_lshlrev_b16_e32 v15, 1, v15
	; GCN-NEXT: v_and_b32_e32 v16, 1, v16			; GCN-NEXT: v_and_b32_e32 v16, 1, v16
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v16, v15
	; GCN-NEXT: v_lshlrev_b16_e32 v14, 2, v14
	; GCN-NEXT: v_and_b32_e32 v15, 3, v15			; GCN-NEXT: v_and_b32_e32 v15, 3, v15
	; GCN-NEXT: s_cmp_lg_u32 s0, 51			; GCN-NEXT: s_cmp_lg_u32 s0, 51
	; GCN-NEXT: v_or_b32_e32 v14, v15, v14			; GCN-NEXT: v_or_b32_e32 v14, v15, v14
	; GCN-NEXT: v_mov_b32_e32 v15, s20			; GCN-NEXT: v_mov_b32_e32 v15, s20
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 50			; GCN-NEXT: s_cmp_lg_u32 s0, 50
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: v_mov_b32_e32 v16, s19			; GCN-NEXT: v_mov_b32_e32 v16, s19
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 1, v15
	; GCN-NEXT: v_and_b32_e32 v16, 1, v16			; GCN-NEXT: v_and_b32_e32 v16, 1, v16
				; GCN-NEXT: v_lshlrev_b16_e32 v15, 3, v15
				; GCN-NEXT: v_lshlrev_b16_e32 v16, 2, v16
	; GCN-NEXT: s_cmp_lg_u32 s0, 49			; GCN-NEXT: s_cmp_lg_u32 s0, 49
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v15, v16
	; GCN-NEXT: v_mov_b32_e32 v16, s18			; GCN-NEXT: v_mov_b32_e32 v16, s18
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 48			; GCN-NEXT: s_cmp_lg_u32 s0, 48
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_mov_b32_e32 v17, s17			; GCN-NEXT: v_mov_b32_e32 v17, s17
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16			; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16
	; GCN-NEXT: v_and_b32_e32 v17, 1, v17			; GCN-NEXT: v_and_b32_e32 v17, 1, v17
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v16, v17, v16
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 2, v15
	; GCN-NEXT: v_and_b32_e32 v16, 3, v16			; GCN-NEXT: v_and_b32_e32 v16, 3, v16
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v16, v15
	; GCN-NEXT: v_lshlrev_b16_e32 v14, 4, v14			; GCN-NEXT: v_lshlrev_b16_e32 v14, 4, v14
	; GCN-NEXT: v_and_b32_e32 v15, 15, v15			; GCN-NEXT: v_and_b32_e32 v15, 15, v15
	; GCN-NEXT: s_cmp_lg_u32 s0, 63			; GCN-NEXT: s_cmp_lg_u32 s0, 63
	; GCN-NEXT: v_or_b32_e32 v14, v15, v14			; GCN-NEXT: v_or_b32_e32 v14, v15, v14
	; GCN-NEXT: v_lshrrev_b16_e64 v15, 7, s16			; GCN-NEXT: v_lshrrev_b16_e64 v15, 7, s16
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 62			; GCN-NEXT: s_cmp_lg_u32 s0, 62
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 6, s16			; GCN-NEXT: v_lshrrev_b16_e64 v16, 6, s16
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 1, v15
	; GCN-NEXT: v_and_b32_e32 v16, 1, v16			; GCN-NEXT: v_and_b32_e32 v16, 1, v16
				; GCN-NEXT: v_lshlrev_b16_e32 v15, 3, v15
				; GCN-NEXT: v_lshlrev_b16_e32 v16, 2, v16
	; GCN-NEXT: s_cmp_lg_u32 s0, 61			; GCN-NEXT: s_cmp_lg_u32 s0, 61
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v15, v16
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 5, s16			; GCN-NEXT: v_lshrrev_b16_e64 v16, 5, s16
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 60			; GCN-NEXT: s_cmp_lg_u32 s0, 60
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 4, s16			; GCN-NEXT: v_lshrrev_b16_e64 v17, 4, s16
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16			; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16
	; GCN-NEXT: v_and_b32_e32 v17, 1, v17			; GCN-NEXT: v_and_b32_e32 v17, 1, v17
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v16, v17, v16
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 2, v15
	; GCN-NEXT: v_and_b32_e32 v16, 3, v16			; GCN-NEXT: v_and_b32_e32 v16, 3, v16
	; GCN-NEXT: s_cmp_lg_u32 s0, 59			; GCN-NEXT: s_cmp_lg_u32 s0, 59
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v16, v15
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 3, s16			; GCN-NEXT: v_lshrrev_b16_e64 v16, 3, s16
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 58			; GCN-NEXT: s_cmp_lg_u32 s0, 58
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 2, s16			; GCN-NEXT: v_lshrrev_b16_e64 v17, 2, s16
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
				; GCN-NEXT: v_and_b32_e32 v17, 1, v17
	; GCN-NEXT: s_cmp_lg_u32 s0, 56			; GCN-NEXT: s_cmp_lg_u32 s0, 56
	; GCN-NEXT: v_mov_b32_e32 v13, s16			; GCN-NEXT: v_mov_b32_e32 v13, s16
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16			; GCN-NEXT: v_lshlrev_b16_e32 v16, 3, v16
	; GCN-NEXT: v_and_b32_e32 v17, 1, v17			; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 57			; GCN-NEXT: s_cmp_lg_u32 s0, 57
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v16, v16, v17
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 1, s16			; GCN-NEXT: v_lshrrev_b16_e64 v17, 1, s16
	; GCN-NEXT: v_cndmask_b32_e32 v13, 1, v13, vcc			; GCN-NEXT: v_cndmask_b32_e32 v13, 1, v13, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_and_b32_e32 v13, 1, v13			; GCN-NEXT: v_and_b32_e32 v13, 1, v13
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17			; GCN-NEXT: v_lshlrev_b16_e32 v17, 1, v17
	; GCN-NEXT: v_or_b32_e32 v13, v13, v17			; GCN-NEXT: v_or_b32_e32 v13, v13, v17
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 2, v16
	; GCN-NEXT: v_and_b32_e32 v13, 3, v13			; GCN-NEXT: v_and_b32_e32 v13, 3, v13
	; GCN-NEXT: v_or_b32_e32 v13, v13, v16			; GCN-NEXT: v_or_b32_e32 v13, v13, v16
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 4, v15			; GCN-NEXT: v_lshlrev_b16_e32 v15, 12, v15
	; GCN-NEXT: v_and_b32_e32 v13, 15, v13			; GCN-NEXT: v_and_b32_sdwa v13, v13, v12 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
	; GCN-NEXT: v_or_b32_sdwa v13, v13, v15 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD			; GCN-NEXT: v_or_b32_e32 v13, v15, v13
	; GCN-NEXT: s_cmp_lg_u32 s0, 47			; GCN-NEXT: s_cmp_lg_u32 s0, 47
	; GCN-NEXT: v_or_b32_sdwa v14, v14, v13 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD			; GCN-NEXT: v_or_b32_sdwa v14, v14, v13 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
	; GCN-NEXT: v_lshrrev_b16_e64 v13, 15, s5			; GCN-NEXT: v_lshrrev_b16_e64 v13, 15, s5
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 46			; GCN-NEXT: s_cmp_lg_u32 s0, 46
	; GCN-NEXT: v_lshrrev_b16_e64 v15, 14, s5			; GCN-NEXT: v_lshrrev_b16_e64 v15, 14, s5
	; GCN-NEXT: v_cndmask_b32_e32 v13, 1, v13, vcc			; GCN-NEXT: v_cndmask_b32_e32 v13, 1, v13, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v13, 1, v13
	; GCN-NEXT: v_and_b32_e32 v15, 1, v15			; GCN-NEXT: v_and_b32_e32 v15, 1, v15
				; GCN-NEXT: v_lshlrev_b16_e32 v13, 3, v13
				; GCN-NEXT: v_lshlrev_b16_e32 v15, 2, v15
	; GCN-NEXT: s_cmp_lg_u32 s0, 45			; GCN-NEXT: s_cmp_lg_u32 s0, 45
	; GCN-NEXT: v_or_b32_e32 v13, v15, v13			; GCN-NEXT: v_or_b32_e32 v13, v13, v15
	; GCN-NEXT: v_lshrrev_b16_e64 v15, 13, s5			; GCN-NEXT: v_lshrrev_b16_e64 v15, 13, s5
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 44			; GCN-NEXT: s_cmp_lg_u32 s0, 44
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 12, s5			; GCN-NEXT: v_lshrrev_b16_e64 v16, 12, s5
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 1, v15			; GCN-NEXT: v_lshlrev_b16_e32 v15, 1, v15
	; GCN-NEXT: v_and_b32_e32 v16, 1, v16			; GCN-NEXT: v_and_b32_e32 v16, 1, v16
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v16, v15
	; GCN-NEXT: v_lshlrev_b16_e32 v13, 2, v13
	; GCN-NEXT: v_and_b32_e32 v15, 3, v15			; GCN-NEXT: v_and_b32_e32 v15, 3, v15
	; GCN-NEXT: s_cmp_lg_u32 s0, 43			; GCN-NEXT: s_cmp_lg_u32 s0, 43
	; GCN-NEXT: v_or_b32_e32 v13, v15, v13			; GCN-NEXT: v_or_b32_e32 v13, v15, v13
	; GCN-NEXT: v_lshrrev_b16_e64 v15, 11, s5			; GCN-NEXT: v_lshrrev_b16_e64 v15, 11, s5
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 42			; GCN-NEXT: s_cmp_lg_u32 s0, 42
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 10, s5			; GCN-NEXT: v_lshrrev_b16_e64 v16, 10, s5
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 1, v15
	; GCN-NEXT: v_and_b32_e32 v16, 1, v16			; GCN-NEXT: v_and_b32_e32 v16, 1, v16
				; GCN-NEXT: v_lshlrev_b16_e32 v15, 3, v15
				; GCN-NEXT: v_lshlrev_b16_e32 v16, 2, v16
	; GCN-NEXT: s_cmp_lg_u32 s0, 41			; GCN-NEXT: s_cmp_lg_u32 s0, 41
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v15, v16
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 9, s5			; GCN-NEXT: v_lshrrev_b16_e64 v16, 9, s5
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 40			; GCN-NEXT: s_cmp_lg_u32 s0, 40
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 8, s5			; GCN-NEXT: v_lshrrev_b16_e64 v17, 8, s5
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16			; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16
	; GCN-NEXT: v_and_b32_e32 v17, 1, v17			; GCN-NEXT: v_and_b32_e32 v17, 1, v17
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v16, v17, v16
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 2, v15
	; GCN-NEXT: v_and_b32_e32 v16, 3, v16			; GCN-NEXT: v_and_b32_e32 v16, 3, v16
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v16, v15
	; GCN-NEXT: v_lshlrev_b16_e32 v13, 4, v13			; GCN-NEXT: v_lshlrev_b16_e32 v13, 12, v13
	; GCN-NEXT: v_and_b32_e32 v15, 15, v15			; GCN-NEXT: v_and_b32_sdwa v15, v15, v12 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
	; GCN-NEXT: s_cmp_lg_u32 s0, 39			; GCN-NEXT: s_cmp_lg_u32 s0, 39
	; GCN-NEXT: v_or_b32_sdwa v15, v15, v13 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD			; GCN-NEXT: v_or_b32_e32 v15, v13, v15
	; GCN-NEXT: v_lshrrev_b16_e64 v13, 7, s5			; GCN-NEXT: v_lshrrev_b16_e64 v13, 7, s5
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 38			; GCN-NEXT: s_cmp_lg_u32 s0, 38
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 6, s5			; GCN-NEXT: v_lshrrev_b16_e64 v16, 6, s5
	; GCN-NEXT: v_cndmask_b32_e32 v13, 1, v13, vcc			; GCN-NEXT: v_cndmask_b32_e32 v13, 1, v13, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v13, 1, v13
	; GCN-NEXT: v_and_b32_e32 v16, 1, v16			; GCN-NEXT: v_and_b32_e32 v16, 1, v16
				; GCN-NEXT: v_lshlrev_b16_e32 v13, 3, v13
				; GCN-NEXT: v_lshlrev_b16_e32 v16, 2, v16
	; GCN-NEXT: s_cmp_lg_u32 s0, 37			; GCN-NEXT: s_cmp_lg_u32 s0, 37
	; GCN-NEXT: v_or_b32_e32 v13, v16, v13			; GCN-NEXT: v_or_b32_e32 v13, v13, v16
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 5, s5			; GCN-NEXT: v_lshrrev_b16_e64 v16, 5, s5
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 36			; GCN-NEXT: s_cmp_lg_u32 s0, 36
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 4, s5			; GCN-NEXT: v_lshrrev_b16_e64 v17, 4, s5
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16			; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16
	; GCN-NEXT: v_and_b32_e32 v17, 1, v17			; GCN-NEXT: v_and_b32_e32 v17, 1, v17
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v16, v17, v16
	; GCN-NEXT: v_lshlrev_b16_e32 v13, 2, v13
	; GCN-NEXT: v_and_b32_e32 v16, 3, v16			; GCN-NEXT: v_and_b32_e32 v16, 3, v16
	; GCN-NEXT: s_cmp_lg_u32 s0, 35			; GCN-NEXT: s_cmp_lg_u32 s0, 35
	; GCN-NEXT: v_or_b32_e32 v16, v16, v13			; GCN-NEXT: v_or_b32_e32 v16, v16, v13
	; GCN-NEXT: v_lshrrev_b16_e64 v13, 3, s5			; GCN-NEXT: v_lshrrev_b16_e64 v13, 3, s5
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 34			; GCN-NEXT: s_cmp_lg_u32 s0, 34
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 2, s5			; GCN-NEXT: v_lshrrev_b16_e64 v17, 2, s5
	; GCN-NEXT: v_cndmask_b32_e32 v13, 1, v13, vcc			; GCN-NEXT: v_cndmask_b32_e32 v13, 1, v13, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v13, 1, v13
	; GCN-NEXT: v_and_b32_e32 v17, 1, v17			; GCN-NEXT: v_and_b32_e32 v17, 1, v17
				; GCN-NEXT: v_lshlrev_b16_e32 v13, 3, v13
				; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: s_cmp_lg_u32 s0, 33			; GCN-NEXT: s_cmp_lg_u32 s0, 33
	; GCN-NEXT: v_or_b32_e32 v17, v17, v13			; GCN-NEXT: v_or_b32_e32 v17, v13, v17
	; GCN-NEXT: v_lshrrev_b16_e64 v13, 1, s5			; GCN-NEXT: v_lshrrev_b16_e64 v13, 1, s5
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 32			; GCN-NEXT: s_cmp_lg_u32 s0, 32
	; GCN-NEXT: v_mov_b32_e32 v1, s5			; GCN-NEXT: v_mov_b32_e32 v1, s5
	; GCN-NEXT: v_cndmask_b32_e32 v13, 1, v13, vcc			; GCN-NEXT: v_cndmask_b32_e32 v13, 1, v13, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v1, 1, v1, vcc			; GCN-NEXT: v_cndmask_b32_e32 v1, 1, v1, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v13, 1, v13			; GCN-NEXT: v_lshlrev_b16_e32 v13, 1, v13
	; GCN-NEXT: v_and_b32_e32 v1, 1, v1			; GCN-NEXT: v_and_b32_e32 v1, 1, v1
	; GCN-NEXT: v_or_b32_e32 v1, v1, v13			; GCN-NEXT: v_or_b32_e32 v1, v1, v13
	; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: v_and_b32_e32 v1, 3, v1			; GCN-NEXT: v_and_b32_e32 v1, 3, v1
	; GCN-NEXT: v_or_b32_e32 v1, v1, v17			; GCN-NEXT: v_or_b32_e32 v1, v1, v17
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 4, v16			; GCN-NEXT: v_lshlrev_b16_e32 v16, 4, v16
	; GCN-NEXT: v_and_b32_e32 v1, 15, v1			; GCN-NEXT: v_and_b32_e32 v1, 15, v1
	; GCN-NEXT: v_or_b32_e32 v1, v1, v16			; GCN-NEXT: v_or_b32_e32 v1, v1, v16
	; GCN-NEXT: v_or_b32_sdwa v1, v1, v15 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD			; GCN-NEXT: v_or_b32_sdwa v1, v1, v15 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
	; GCN-NEXT: s_cmp_lg_u32 s0, 23			; GCN-NEXT: s_cmp_lg_u32 s0, 23
	; GCN-NEXT: v_or_b32_sdwa v1, v1, v14 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD			; GCN-NEXT: v_or_b32_sdwa v1, v1, v14 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
	; GCN-NEXT: v_mov_b32_e32 v14, s15			; GCN-NEXT: v_mov_b32_e32 v14, s15
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 22			; GCN-NEXT: s_cmp_lg_u32 s0, 22
	; GCN-NEXT: v_cndmask_b32_e32 v14, 1, v14, vcc			; GCN-NEXT: v_cndmask_b32_e32 v14, 1, v14, vcc
	; GCN-NEXT: v_mov_b32_e32 v15, s14			; GCN-NEXT: v_mov_b32_e32 v15, s14
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v14, 1, v14
	; GCN-NEXT: v_and_b32_e32 v15, 1, v15			; GCN-NEXT: v_and_b32_e32 v15, 1, v15
				; GCN-NEXT: v_lshlrev_b16_e32 v14, 3, v14
				; GCN-NEXT: v_lshlrev_b16_e32 v15, 2, v15
	; GCN-NEXT: s_cmp_lg_u32 s0, 21			; GCN-NEXT: s_cmp_lg_u32 s0, 21
	; GCN-NEXT: v_or_b32_e32 v14, v15, v14			; GCN-NEXT: v_or_b32_e32 v14, v14, v15
	; GCN-NEXT: v_mov_b32_e32 v15, s13			; GCN-NEXT: v_mov_b32_e32 v15, s13
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 20			; GCN-NEXT: s_cmp_lg_u32 s0, 20
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: v_mov_b32_e32 v16, s12			; GCN-NEXT: v_mov_b32_e32 v16, s12
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 1, v15			; GCN-NEXT: v_lshlrev_b16_e32 v15, 1, v15
	; GCN-NEXT: v_and_b32_e32 v16, 1, v16			; GCN-NEXT: v_and_b32_e32 v16, 1, v16
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v16, v15
	; GCN-NEXT: v_lshlrev_b16_e32 v14, 2, v14
	; GCN-NEXT: v_and_b32_e32 v15, 3, v15			; GCN-NEXT: v_and_b32_e32 v15, 3, v15
	; GCN-NEXT: s_cmp_lg_u32 s0, 19			; GCN-NEXT: s_cmp_lg_u32 s0, 19
	; GCN-NEXT: v_or_b32_e32 v14, v15, v14			; GCN-NEXT: v_or_b32_e32 v14, v15, v14
	; GCN-NEXT: v_mov_b32_e32 v15, s11			; GCN-NEXT: v_mov_b32_e32 v15, s11
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 18			; GCN-NEXT: s_cmp_lg_u32 s0, 18
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: v_mov_b32_e32 v16, s10			; GCN-NEXT: v_mov_b32_e32 v16, s10
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 1, v15
	; GCN-NEXT: v_and_b32_e32 v16, 1, v16			; GCN-NEXT: v_and_b32_e32 v16, 1, v16
				; GCN-NEXT: v_lshlrev_b16_e32 v15, 3, v15
				; GCN-NEXT: v_lshlrev_b16_e32 v16, 2, v16
	; GCN-NEXT: s_cmp_lg_u32 s0, 17			; GCN-NEXT: s_cmp_lg_u32 s0, 17
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v15, v16
	; GCN-NEXT: v_mov_b32_e32 v16, s9			; GCN-NEXT: v_mov_b32_e32 v16, s9
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 16			; GCN-NEXT: s_cmp_lg_u32 s0, 16
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_mov_b32_e32 v18, s8			; GCN-NEXT: v_mov_b32_e32 v18, s8
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16			; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
	; GCN-NEXT: v_or_b32_e32 v16, v18, v16			; GCN-NEXT: v_or_b32_e32 v16, v18, v16
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 2, v15
	; GCN-NEXT: v_and_b32_e32 v16, 3, v16			; GCN-NEXT: v_and_b32_e32 v16, 3, v16
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v16, v15
	; GCN-NEXT: v_lshlrev_b16_e32 v14, 4, v14			; GCN-NEXT: v_lshlrev_b16_e32 v14, 4, v14
	; GCN-NEXT: v_and_b32_e32 v15, 15, v15			; GCN-NEXT: v_and_b32_e32 v15, 15, v15
	; GCN-NEXT: s_cmp_lg_u32 s0, 31			; GCN-NEXT: s_cmp_lg_u32 s0, 31
	; GCN-NEXT: v_or_b32_e32 v14, v15, v14			; GCN-NEXT: v_or_b32_e32 v14, v15, v14
	; GCN-NEXT: v_lshrrev_b16_e64 v15, 7, s1			; GCN-NEXT: v_lshrrev_b16_e64 v15, 7, s1
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 30			; GCN-NEXT: s_cmp_lg_u32 s0, 30
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 6, s1			; GCN-NEXT: v_lshrrev_b16_e64 v16, 6, s1
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 1, v15
	; GCN-NEXT: v_and_b32_e32 v16, 1, v16			; GCN-NEXT: v_and_b32_e32 v16, 1, v16
				; GCN-NEXT: v_lshlrev_b16_e32 v15, 3, v15
				; GCN-NEXT: v_lshlrev_b16_e32 v16, 2, v16
	; GCN-NEXT: s_cmp_lg_u32 s0, 29			; GCN-NEXT: s_cmp_lg_u32 s0, 29
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v15, v16
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 5, s1			; GCN-NEXT: v_lshrrev_b16_e64 v16, 5, s1
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 28			; GCN-NEXT: s_cmp_lg_u32 s0, 28
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 4, s1			; GCN-NEXT: v_lshrrev_b16_e64 v18, 4, s1
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16			; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_and_b32_e32 v18, 1, v18
	; GCN-NEXT: v_or_b32_e32 v16, v18, v16			; GCN-NEXT: v_or_b32_e32 v16, v18, v16
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 2, v15
	; GCN-NEXT: v_and_b32_e32 v16, 3, v16			; GCN-NEXT: v_and_b32_e32 v16, 3, v16
	; GCN-NEXT: s_cmp_lg_u32 s0, 27			; GCN-NEXT: s_cmp_lg_u32 s0, 27
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v16, v15
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 3, s1			; GCN-NEXT: v_lshrrev_b16_e64 v16, 3, s1
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 26			; GCN-NEXT: s_cmp_lg_u32 s0, 26
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 2, s1			; GCN-NEXT: v_lshrrev_b16_e64 v18, 2, s1
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
				; GCN-NEXT: v_and_b32_e32 v18, 1, v18
	; GCN-NEXT: s_cmp_lg_u32 s0, 24			; GCN-NEXT: s_cmp_lg_u32 s0, 24
	; GCN-NEXT: v_mov_b32_e32 v17, s1			; GCN-NEXT: v_mov_b32_e32 v17, s1
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16			; GCN-NEXT: v_lshlrev_b16_e32 v16, 3, v16
	; GCN-NEXT: v_and_b32_e32 v18, 1, v18			; GCN-NEXT: v_lshlrev_b16_e32 v18, 2, v18
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 25			; GCN-NEXT: s_cmp_lg_u32 s0, 25
	; GCN-NEXT: v_or_b32_e32 v16, v18, v16			; GCN-NEXT: v_or_b32_e32 v16, v16, v18
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 1, s1			; GCN-NEXT: v_lshrrev_b16_e64 v18, 1, s1
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v18, 1, v18, vcc
	; GCN-NEXT: v_and_b32_e32 v17, 1, v17			; GCN-NEXT: v_and_b32_e32 v17, 1, v17
	; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18			; GCN-NEXT: v_lshlrev_b16_e32 v18, 1, v18
	; GCN-NEXT: v_or_b32_e32 v17, v17, v18			; GCN-NEXT: v_or_b32_e32 v17, v17, v18
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 2, v16
	; GCN-NEXT: v_and_b32_e32 v17, 3, v17			; GCN-NEXT: v_and_b32_e32 v17, 3, v17
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v16, v17, v16
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 4, v15			; GCN-NEXT: v_lshlrev_b16_e32 v15, 12, v15
	; GCN-NEXT: v_and_b32_e32 v16, 15, v16			; GCN-NEXT: v_and_b32_sdwa v16, v16, v12 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
	; GCN-NEXT: v_or_b32_sdwa v15, v16, v15 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD			; GCN-NEXT: v_or_b32_e32 v15, v15, v16
	; GCN-NEXT: s_cmp_lg_u32 s0, 15			; GCN-NEXT: s_cmp_lg_u32 s0, 15
	; GCN-NEXT: v_or_b32_sdwa v14, v14, v15 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD			; GCN-NEXT: v_or_b32_sdwa v14, v14, v15 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
	; GCN-NEXT: v_lshrrev_b16_e64 v15, 15, s4			; GCN-NEXT: v_lshrrev_b16_e64 v15, 15, s4
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 14			; GCN-NEXT: s_cmp_lg_u32 s0, 14
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 14, s4			; GCN-NEXT: v_lshrrev_b16_e64 v16, 14, s4
	; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc			; GCN-NEXT: v_cndmask_b32_e32 v15, 1, v15, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 1, v15
	; GCN-NEXT: v_and_b32_e32 v16, 1, v16			; GCN-NEXT: v_and_b32_e32 v16, 1, v16
				; GCN-NEXT: v_lshlrev_b16_e32 v15, 3, v15
				; GCN-NEXT: v_lshlrev_b16_e32 v16, 2, v16
	; GCN-NEXT: s_cmp_lg_u32 s0, 13			; GCN-NEXT: s_cmp_lg_u32 s0, 13
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v15, v16
	; GCN-NEXT: v_lshrrev_b16_e64 v16, 13, s4			; GCN-NEXT: v_lshrrev_b16_e64 v16, 13, s4
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 12			; GCN-NEXT: s_cmp_lg_u32 s0, 12
	; GCN-NEXT: v_lshrrev_b16_e64 v17, 12, s4			; GCN-NEXT: v_lshrrev_b16_e64 v17, 12, s4
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v16, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v17, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16			; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16
	; GCN-NEXT: v_and_b32_e32 v17, 1, v17			; GCN-NEXT: v_and_b32_e32 v17, 1, v17
	; GCN-NEXT: v_or_b32_e32 v16, v17, v16			; GCN-NEXT: v_or_b32_e32 v16, v17, v16
	; GCN-NEXT: s_cmp_lg_u32 s0, 11			; GCN-NEXT: s_cmp_lg_u32 s0, 11
	; GCN-NEXT: v_lshrrev_b16_e64 v18, 11, s4			; GCN-NEXT: v_lshrrev_b16_e64 v17, 11, s4
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 2, v15
	; GCN-NEXT: v_and_b32_e32 v16, 3, v16			; GCN-NEXT: v_and_b32_e32 v16, 3, v16
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 10			; GCN-NEXT: s_cmp_lg_u32 s0, 10
	; GCN-NEXT: v_lshrrev_b16_e64 v13, 10, s4			; GCN-NEXT: v_lshrrev_b16_e64 v18, 10, s4
	; GCN-NEXT: v_or_b32_e32 v15, v16, v15			; GCN-NEXT: v_or_b32_e32 v15, v16, v15
	; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v18, vcc			; GCN-NEXT: v_cndmask_b32_e32 v16, 1, v17, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 9			; GCN-NEXT: s_cmp_lg_u32 s0, 9
	; GCN-NEXT: v_lshrrev_b16_e64 v12, 9, s4			; GCN-NEXT: v_lshrrev_b16_e64 v13, 9, s4
	; GCN-NEXT: v_cndmask_b32_e32 v13, 1, v13, vcc			; GCN-NEXT: v_cndmask_b32_e32 v17, 1, v18, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 8			; GCN-NEXT: s_cmp_lg_u32 s0, 8
	; GCN-NEXT: v_lshrrev_b16_e64 v11, 8, s4			; GCN-NEXT: v_lshrrev_b16_e64 v11, 8, s4
	; GCN-NEXT: v_cndmask_b32_e32 v12, 1, v12, vcc			; GCN-NEXT: v_cndmask_b32_e32 v13, 1, v13, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 7			; GCN-NEXT: s_cmp_lg_u32 s0, 7
	; GCN-NEXT: v_lshrrev_b16_e64 v10, 7, s4			; GCN-NEXT: v_lshrrev_b16_e64 v10, 7, s4
	; GCN-NEXT: v_cndmask_b32_e32 v11, 1, v11, vcc			; GCN-NEXT: v_cndmask_b32_e32 v11, 1, v11, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 6			; GCN-NEXT: s_cmp_lg_u32 s0, 6
	; GCN-NEXT: v_lshrrev_b16_e64 v9, 6, s4			; GCN-NEXT: v_lshrrev_b16_e64 v9, 6, s4
	; GCN-NEXT: v_cndmask_b32_e32 v10, 1, v10, vcc			; GCN-NEXT: v_cndmask_b32_e32 v10, 1, v10, vcc
	[... 18 lines hidden ...]
	; GCN-NEXT: v_lshrrev_b16_e64 v4, 1, s4			; GCN-NEXT: v_lshrrev_b16_e64 v4, 1, s4
	; GCN-NEXT: v_cndmask_b32_e32 v5, 1, v5, vcc			; GCN-NEXT: v_cndmask_b32_e32 v5, 1, v5, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: s_cmp_lg_u32 s0, 0			; GCN-NEXT: s_cmp_lg_u32 s0, 0
	; GCN-NEXT: v_mov_b32_e32 v0, s4			; GCN-NEXT: v_mov_b32_e32 v0, s4
	; GCN-NEXT: v_cndmask_b32_e32 v4, 1, v4, vcc			; GCN-NEXT: v_cndmask_b32_e32 v4, 1, v4, vcc
	; GCN-NEXT: s_cselect_b64 vcc, -1, 0			; GCN-NEXT: s_cselect_b64 vcc, -1, 0
	; GCN-NEXT: v_cndmask_b32_e32 v0, 1, v0, vcc			; GCN-NEXT: v_cndmask_b32_e32 v0, 1, v0, vcc
	; GCN-NEXT: v_lshlrev_b16_e32 v16, 1, v16			; GCN-NEXT: v_and_b32_e32 v17, 1, v17
	; GCN-NEXT: v_and_b32_e32 v13, 1, v13			; GCN-NEXT: v_lshlrev_b16_e32 v13, 1, v13
	; GCN-NEXT: v_lshlrev_b16_e32 v12, 1, v12
	; GCN-NEXT: v_and_b32_e32 v11, 1, v11			; GCN-NEXT: v_and_b32_e32 v11, 1, v11
	; GCN-NEXT: v_lshlrev_b16_e32 v10, 1, v10
	; GCN-NEXT: v_and_b32_e32 v9, 1, v9			; GCN-NEXT: v_and_b32_e32 v9, 1, v9
	; GCN-NEXT: v_lshlrev_b16_e32 v8, 1, v8			; GCN-NEXT: v_lshlrev_b16_e32 v8, 1, v8
	; GCN-NEXT: v_and_b32_e32 v7, 1, v7			; GCN-NEXT: v_and_b32_e32 v7, 1, v7
	; GCN-NEXT: v_lshlrev_b16_e32 v6, 1, v6
	; GCN-NEXT: v_and_b32_e32 v5, 1, v5			; GCN-NEXT: v_and_b32_e32 v5, 1, v5
	; GCN-NEXT: v_lshlrev_b16_e32 v4, 1, v4			; GCN-NEXT: v_lshlrev_b16_e32 v4, 1, v4
	; GCN-NEXT: v_and_b32_e32 v0, 1, v0			; GCN-NEXT: v_and_b32_e32 v0, 1, v0
	; GCN-NEXT: v_or_b32_e32 v13, v13, v16			; GCN-NEXT: v_lshlrev_b16_e32 v16, 3, v16
	; GCN-NEXT: v_or_b32_e32 v11, v11, v12			; GCN-NEXT: v_lshlrev_b16_e32 v17, 2, v17
	; GCN-NEXT: v_or_b32_e32 v9, v9, v10			; GCN-NEXT: v_or_b32_e32 v11, v11, v13
				; GCN-NEXT: v_lshlrev_b16_e32 v10, 3, v10
				; GCN-NEXT: v_lshlrev_b16_e32 v9, 2, v9
	; GCN-NEXT: v_or_b32_e32 v7, v7, v8			; GCN-NEXT: v_or_b32_e32 v7, v7, v8
	; GCN-NEXT: v_or_b32_e32 v5, v5, v6			; GCN-NEXT: v_lshlrev_b16_e32 v6, 3, v6
				; GCN-NEXT: v_lshlrev_b16_e32 v5, 2, v5
	; GCN-NEXT: v_or_b32_e32 v0, v0, v4			; GCN-NEXT: v_or_b32_e32 v0, v0, v4
	; GCN-NEXT: v_lshlrev_b16_e32 v13, 2, v13			; GCN-NEXT: v_or_b32_e32 v16, v16, v17
	; GCN-NEXT: v_and_b32_e32 v11, 3, v11			; GCN-NEXT: v_and_b32_e32 v11, 3, v11
	; GCN-NEXT: v_lshlrev_b16_e32 v9, 2, v9			; GCN-NEXT: v_or_b32_e32 v9, v10, v9
	; GCN-NEXT: v_and_b32_e32 v7, 3, v7			; GCN-NEXT: v_and_b32_e32 v7, 3, v7
	; GCN-NEXT: v_lshlrev_b16_e32 v5, 2, v5			; GCN-NEXT: v_or_b32_e32 v5, v6, v5
	; GCN-NEXT: v_and_b32_e32 v0, 3, v0			; GCN-NEXT: v_and_b32_e32 v0, 3, v0
	; GCN-NEXT: v_or_b32_e32 v11, v11, v13			; GCN-NEXT: v_or_b32_e32 v11, v11, v16
	; GCN-NEXT: v_or_b32_e32 v7, v7, v9			; GCN-NEXT: v_or_b32_e32 v7, v7, v9
	; GCN-NEXT: v_or_b32_e32 v0, v0, v5			; GCN-NEXT: v_or_b32_e32 v0, v0, v5
	; GCN-NEXT: v_lshlrev_b16_e32 v15, 4, v15			; GCN-NEXT: v_lshlrev_b16_e32 v15, 12, v15
	; GCN-NEXT: v_and_b32_e32 v11, 15, v11			; GCN-NEXT: v_and_b32_sdwa v11, v11, v12 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
	; GCN-NEXT: v_lshlrev_b16_e32 v7, 4, v7			; GCN-NEXT: v_lshlrev_b16_e32 v7, 4, v7
	; GCN-NEXT: v_and_b32_e32 v0, 15, v0			; GCN-NEXT: v_and_b32_e32 v0, 15, v0
	; GCN-NEXT: v_or_b32_sdwa v11, v11, v15 dst_sel:BYTE_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD			; GCN-NEXT: v_or_b32_e32 v11, v15, v11
	; GCN-NEXT: v_or_b32_e32 v0, v0, v7			; GCN-NEXT: v_or_b32_e32 v0, v0, v7
	; GCN-NEXT: v_or_b32_sdwa v0, v0, v11 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD			; GCN-NEXT: v_or_b32_sdwa v0, v0, v11 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:BYTE_0 src1_sel:DWORD
	; GCN-NEXT: v_mov_b32_e32 v5, s3			; GCN-NEXT: v_mov_b32_e32 v5, s3
	; GCN-NEXT: v_or_b32_sdwa v0, v0, v14 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD			; GCN-NEXT: v_or_b32_sdwa v0, v0, v14 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:DWORD
	; GCN-NEXT: v_mov_b32_e32 v4, s2			; GCN-NEXT: v_mov_b32_e32 v4, s2
	; GCN-NEXT: flat_store_dwordx4 v[4:5], v[0:3]			; GCN-NEXT: flat_store_dwordx4 v[4:5], v[0:3]
	; GCN-NEXT: s_endpgm			; GCN-NEXT: s_endpgm
	entry:			entry:
	[... 112 lines hidden ...]

llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll

	[... 1,207 lines hidden ...]
	; SI-NEXT: s_load_dword s6, s[4:5], 0x8			; SI-NEXT: s_load_dword s6, s[4:5], 0x8
	; SI-NEXT: s_load_dwordx2 s[0:1], s[4:5], 0x0			; SI-NEXT: s_load_dwordx2 s[0:1], s[4:5], 0x0
	; SI-NEXT: s_mov_b32 s3, 0x100f000			; SI-NEXT: s_mov_b32 s3, 0x100f000
	; SI-NEXT: s_mov_b32 s2, -1			; SI-NEXT: s_mov_b32 s2, -1
	; SI-NEXT: s_waitcnt lgkmcnt(0)			; SI-NEXT: s_waitcnt lgkmcnt(0)
	; SI-NEXT: s_lshr_b32 s4, s11, 24			; SI-NEXT: s_lshr_b32 s4, s11, 24
	; SI-NEXT: s_cmp_lg_u32 s6, 15			; SI-NEXT: s_cmp_lg_u32 s6, 15
	; SI-NEXT: s_cselect_b32 s4, s4, 5			; SI-NEXT: s_cselect_b32 s4, s4, 5
	; SI-NEXT: s_lshl_b32 s4, s4, 8			; SI-NEXT: s_lshl_b32 s4, s4, 24
	; SI-NEXT: s_lshr_b32 s5, s11, 16			; SI-NEXT: s_lshr_b32 s5, s11, 16
	; SI-NEXT: s_cmp_lg_u32 s6, 14			; SI-NEXT: s_cmp_lg_u32 s6, 14
	; SI-NEXT: s_cselect_b32 s5, s5, 5			; SI-NEXT: s_cselect_b32 s5, s5, 5
	; SI-NEXT: s_and_b32 s5, s5, 0xff			; SI-NEXT: s_and_b32 s5, s5, 0xff
	; SI-NEXT: s_or_b32 s4, s5, s4			; SI-NEXT: s_lshl_b32 s5, s5, 16
	; SI-NEXT: s_lshl_b32 s4, s4, 16			; SI-NEXT: s_or_b32 s4, s4, s5
	; SI-NEXT: s_lshr_b32 s5, s11, 8			; SI-NEXT: s_lshr_b32 s5, s11, 8
	; SI-NEXT: s_cmp_lg_u32 s6, 13			; SI-NEXT: s_cmp_lg_u32 s6, 13
	; SI-NEXT: s_cselect_b32 s5, s5, 5			; SI-NEXT: s_cselect_b32 s5, s5, 5
	; SI-NEXT: s_lshl_b32 s5, s5, 8			; SI-NEXT: s_lshl_b32 s5, s5, 8
	; SI-NEXT: s_cmp_lg_u32 s6, 12			; SI-NEXT: s_cmp_lg_u32 s6, 12
	; SI-NEXT: s_cselect_b32 s7, s11, 5			; SI-NEXT: s_cselect_b32 s7, s11, 5
	; SI-NEXT: s_and_b32 s7, s7, 0xff			; SI-NEXT: s_and_b32 s7, s7, 0xff
	; SI-NEXT: s_or_b32 s5, s7, s5			; SI-NEXT: s_or_b32 s5, s7, s5
	; SI-NEXT: s_and_b32 s5, s5, 0xffff			; SI-NEXT: s_and_b32 s5, s5, 0xffff
	; SI-NEXT: s_or_b32 s4, s5, s4			; SI-NEXT: s_or_b32 s4, s5, s4
	; SI-NEXT: s_lshr_b32 s5, s10, 24			; SI-NEXT: s_lshr_b32 s5, s10, 24
	; SI-NEXT: s_cmp_lg_u32 s6, 11			; SI-NEXT: s_cmp_lg_u32 s6, 11
	; SI-NEXT: s_cselect_b32 s5, s5, 5			; SI-NEXT: s_cselect_b32 s5, s5, 5
	; SI-NEXT: s_lshl_b32 s5, s5, 8			; SI-NEXT: s_lshl_b32 s5, s5, 24
	; SI-NEXT: s_lshr_b32 s7, s10, 16			; SI-NEXT: s_lshr_b32 s7, s10, 16
	; SI-NEXT: s_cmp_lg_u32 s6, 10			; SI-NEXT: s_cmp_lg_u32 s6, 10
	; SI-NEXT: s_cselect_b32 s7, s7, 5			; SI-NEXT: s_cselect_b32 s7, s7, 5
	; SI-NEXT: s_and_b32 s7, s7, 0xff			; SI-NEXT: s_and_b32 s7, s7, 0xff
	; SI-NEXT: s_or_b32 s5, s7, s5			; SI-NEXT: s_lshl_b32 s7, s7, 16
	; SI-NEXT: s_lshl_b32 s5, s5, 16			; SI-NEXT: s_or_b32 s5, s5, s7
	; SI-NEXT: s_lshr_b32 s7, s10, 8			; SI-NEXT: s_lshr_b32 s7, s10, 8
	; SI-NEXT: s_cmp_lg_u32 s6, 9			; SI-NEXT: s_cmp_lg_u32 s6, 9
	; SI-NEXT: s_cselect_b32 s7, s7, 5			; SI-NEXT: s_cselect_b32 s7, s7, 5
	; SI-NEXT: s_lshl_b32 s7, s7, 8			; SI-NEXT: s_lshl_b32 s7, s7, 8
	; SI-NEXT: s_cmp_lg_u32 s6, 8			; SI-NEXT: s_cmp_lg_u32 s6, 8
	; SI-NEXT: s_cselect_b32 s10, s10, 5			; SI-NEXT: s_cselect_b32 s10, s10, 5
	; SI-NEXT: s_and_b32 s10, s10, 0xff			; SI-NEXT: s_and_b32 s10, s10, 0xff
	; SI-NEXT: s_or_b32 s7, s10, s7			; SI-NEXT: s_or_b32 s7, s10, s7
	; SI-NEXT: s_and_b32 s7, s7, 0xffff			; SI-NEXT: s_and_b32 s7, s7, 0xffff
	; SI-NEXT: s_or_b32 s5, s7, s5			; SI-NEXT: s_or_b32 s5, s7, s5
	; SI-NEXT: s_lshr_b32 s7, s9, 24			; SI-NEXT: s_lshr_b32 s7, s9, 24
	; SI-NEXT: s_cmp_lg_u32 s6, 7			; SI-NEXT: s_cmp_lg_u32 s6, 7
	; SI-NEXT: s_cselect_b32 s7, s7, 5			; SI-NEXT: s_cselect_b32 s7, s7, 5
	; SI-NEXT: s_lshl_b32 s7, s7, 8			; SI-NEXT: s_lshl_b32 s7, s7, 24
	; SI-NEXT: s_lshr_b32 s10, s9, 16			; SI-NEXT: s_lshr_b32 s10, s9, 16
	; SI-NEXT: s_cmp_lg_u32 s6, 6			; SI-NEXT: s_cmp_lg_u32 s6, 6
	; SI-NEXT: s_cselect_b32 s10, s10, 5			; SI-NEXT: s_cselect_b32 s10, s10, 5
	; SI-NEXT: s_and_b32 s10, s10, 0xff			; SI-NEXT: s_and_b32 s10, s10, 0xff
	; SI-NEXT: s_or_b32 s7, s10, s7			; SI-NEXT: s_lshl_b32 s10, s10, 16
	; SI-NEXT: s_lshl_b32 s7, s7, 16			; SI-NEXT: s_or_b32 s7, s7, s10
	; SI-NEXT: s_lshr_b32 s10, s9, 8			; SI-NEXT: s_lshr_b32 s10, s9, 8
	; SI-NEXT: s_cmp_lg_u32 s6, 5			; SI-NEXT: s_cmp_lg_u32 s6, 5
	; SI-NEXT: s_cselect_b32 s10, s10, 5			; SI-NEXT: s_cselect_b32 s10, s10, 5
	; SI-NEXT: s_lshl_b32 s10, s10, 8			; SI-NEXT: s_lshl_b32 s10, s10, 8
	; SI-NEXT: s_cmp_lg_u32 s6, 4			; SI-NEXT: s_cmp_lg_u32 s6, 4
	; SI-NEXT: s_cselect_b32 s9, s9, 5			; SI-NEXT: s_cselect_b32 s9, s9, 5
	; SI-NEXT: s_and_b32 s9, s9, 0xff			; SI-NEXT: s_and_b32 s9, s9, 0xff
	; SI-NEXT: s_or_b32 s9, s9, s10			; SI-NEXT: s_or_b32 s9, s9, s10
	; SI-NEXT: s_and_b32 s9, s9, 0xffff			; SI-NEXT: s_and_b32 s9, s9, 0xffff
	; SI-NEXT: s_or_b32 s7, s9, s7			; SI-NEXT: s_or_b32 s7, s9, s7
	; SI-NEXT: s_lshr_b32 s9, s8, 24			; SI-NEXT: s_lshr_b32 s9, s8, 24
	; SI-NEXT: s_cmp_lg_u32 s6, 3			; SI-NEXT: s_cmp_lg_u32 s6, 3
	; SI-NEXT: s_cselect_b32 s9, s9, 5			; SI-NEXT: s_cselect_b32 s9, s9, 5
	; SI-NEXT: s_lshl_b32 s9, s9, 8			; SI-NEXT: s_lshl_b32 s9, s9, 24
	; SI-NEXT: s_lshr_b32 s10, s8, 16			; SI-NEXT: s_lshr_b32 s10, s8, 16
	; SI-NEXT: s_cmp_lg_u32 s6, 2			; SI-NEXT: s_cmp_lg_u32 s6, 2
	; SI-NEXT: s_cselect_b32 s10, s10, 5			; SI-NEXT: s_cselect_b32 s10, s10, 5
	; SI-NEXT: s_and_b32 s10, s10, 0xff			; SI-NEXT: s_and_b32 s10, s10, 0xff
	; SI-NEXT: s_or_b32 s9, s10, s9			; SI-NEXT: s_lshl_b32 s10, s10, 16
	; SI-NEXT: s_lshl_b32 s9, s9, 16			; SI-NEXT: s_or_b32 s9, s9, s10
	; SI-NEXT: s_lshr_b32 s10, s8, 8			; SI-NEXT: s_lshr_b32 s10, s8, 8
	; SI-NEXT: s_cmp_lg_u32 s6, 1			; SI-NEXT: s_cmp_lg_u32 s6, 1
	; SI-NEXT: s_cselect_b32 s10, s10, 5			; SI-NEXT: s_cselect_b32 s10, s10, 5
	; SI-NEXT: s_lshl_b32 s10, s10, 8			; SI-NEXT: s_lshl_b32 s10, s10, 8
	; SI-NEXT: s_cmp_lg_u32 s6, 0			; SI-NEXT: s_cmp_lg_u32 s6, 0
	; SI-NEXT: s_cselect_b32 s6, s8, 5			; SI-NEXT: s_cselect_b32 s6, s8, 5
	; SI-NEXT: s_and_b32 s6, s6, 0xff			; SI-NEXT: s_and_b32 s6, s6, 0xff
	; SI-NEXT: s_or_b32 s6, s6, s10			; SI-NEXT: s_or_b32 s6, s6, s10
	[... 500 lines hidden ...]

llvm/test/CodeGen/BPF/pr57872.ll

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=bpf-- | FileCheck %s			; RUN: llc < %s -mtriple=bpf-- | FileCheck %s
	; XFAIL: *

	%struct.event = type { i8, [84 x i8] }			%struct.event = type { i8, [84 x i8] }

	define void @foo(ptr %g) {			define void @foo(ptr %g) {
				; CHECK-LABEL: foo:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: r1 = *(u64 *)(r1 + 0)
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 83)
				; CHECK-NEXT: *(u8 *)(r10 - 4) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 82)
				; CHECK-NEXT: *(u8 *)(r10 - 5) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 81)
				; CHECK-NEXT: *(u8 *)(r10 - 6) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 80)
				; CHECK-NEXT: *(u8 *)(r10 - 7) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 79)
				; CHECK-NEXT: *(u8 *)(r10 - 8) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 78)
				; CHECK-NEXT: *(u8 *)(r10 - 9) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 77)
				; CHECK-NEXT: *(u8 *)(r10 - 10) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 76)
				; CHECK-NEXT: *(u8 *)(r10 - 11) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 75)
				; CHECK-NEXT: *(u8 *)(r10 - 12) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 74)
				; CHECK-NEXT: *(u8 *)(r10 - 13) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 73)
				; CHECK-NEXT: *(u8 *)(r10 - 14) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 72)
				; CHECK-NEXT: *(u8 *)(r10 - 15) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 71)
				; CHECK-NEXT: *(u8 *)(r10 - 16) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 70)
				; CHECK-NEXT: *(u8 *)(r10 - 17) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 69)
				; CHECK-NEXT: *(u8 *)(r10 - 18) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 68)
				; CHECK-NEXT: *(u8 *)(r10 - 19) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 67)
				; CHECK-NEXT: *(u8 *)(r10 - 20) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 66)
				; CHECK-NEXT: *(u8 *)(r10 - 21) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 65)
				; CHECK-NEXT: *(u8 *)(r10 - 22) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 64)
				; CHECK-NEXT: *(u8 *)(r10 - 23) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 63)
				; CHECK-NEXT: *(u8 *)(r10 - 24) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 62)
				; CHECK-NEXT: *(u8 *)(r10 - 25) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 61)
				; CHECK-NEXT: *(u8 *)(r10 - 26) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 60)
				; CHECK-NEXT: *(u8 *)(r10 - 27) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 59)
				; CHECK-NEXT: *(u8 *)(r10 - 28) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 58)
				; CHECK-NEXT: *(u8 *)(r10 - 29) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 57)
				; CHECK-NEXT: *(u8 *)(r10 - 30) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 56)
				; CHECK-NEXT: *(u8 *)(r10 - 31) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 55)
				; CHECK-NEXT: *(u8 *)(r10 - 32) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 54)
				; CHECK-NEXT: *(u8 *)(r10 - 33) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 53)
				; CHECK-NEXT: *(u8 *)(r10 - 34) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 52)
				; CHECK-NEXT: *(u8 *)(r10 - 35) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 51)
				; CHECK-NEXT: *(u8 *)(r10 - 36) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 50)
				; CHECK-NEXT: *(u8 *)(r10 - 37) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 49)
				; CHECK-NEXT: *(u8 *)(r10 - 38) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 48)
				; CHECK-NEXT: *(u8 *)(r10 - 39) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 47)
				; CHECK-NEXT: *(u8 *)(r10 - 40) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 46)
				; CHECK-NEXT: *(u8 *)(r10 - 41) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 45)
				; CHECK-NEXT: *(u8 *)(r10 - 42) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 44)
				; CHECK-NEXT: *(u8 *)(r10 - 43) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 43)
				; CHECK-NEXT: *(u8 *)(r10 - 44) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 42)
				; CHECK-NEXT: *(u8 *)(r10 - 45) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 41)
				; CHECK-NEXT: *(u8 *)(r10 - 46) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 40)
				; CHECK-NEXT: *(u8 *)(r10 - 47) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 39)
				; CHECK-NEXT: *(u8 *)(r10 - 48) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 38)
				; CHECK-NEXT: *(u8 *)(r10 - 49) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 37)
				; CHECK-NEXT: *(u8 *)(r10 - 50) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 36)
				; CHECK-NEXT: *(u8 *)(r10 - 51) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 35)
				; CHECK-NEXT: *(u8 *)(r10 - 52) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 34)
				; CHECK-NEXT: *(u8 *)(r10 - 53) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 33)
				; CHECK-NEXT: *(u8 *)(r10 - 54) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 32)
				; CHECK-NEXT: *(u8 *)(r10 - 55) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 31)
				; CHECK-NEXT: *(u8 *)(r10 - 56) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 30)
				; CHECK-NEXT: *(u8 *)(r10 - 57) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 29)
				; CHECK-NEXT: *(u8 *)(r10 - 58) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 28)
				; CHECK-NEXT: *(u8 *)(r10 - 59) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 27)
				; CHECK-NEXT: *(u8 *)(r10 - 60) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 26)
				; CHECK-NEXT: *(u8 *)(r10 - 61) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 25)
				; CHECK-NEXT: *(u8 *)(r10 - 62) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 24)
				; CHECK-NEXT: *(u8 *)(r10 - 63) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 23)
				; CHECK-NEXT: *(u8 *)(r10 - 64) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 22)
				; CHECK-NEXT: *(u8 *)(r10 - 65) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 21)
				; CHECK-NEXT: *(u8 *)(r10 - 66) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 20)
				; CHECK-NEXT: *(u8 *)(r10 - 67) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 19)
				; CHECK-NEXT: *(u8 *)(r10 - 68) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 18)
				; CHECK-NEXT: *(u8 *)(r10 - 69) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 17)
				; CHECK-NEXT: *(u8 *)(r10 - 70) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 16)
				; CHECK-NEXT: *(u8 *)(r10 - 71) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 15)
				; CHECK-NEXT: *(u8 *)(r10 - 72) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 14)
				; CHECK-NEXT: *(u8 *)(r10 - 73) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 13)
				; CHECK-NEXT: *(u8 *)(r10 - 74) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 12)
				; CHECK-NEXT: *(u8 *)(r10 - 75) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 11)
				; CHECK-NEXT: *(u8 *)(r10 - 76) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 10)
				; CHECK-NEXT: *(u8 *)(r10 - 77) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 9)
				; CHECK-NEXT: *(u8 *)(r10 - 78) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 8)
				; CHECK-NEXT: *(u8 *)(r10 - 79) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 7)
				; CHECK-NEXT: *(u8 *)(r10 - 80) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 6)
				; CHECK-NEXT: *(u8 *)(r10 - 81) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 5)
				; CHECK-NEXT: *(u8 *)(r10 - 82) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 4)
				; CHECK-NEXT: *(u8 *)(r10 - 83) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 3)
				; CHECK-NEXT: *(u8 *)(r10 - 84) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 2)
				; CHECK-NEXT: *(u8 *)(r10 - 85) = r2
				; CHECK-NEXT: r2 = *(u8 *)(r1 + 1)
				; CHECK-NEXT: *(u8 *)(r10 - 86) = r2
				; CHECK-NEXT: r1 = *(u8 *)(r1 + 0)
				; CHECK-NEXT: *(u8 *)(r10 - 87) = r1
				; CHECK-NEXT: r1 = r10
				; CHECK-NEXT: r1 += -88
				; CHECK-NEXT: call bar
				; CHECK-NEXT: exit
	entry:			entry:
	%event = alloca %struct.event, align 1			%event = alloca %struct.event, align 1
	%hostname = getelementptr inbounds %struct.event, ptr %event, i64 0, i32 1			%hostname = getelementptr inbounds %struct.event, ptr %event, i64 0, i32 1
	%0 = load ptr, ptr %g, align 8			%0 = load ptr, ptr %g, align 8
	call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 1 dereferenceable(84) %hostname, ptr noundef nonnull align 1 dereferenceable(84) %0, i64 84, i1 false)			call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 1 dereferenceable(84) %hostname, ptr noundef nonnull align 1 dereferenceable(84) %0, i64 84, i1 false)
	call void @bar(ptr noundef nonnull %event)			call void @bar(ptr noundef nonnull %event)
	ret void			ret void
	}			}

	declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i64, i1 immarg) #2			declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i64, i1 immarg) #2
	declare void @bar(ptr noundef)			declare void @bar(ptr noundef)

llvm/test/CodeGen/Mips/cconv/return-struct.ll

	[... 169 lines hidden ...]
	; O32-LE-NEXT: lhu $3, 4($1)			; O32-LE-NEXT: lhu $3, 4($1)
	; O32-LE-NEXT: jr $ra			; O32-LE-NEXT: jr $ra
	; O32-LE-NEXT: nop			; O32-LE-NEXT: nop
	;			;
	; N32-BE-LABEL: ret_struct_3xi16:			; N32-BE-LABEL: ret_struct_3xi16:
	; N32-BE: # %bb.0: # %entry			; N32-BE: # %bb.0: # %entry
	; N32-BE-NEXT: lui $1, %hi(struct_3xi16)			; N32-BE-NEXT: lui $1, %hi(struct_3xi16)
	; N32-BE-NEXT: lw $2, %lo(struct_3xi16)($1)			; N32-BE-NEXT: lw $2, %lo(struct_3xi16)($1)
	; N32-BE-NEXT: dsll $2, $2, 16			; N32-BE-NEXT: dsll $2, $2, 32
	; N32-BE-NEXT: addiu $1, $1, %lo(struct_3xi16)			; N32-BE-NEXT: addiu $1, $1, %lo(struct_3xi16)
	; N32-BE-NEXT: lhu $1, 4($1)			; N32-BE-NEXT: lhu $1, 4($1)
	; N32-BE-NEXT: or $1, $1, $2			; N32-BE-NEXT: dsll $1, $1, 16
	; N32-BE-NEXT: jr $ra			; N32-BE-NEXT: jr $ra
	; N32-BE-NEXT: dsll $2, $1, 16			; N32-BE-NEXT: or $2, $2, $1
	;			;
	; N32-LE-LABEL: ret_struct_3xi16:			; N32-LE-LABEL: ret_struct_3xi16:
	; N32-LE: # %bb.0: # %entry			; N32-LE: # %bb.0: # %entry
	; N32-LE-NEXT: lui $1, %hi(struct_3xi16)			; N32-LE-NEXT: lui $1, %hi(struct_3xi16)
	; N32-LE-NEXT: lwu $2, %lo(struct_3xi16)($1)			; N32-LE-NEXT: lwu $2, %lo(struct_3xi16)($1)
	; N32-LE-NEXT: addiu $1, $1, %lo(struct_3xi16)			; N32-LE-NEXT: addiu $1, $1, %lo(struct_3xi16)
	; N32-LE-NEXT: lh $1, 4($1)			; N32-LE-NEXT: lh $1, 4($1)
	; N32-LE-NEXT: dsll $1, $1, 32			; N32-LE-NEXT: dsll $1, $1, 32
	; N32-LE-NEXT: jr $ra			; N32-LE-NEXT: jr $ra
	; N32-LE-NEXT: or $2, $2, $1			; N32-LE-NEXT: or $2, $2, $1
	;			;
	; N64-BE-LABEL: ret_struct_3xi16:			; N64-BE-LABEL: ret_struct_3xi16:
	; N64-BE: # %bb.0: # %entry			; N64-BE: # %bb.0: # %entry
	; N64-BE-NEXT: lui $1, %highest(struct_3xi16)			; N64-BE-NEXT: lui $1, %highest(struct_3xi16)
	; N64-BE-NEXT: daddiu $1, $1, %higher(struct_3xi16)			; N64-BE-NEXT: daddiu $1, $1, %higher(struct_3xi16)
	; N64-BE-NEXT: dsll $1, $1, 16			; N64-BE-NEXT: dsll $1, $1, 16
	; N64-BE-NEXT: daddiu $1, $1, %hi(struct_3xi16)			; N64-BE-NEXT: daddiu $1, $1, %hi(struct_3xi16)
	; N64-BE-NEXT: dsll $1, $1, 16			; N64-BE-NEXT: dsll $1, $1, 16
	; N64-BE-NEXT: lw $2, %lo(struct_3xi16)($1)			; N64-BE-NEXT: lw $2, %lo(struct_3xi16)($1)
	; N64-BE-NEXT: dsll $2, $2, 16			; N64-BE-NEXT: dsll $2, $2, 32
	; N64-BE-NEXT: daddiu $1, $1, %lo(struct_3xi16)			; N64-BE-NEXT: daddiu $1, $1, %lo(struct_3xi16)
	; N64-BE-NEXT: lhu $1, 4($1)			; N64-BE-NEXT: lhu $1, 4($1)
	; N64-BE-NEXT: or $1, $1, $2			; N64-BE-NEXT: dsll $1, $1, 16
	; N64-BE-NEXT: jr $ra			; N64-BE-NEXT: jr $ra
	; N64-BE-NEXT: dsll $2, $1, 16			; N64-BE-NEXT: or $2, $2, $1
	;			;
	; N64-LE-LABEL: ret_struct_3xi16:			; N64-LE-LABEL: ret_struct_3xi16:
	; N64-LE: # %bb.0: # %entry			; N64-LE: # %bb.0: # %entry
	; N64-LE-NEXT: lui $1, %highest(struct_3xi16)			; N64-LE-NEXT: lui $1, %highest(struct_3xi16)
	; N64-LE-NEXT: daddiu $1, $1, %higher(struct_3xi16)			; N64-LE-NEXT: daddiu $1, $1, %higher(struct_3xi16)
	; N64-LE-NEXT: dsll $1, $1, 16			; N64-LE-NEXT: dsll $1, $1, 16
	; N64-LE-NEXT: daddiu $1, $1, %hi(struct_3xi16)			; N64-LE-NEXT: daddiu $1, $1, %hi(struct_3xi16)
	; N64-LE-NEXT: dsll $1, $1, 16			; N64-LE-NEXT: dsll $1, $1, 16
	[... 148 lines hidden ...]

llvm/test/CodeGen/Mips/cconv/vector.ll

	[... 463 lines hidden ...]
	}			}

	define <4 x i8> @i8_4(<4 x i8> %a, <4 x i8> %b) {			define <4 x i8> @i8_4(<4 x i8> %a, <4 x i8> %b) {
	; MIPS32-LABEL: i8_4:			; MIPS32-LABEL: i8_4:
	; MIPS32: # %bb.0:			; MIPS32: # %bb.0:
	; MIPS32-NEXT: srl $1, $5, 24			; MIPS32-NEXT: srl $1, $5, 24
	; MIPS32-NEXT: srl $2, $4, 24			; MIPS32-NEXT: srl $2, $4, 24
	; MIPS32-NEXT: addu $1, $2, $1			; MIPS32-NEXT: addu $1, $2, $1
	; MIPS32-NEXT: sll $1, $1, 8
	; MIPS32-NEXT: srl $2, $5, 16
	; MIPS32-NEXT: srl $3, $4, 16
	; MIPS32-NEXT: addu $2, $3, $2
	; MIPS32-NEXT: andi $2, $2, 255
	; MIPS32-NEXT: or $1, $2, $1
	; MIPS32-NEXT: addu $2, $4, $5			; MIPS32-NEXT: addu $2, $4, $5
	; MIPS32-NEXT: sll $1, $1, 16			; MIPS32-NEXT: sll $1, $1, 24
				; MIPS32-NEXT: srl $3, $5, 16
				; MIPS32-NEXT: srl $6, $4, 16
				; MIPS32-NEXT: addu $3, $6, $3
				; MIPS32-NEXT: andi $3, $3, 255
				; MIPS32-NEXT: sll $3, $3, 16
				; MIPS32-NEXT: or $1, $1, $3
	; MIPS32-NEXT: andi $2, $2, 255			; MIPS32-NEXT: andi $2, $2, 255
	; MIPS32-NEXT: srl $3, $5, 8			; MIPS32-NEXT: srl $3, $5, 8
	; MIPS32-NEXT: srl $4, $4, 8			; MIPS32-NEXT: srl $4, $4, 8
	; MIPS32-NEXT: addu $3, $4, $3			; MIPS32-NEXT: addu $3, $4, $3
	; MIPS32-NEXT: sll $3, $3, 8			; MIPS32-NEXT: sll $3, $3, 8
	; MIPS32-NEXT: or $2, $2, $3			; MIPS32-NEXT: or $2, $2, $3
	; MIPS32-NEXT: andi $2, $2, 65535			; MIPS32-NEXT: andi $2, $2, 65535
	; MIPS32-NEXT: or $2, $2, $1			; MIPS32-NEXT: or $2, $2, $1
	; MIPS32-NEXT: jr $ra			; MIPS32-NEXT: jr $ra
	; MIPS32-NEXT: nop			; MIPS32-NEXT: nop
	;			;
	; MIPS64-LABEL: i8_4:			; MIPS64-LABEL: i8_4:
	; MIPS64: # %bb.0:			; MIPS64: # %bb.0:
	; MIPS64-NEXT: sll $1, $5, 0			; MIPS64-NEXT: sll $1, $5, 0
	; MIPS64-NEXT: srl $2, $1, 24			; MIPS64-NEXT: srl $2, $1, 24
	; MIPS64-NEXT: sll $3, $4, 0			; MIPS64-NEXT: sll $3, $4, 0
	; MIPS64-NEXT: srl $4, $3, 24			; MIPS64-NEXT: srl $4, $3, 24
	; MIPS64-NEXT: addu $2, $4, $2			; MIPS64-NEXT: addu $2, $4, $2
	; MIPS64-NEXT: sll $2, $2, 8
	; MIPS64-NEXT: srl $4, $1, 16
	; MIPS64-NEXT: srl $5, $3, 16
	; MIPS64-NEXT: addu $4, $5, $4
	; MIPS64-NEXT: andi $4, $4, 255
	; MIPS64-NEXT: or $2, $4, $2
	; MIPS64-NEXT: addu $4, $3, $1			; MIPS64-NEXT: addu $4, $3, $1
	; MIPS64-NEXT: sll $2, $2, 16			; MIPS64-NEXT: sll $2, $2, 24
				; MIPS64-NEXT: srl $5, $1, 16
				; MIPS64-NEXT: srl $6, $3, 16
				; MIPS64-NEXT: addu $5, $6, $5
				; MIPS64-NEXT: andi $5, $5, 255
				; MIPS64-NEXT: sll $5, $5, 16
				; MIPS64-NEXT: or $2, $2, $5
	; MIPS64-NEXT: andi $4, $4, 255			; MIPS64-NEXT: andi $4, $4, 255
	; MIPS64-NEXT: srl $1, $1, 8			; MIPS64-NEXT: srl $1, $1, 8
	; MIPS64-NEXT: srl $3, $3, 8			; MIPS64-NEXT: srl $3, $3, 8
	; MIPS64-NEXT: addu $1, $3, $1			; MIPS64-NEXT: addu $1, $3, $1
	; MIPS64-NEXT: sll $1, $1, 8			; MIPS64-NEXT: sll $1, $1, 8
	; MIPS64-NEXT: or $1, $4, $1			; MIPS64-NEXT: or $1, $4, $1
	; MIPS64-NEXT: andi $1, $1, 65535			; MIPS64-NEXT: andi $1, $1, 65535
	; MIPS64-NEXT: or $2, $1, $2			; MIPS64-NEXT: or $2, $1, $2
	[... 73 lines hidden ...]
	; MIPS64R5-NEXT: nop			; MIPS64R5-NEXT: nop
	%1 = add <4 x i8> %a, %b			%1 = add <4 x i8> %a, %b
	ret <4 x i8> %1			ret <4 x i8> %1
	}			}

	define <8 x i8> @i8_8(<8 x i8> %a, <8 x i8> %b) {			define <8 x i8> @i8_8(<8 x i8> %a, <8 x i8> %b) {
	; MIPS32-LABEL: i8_8:			; MIPS32-LABEL: i8_8:
	; MIPS32: # %bb.0:			; MIPS32: # %bb.0:
	; MIPS32-NEXT: srl $1, $6, 24			; MIPS32-NEXT: addu $1, $4, $6
	; MIPS32-NEXT: srl $2, $4, 24			; MIPS32-NEXT: srl $2, $6, 24
	; MIPS32-NEXT: addu $1, $2, $1			; MIPS32-NEXT: srl $3, $4, 24
	; MIPS32-NEXT: sll $1, $1, 8
	; MIPS32-NEXT: srl $2, $6, 16
	; MIPS32-NEXT: srl $3, $4, 16
	; MIPS32-NEXT: addu $2, $3, $2			; MIPS32-NEXT: addu $2, $3, $2
	; MIPS32-NEXT: andi $2, $2, 255			; MIPS32-NEXT: andi $1, $1, 255
	; MIPS32-NEXT: srl $3, $7, 24
	; MIPS32-NEXT: srl $8, $5, 24
	; MIPS32-NEXT: or $1, $2, $1
	; MIPS32-NEXT: addu $2, $8, $3
	; MIPS32-NEXT: addu $3, $4, $6
	; MIPS32-NEXT: sll $2, $2, 8
	; MIPS32-NEXT: srl $8, $7, 16
	; MIPS32-NEXT: srl $9, $5, 16
	; MIPS32-NEXT: addu $8, $9, $8
	; MIPS32-NEXT: andi $8, $8, 255
	; MIPS32-NEXT: or $8, $8, $2
	; MIPS32-NEXT: sll $1, $1, 16
	; MIPS32-NEXT: andi $2, $3, 255
	; MIPS32-NEXT: srl $3, $6, 8			; MIPS32-NEXT: srl $3, $6, 8
	; MIPS32-NEXT: srl $4, $4, 8			; MIPS32-NEXT: srl $8, $4, 8
	; MIPS32-NEXT: addu $3, $4, $3			; MIPS32-NEXT: addu $3, $8, $3
	; MIPS32-NEXT: sll $3, $3, 8			; MIPS32-NEXT: sll $3, $3, 8
	; MIPS32-NEXT: or $2, $2, $3			; MIPS32-NEXT: srl $6, $6, 16
	; MIPS32-NEXT: andi $2, $2, 65535			; MIPS32-NEXT: srl $4, $4, 16
	; MIPS32-NEXT: addu $3, $5, $7			; MIPS32-NEXT: or $1, $1, $3
	; MIPS32-NEXT: or $2, $2, $1			; MIPS32-NEXT: sll $2, $2, 24
	; MIPS32-NEXT: sll $1, $8, 16			; MIPS32-NEXT: addu $3, $4, $6
	; MIPS32-NEXT: andi $3, $3, 255			; MIPS32-NEXT: andi $3, $3, 255
				; MIPS32-NEXT: sll $3, $3, 16
				; MIPS32-NEXT: srl $4, $7, 24
				; MIPS32-NEXT: srl $6, $5, 24
				; MIPS32-NEXT: or $2, $2, $3
				; MIPS32-NEXT: andi $1, $1, 65535
				; MIPS32-NEXT: addu $3, $6, $4
				; MIPS32-NEXT: addu $4, $5, $7
				; MIPS32-NEXT: sll $3, $3, 24
				; MIPS32-NEXT: srl $6, $7, 16
				; MIPS32-NEXT: srl $8, $5, 16
				; MIPS32-NEXT: addu $6, $8, $6
				; MIPS32-NEXT: andi $6, $6, 255
				; MIPS32-NEXT: sll $6, $6, 16
				; MIPS32-NEXT: or $2, $1, $2
				; MIPS32-NEXT: or $1, $3, $6
				; MIPS32-NEXT: andi $3, $4, 255
	; MIPS32-NEXT: srl $4, $7, 8			; MIPS32-NEXT: srl $4, $7, 8
	; MIPS32-NEXT: srl $5, $5, 8			; MIPS32-NEXT: srl $5, $5, 8
	; MIPS32-NEXT: addu $4, $5, $4			; MIPS32-NEXT: addu $4, $5, $4
	; MIPS32-NEXT: sll $4, $4, 8			; MIPS32-NEXT: sll $4, $4, 8
	; MIPS32-NEXT: or $3, $3, $4			; MIPS32-NEXT: or $3, $3, $4
	; MIPS32-NEXT: andi $3, $3, 65535			; MIPS32-NEXT: andi $3, $3, 65535
	; MIPS32-NEXT: or $3, $3, $1			; MIPS32-NEXT: or $3, $3, $1
	; MIPS32-NEXT: jr $ra			; MIPS32-NEXT: jr $ra
	; MIPS32-NEXT: nop			; MIPS32-NEXT: nop
	;			;
	; MIPS64-LABEL: i8_8:			; MIPS64-LABEL: i8_8:
	; MIPS64: # %bb.0:			; MIPS64: # %bb.0:
	; MIPS64-NEXT: dsrl $1, $5, 56			; MIPS64-NEXT: dsrl $1, $5, 48
	; MIPS64-NEXT: sll $1, $1, 0			; MIPS64-NEXT: sll $1, $1, 0
	; MIPS64-NEXT: dsrl $2, $4, 56			; MIPS64-NEXT: dsrl $2, $4, 48
	; MIPS64-NEXT: sll $2, $2, 0			; MIPS64-NEXT: sll $2, $2, 0
	; MIPS64-NEXT: addu $1, $2, $1			; MIPS64-NEXT: addu $1, $2, $1
	; MIPS64-NEXT: dsrl $2, $5, 48			; MIPS64-NEXT: dsrl $2, $5, 56
	; MIPS64-NEXT: sll $1, $1, 8			; MIPS64-NEXT: andi $1, $1, 255
	; MIPS64-NEXT: sll $2, $2, 0			; MIPS64-NEXT: sll $2, $2, 0
	; MIPS64-NEXT: dsrl $3, $4, 48			; MIPS64-NEXT: dsrl $3, $4, 56
	; MIPS64-NEXT: sll $3, $3, 0			; MIPS64-NEXT: sll $3, $3, 0
	; MIPS64-NEXT: addu $2, $3, $2			; MIPS64-NEXT: addu $2, $3, $2
	; MIPS64-NEXT: andi $2, $2, 255
	; MIPS64-NEXT: dsrl $3, $5, 40			; MIPS64-NEXT: dsrl $3, $5, 40
	; MIPS64-NEXT: or $1, $2, $1			; MIPS64-NEXT: sll $2, $2, 24
	; MIPS64-NEXT: sll $2, $5, 0			; MIPS64-NEXT: sll $1, $1, 16
	; MIPS64-NEXT: sll $3, $3, 0			; MIPS64-NEXT: sll $3, $3, 0
	; MIPS64-NEXT: dsrl $6, $4, 40			; MIPS64-NEXT: dsrl $6, $4, 40
	; MIPS64-NEXT: sll $6, $6, 0			; MIPS64-NEXT: sll $6, $6, 0
	; MIPS64-NEXT: addu $3, $6, $3			; MIPS64-NEXT: addu $3, $6, $3
	; MIPS64-NEXT: dsrl $5, $5, 32			; MIPS64-NEXT: dsrl $6, $5, 32
	; MIPS64-NEXT: srl $6, $2, 24
	; MIPS64-NEXT: sll $7, $4, 0			; MIPS64-NEXT: sll $7, $4, 0
	; MIPS64-NEXT: srl $8, $7, 24
	; MIPS64-NEXT: addu $6, $8, $6
	; MIPS64-NEXT: sll $1, $1, 16
	; MIPS64-NEXT: sll $3, $3, 8
	; MIPS64-NEXT: sll $5, $5, 0			; MIPS64-NEXT: sll $5, $5, 0
				; MIPS64-NEXT: srl $8, $5, 24
				; MIPS64-NEXT: srl $9, $7, 24
				; MIPS64-NEXT: or $1, $2, $1
				; MIPS64-NEXT: sll $2, $3, 8
				; MIPS64-NEXT: sll $3, $6, 0
	; MIPS64-NEXT: dsrl $4, $4, 32			; MIPS64-NEXT: dsrl $4, $4, 32
	; MIPS64-NEXT: sll $4, $4, 0			; MIPS64-NEXT: sll $4, $4, 0
	; MIPS64-NEXT: addu $4, $4, $5			; MIPS64-NEXT: addu $3, $4, $3
	; MIPS64-NEXT: andi $4, $4, 255			; MIPS64-NEXT: andi $3, $3, 255
	; MIPS64-NEXT: or $3, $4, $3			; MIPS64-NEXT: or $2, $3, $2
	; MIPS64-NEXT: andi $3, $3, 65535			; MIPS64-NEXT: andi $2, $2, 65535
	; MIPS64-NEXT: or $1, $3, $1			; MIPS64-NEXT: or $1, $2, $1
	; MIPS64-NEXT: sll $3, $6, 8			; MIPS64-NEXT: addu $2, $9, $8
	; MIPS64-NEXT: srl $4, $2, 16			; MIPS64-NEXT: addu $3, $7, $5
	; MIPS64-NEXT: srl $5, $7, 16			; MIPS64-NEXT: sll $2, $2, 24
	; MIPS64-NEXT: addu $4, $5, $4			; MIPS64-NEXT: srl $4, $5, 16
				; MIPS64-NEXT: srl $6, $7, 16
				; MIPS64-NEXT: addu $4, $6, $4
	; MIPS64-NEXT: andi $4, $4, 255			; MIPS64-NEXT: andi $4, $4, 255
	; MIPS64-NEXT: or $3, $4, $3			; MIPS64-NEXT: sll $4, $4, 16
	; MIPS64-NEXT: addu $4, $7, $2
	; MIPS64-NEXT: dsll $1, $1, 32			; MIPS64-NEXT: dsll $1, $1, 32
	; MIPS64-NEXT: sll $3, $3, 16			; MIPS64-NEXT: or $2, $2, $4
	; MIPS64-NEXT: andi $4, $4, 255			; MIPS64-NEXT: andi $3, $3, 255
	; MIPS64-NEXT: srl $2, $2, 8			; MIPS64-NEXT: srl $4, $5, 8
	; MIPS64-NEXT: srl $5, $7, 8			; MIPS64-NEXT: srl $5, $7, 8
	; MIPS64-NEXT: addu $2, $5, $2			; MIPS64-NEXT: addu $4, $5, $4
	; MIPS64-NEXT: sll $2, $2, 8			; MIPS64-NEXT: sll $4, $4, 8
	; MIPS64-NEXT: or $2, $4, $2			; MIPS64-NEXT: or $3, $3, $4
	; MIPS64-NEXT: andi $2, $2, 65535			; MIPS64-NEXT: andi $3, $3, 65535
	; MIPS64-NEXT: or $2, $2, $3			; MIPS64-NEXT: or $2, $3, $2
	; MIPS64-NEXT: dsll $2, $2, 32			; MIPS64-NEXT: dsll $2, $2, 32
	; MIPS64-NEXT: dsrl $2, $2, 32			; MIPS64-NEXT: dsrl $2, $2, 32
	; MIPS64-NEXT: or $2, $2, $1			; MIPS64-NEXT: or $2, $2, $1
	; MIPS64-NEXT: jr $ra			; MIPS64-NEXT: jr $ra
	; MIPS64-NEXT: nop			; MIPS64-NEXT: nop
	;			;
	; MIPS32R5EB-LABEL: i8_8:			; MIPS32R5EB-LABEL: i8_8:
	; MIPS32R5EB: # %bb.0:			; MIPS32R5EB: # %bb.0:
	[214 lines not shown]
	; MIPS32R5EL-NEXT: nop			; MIPS32R5EL-NEXT: nop
	%1 = add <8 x i8> %a, %b			%1 = add <8 x i8> %a, %b
	ret <8 x i8> %1			ret <8 x i8> %1
	}			}

	define <16 x i8> @i8_16(<16 x i8> %a, <16 x i8> %b) {			define <16 x i8> @i8_16(<16 x i8> %a, <16 x i8> %b) {
	; MIPS32-LABEL: i8_16:			; MIPS32-LABEL: i8_16:
	; MIPS32: # %bb.0:			; MIPS32: # %bb.0:
	; MIPS32-NEXT: lw $1, 24($sp)			; MIPS32-NEXT: lw $1, 16($sp)
	; MIPS32-NEXT: srl $2, $1, 24			; MIPS32-NEXT: lw $2, 20($sp)
	; MIPS32-NEXT: srl $3, $6, 24			; MIPS32-NEXT: lw $3, 24($sp)
	; MIPS32-NEXT: srl $8, $1, 16			; MIPS32-NEXT: srl $8, $3, 8
	; MIPS32-NEXT: srl $9, $6, 16			; MIPS32-NEXT: srl $9, $6, 8
	; MIPS32-NEXT: srl $10, $1, 8			; MIPS32-NEXT: srl $10, $2, 16
	; MIPS32-NEXT: srl $11, $6, 8			; MIPS32-NEXT: srl $11, $5, 16
	; MIPS32-NEXT: lw $12, 20($sp)			; MIPS32-NEXT: srl $12, $1, 16
	; MIPS32-NEXT: srl $13, $12, 8
	; MIPS32-NEXT: srl $14, $5, 8
	; MIPS32-NEXT: addu $13, $14, $13
	; MIPS32-NEXT: addu $14, $5, $12
	; MIPS32-NEXT: addu $10, $11, $10
	; MIPS32-NEXT: addu $1, $6, $1
	; MIPS32-NEXT: addu $6, $9, $8
	; MIPS32-NEXT: addu $2, $3, $2
	; MIPS32-NEXT: srl $3, $12, 24
	; MIPS32-NEXT: srl $8, $5, 24
	; MIPS32-NEXT: srl $9, $12, 16
	; MIPS32-NEXT: srl $5, $5, 16
	; MIPS32-NEXT: addu $5, $5, $9
	; MIPS32-NEXT: addu $3, $8, $3
	; MIPS32-NEXT: sll $2, $2, 8
	; MIPS32-NEXT: andi $6, $6, 255
	; MIPS32-NEXT: andi $1, $1, 255
	; MIPS32-NEXT: sll $8, $10, 8
	; MIPS32-NEXT: andi $9, $14, 255
	; MIPS32-NEXT: sll $10, $13, 8
	; MIPS32-NEXT: lw $11, 28($sp)
	; MIPS32-NEXT: lw $12, 16($sp)
	; MIPS32-NEXT: srl $13, $12, 24
	; MIPS32-NEXT: srl $14, $4, 24
	; MIPS32-NEXT: srl $15, $11, 24
	; MIPS32-NEXT: srl $24, $7, 24
	; MIPS32-NEXT: or $9, $9, $10
	; MIPS32-NEXT: or $1, $1, $8
	; MIPS32-NEXT: or $2, $6, $2
	; MIPS32-NEXT: addu $6, $24, $15
	; MIPS32-NEXT: sll $3, $3, 8
	; MIPS32-NEXT: andi $5, $5, 255
	; MIPS32-NEXT: addu $8, $14, $13
	; MIPS32-NEXT: sll $8, $8, 8
	; MIPS32-NEXT: srl $10, $12, 16
	; MIPS32-NEXT: srl $13, $4, 16			; MIPS32-NEXT: srl $13, $4, 16
	; MIPS32-NEXT: addu $10, $13, $10			; MIPS32-NEXT: srl $14, $1, 8
	; MIPS32-NEXT: andi $10, $10, 255			; MIPS32-NEXT: srl $15, $4, 8
	; MIPS32-NEXT: or $8, $10, $8			; MIPS32-NEXT: addu $24, $6, $3
	; MIPS32-NEXT: or $3, $5, $3			; MIPS32-NEXT: addu $14, $15, $14
	; MIPS32-NEXT: addu $5, $4, $12			; MIPS32-NEXT: addu $15, $4, $1
	; MIPS32-NEXT: sll $6, $6, 8			; MIPS32-NEXT: addu $12, $13, $12
	; MIPS32-NEXT: srl $10, $11, 16			; MIPS32-NEXT: addu $10, $11, $10
	; MIPS32-NEXT: srl $13, $7, 16			; MIPS32-NEXT: srl $11, $2, 24
	; MIPS32-NEXT: addu $10, $13, $10			; MIPS32-NEXT: addu $13, $5, $2
				; MIPS32-NEXT: addu $8, $9, $8
				; MIPS32-NEXT: srl $1, $1, 24
				; MIPS32-NEXT: srl $4, $4, 24
				; MIPS32-NEXT: srl $9, $5, 24
				; MIPS32-NEXT: srl $25, $3, 24
				; MIPS32-NEXT: srl $gp, $6, 24
				; MIPS32-NEXT: addu $25, $gp, $25
	; MIPS32-NEXT: andi $10, $10, 255			; MIPS32-NEXT: andi $10, $10, 255
	; MIPS32-NEXT: or $6, $10, $6			; MIPS32-NEXT: addu $9, $9, $11
	; MIPS32-NEXT: sll $10, $2, 16			; MIPS32-NEXT: andi $11, $12, 255
	; MIPS32-NEXT: andi $1, $1, 65535			; MIPS32-NEXT: addu $1, $4, $1
				; MIPS32-NEXT: andi $4, $15, 255
				; MIPS32-NEXT: sll $12, $14, 8
				; MIPS32-NEXT: andi $14, $24, 255
				; MIPS32-NEXT: sll $8, $8, 8
				; MIPS32-NEXT: andi $13, $13, 255
				; MIPS32-NEXT: srl $2, $2, 8
				; MIPS32-NEXT: srl $5, $5, 8
				; MIPS32-NEXT: addu $2, $5, $2
				; MIPS32-NEXT: sll $2, $2, 8
				; MIPS32-NEXT: srl $3, $3, 16
				; MIPS32-NEXT: srl $5, $6, 16
				; MIPS32-NEXT: or $2, $13, $2
				; MIPS32-NEXT: or $6, $14, $8
				; MIPS32-NEXT: or $4, $4, $12
				; MIPS32-NEXT: sll $1, $1, 24
				; MIPS32-NEXT: sll $8, $11, 16
				; MIPS32-NEXT: sll $9, $9, 24
				; MIPS32-NEXT: sll $10, $10, 16
				; MIPS32-NEXT: sll $11, $25, 24
				; MIPS32-NEXT: addu $3, $5, $3
				; MIPS32-NEXT: andi $3, $3, 255
	; MIPS32-NEXT: sll $3, $3, 16			; MIPS32-NEXT: sll $3, $3, 16
	; MIPS32-NEXT: andi $9, $9, 65535			; MIPS32-NEXT: lw $5, 28($sp)
	; MIPS32-NEXT: sll $2, $8, 16			; MIPS32-NEXT: srl $12, $5, 24
	; MIPS32-NEXT: andi $5, $5, 255			; MIPS32-NEXT: srl $13, $7, 24
	; MIPS32-NEXT: srl $8, $12, 8			; MIPS32-NEXT: or $11, $11, $3
	; MIPS32-NEXT: srl $4, $4, 8			; MIPS32-NEXT: or $3, $9, $10
	; MIPS32-NEXT: addu $4, $4, $8			; MIPS32-NEXT: or $1, $1, $8
	; MIPS32-NEXT: sll $4, $4, 8
	; MIPS32-NEXT: or $4, $5, $4
	; MIPS32-NEXT: andi $4, $4, 65535			; MIPS32-NEXT: andi $4, $4, 65535
	; MIPS32-NEXT: addu $5, $7, $11			; MIPS32-NEXT: addu $8, $13, $12
	; MIPS32-NEXT: or $2, $4, $2			; MIPS32-NEXT: andi $6, $6, 65535
				; MIPS32-NEXT: andi $9, $2, 65535
				; MIPS32-NEXT: addu $10, $7, $5
				; MIPS32-NEXT: sll $8, $8, 24
				; MIPS32-NEXT: srl $2, $5, 16
				; MIPS32-NEXT: srl $12, $7, 16
				; MIPS32-NEXT: addu $2, $12, $2
				; MIPS32-NEXT: andi $2, $2, 255
				; MIPS32-NEXT: sll $12, $2, 16
				; MIPS32-NEXT: or $2, $4, $1
	; MIPS32-NEXT: or $3, $9, $3			; MIPS32-NEXT: or $3, $9, $3
	; MIPS32-NEXT: or $4, $1, $10			; MIPS32-NEXT: or $4, $6, $11
	; MIPS32-NEXT: sll $1, $6, 16			; MIPS32-NEXT: or $1, $8, $12
	; MIPS32-NEXT: andi $5, $5, 255			; MIPS32-NEXT: andi $6, $10, 255
	; MIPS32-NEXT: srl $6, $11, 8			; MIPS32-NEXT: srl $5, $5, 8
	; MIPS32-NEXT: srl $7, $7, 8			; MIPS32-NEXT: srl $7, $7, 8
	; MIPS32-NEXT: addu $6, $7, $6			; MIPS32-NEXT: addu $5, $7, $5
	; MIPS32-NEXT: sll $6, $6, 8			; MIPS32-NEXT: sll $5, $5, 8
	; MIPS32-NEXT: or $5, $5, $6			; MIPS32-NEXT: or $5, $6, $5
	; MIPS32-NEXT: andi $5, $5, 65535			; MIPS32-NEXT: andi $5, $5, 65535
	; MIPS32-NEXT: or $5, $5, $1			; MIPS32-NEXT: or $5, $5, $1
	; MIPS32-NEXT: jr $ra			; MIPS32-NEXT: jr $ra
	; MIPS32-NEXT: nop			; MIPS32-NEXT: nop
	;			;
	; MIPS64-LABEL: i8_16:			; MIPS64-LABEL: i8_16:
	; MIPS64: # %bb.0:			; MIPS64: # %bb.0:
	; MIPS64-NEXT: dsrl $1, $7, 56			; MIPS64-NEXT: sll $1, $6, 0
	; MIPS64-NEXT: dsrl $2, $5, 56			; MIPS64-NEXT: dsrl $2, $6, 56
	; MIPS64-NEXT: dsrl $3, $7, 48			; MIPS64-NEXT: dsrl $3, $6, 48
	; MIPS64-NEXT: dsrl $8, $5, 48			; MIPS64-NEXT: dsrl $8, $4, 48
	; MIPS64-NEXT: dsrl $9, $6, 56			; MIPS64-NEXT: srl $9, $1, 16
	; MIPS64-NEXT: dsrl $10, $4, 56			; MIPS64-NEXT: sll $10, $4, 0
	; MIPS64-NEXT: dsrl $11, $7, 32			; MIPS64-NEXT: srl $11, $10, 16
	; MIPS64-NEXT: sll $1, $1, 0			; MIPS64-NEXT: dsrl $12, $7, 56
				; MIPS64-NEXT: addu $13, $10, $1
				; MIPS64-NEXT: addu $9, $11, $9
	; MIPS64-NEXT: sll $2, $2, 0			; MIPS64-NEXT: sll $2, $2, 0
				; MIPS64-NEXT: dsrl $11, $7, 48
				; MIPS64-NEXT: srl $14, $1, 8
				; MIPS64-NEXT: srl $15, $10, 8
				; MIPS64-NEXT: addu $14, $15, $14
				; MIPS64-NEXT: dsrl $15, $4, 56
				; MIPS64-NEXT: dsrl $24, $7, 40
	; MIPS64-NEXT: sll $3, $3, 0			; MIPS64-NEXT: sll $3, $3, 0
	; MIPS64-NEXT: sll $8, $8, 0			; MIPS64-NEXT: sll $8, $8, 0
	; MIPS64-NEXT: dsrl $12, $7, 40			; MIPS64-NEXT: sll $15, $15, 0
	; MIPS64-NEXT: sll $12, $12, 0
	; MIPS64-NEXT: dsrl $13, $5, 40
	; MIPS64-NEXT: sll $13, $13, 0
	; MIPS64-NEXT: addu $12, $13, $12
	; MIPS64-NEXT: addu $3, $8, $3
	; MIPS64-NEXT: addu $1, $2, $1
	; MIPS64-NEXT: sll $2, $9, 0
	; MIPS64-NEXT: sll $8, $10, 0
	; MIPS64-NEXT: dsrl $9, $6, 48
	; MIPS64-NEXT: sll $9, $9, 0
	; MIPS64-NEXT: dsrl $10, $4, 48
	; MIPS64-NEXT: sll $10, $10, 0
	; MIPS64-NEXT: addu $9, $10, $9
	; MIPS64-NEXT: addu $2, $8, $2
	; MIPS64-NEXT: sll $8, $1, 8
	; MIPS64-NEXT: andi $3, $3, 255
	; MIPS64-NEXT: sll $1, $12, 8
	; MIPS64-NEXT: sll $10, $11, 0
	; MIPS64-NEXT: dsrl $11, $5, 32
	; MIPS64-NEXT: sll $11, $11, 0
	; MIPS64-NEXT: addu $10, $11, $10
	; MIPS64-NEXT: andi $10, $10, 255
	; MIPS64-NEXT: or $10, $10, $1
	; MIPS64-NEXT: sll $1, $6, 0
	; MIPS64-NEXT: or $8, $3, $8
	; MIPS64-NEXT: sll $2, $2, 8
	; MIPS64-NEXT: andi $9, $9, 255			; MIPS64-NEXT: andi $9, $9, 255
	; MIPS64-NEXT: dsrl $11, $6, 40			; MIPS64-NEXT: addu $2, $15, $2
	; MIPS64-NEXT: srl $3, $1, 24			; MIPS64-NEXT: andi $13, $13, 255
	; MIPS64-NEXT: sll $12, $4, 0			; MIPS64-NEXT: sll $14, $14, 8
	; MIPS64-NEXT: srl $13, $12, 24			; MIPS64-NEXT: addu $3, $8, $3
	; MIPS64-NEXT: srl $14, $1, 16
	; MIPS64-NEXT: srl $15, $12, 16
	; MIPS64-NEXT: andi $10, $10, 65535
	; MIPS64-NEXT: addu $14, $15, $14
	; MIPS64-NEXT: addu $13, $13, $3
	; MIPS64-NEXT: sll $3, $7, 0
	; MIPS64-NEXT: or $2, $9, $2
	; MIPS64-NEXT: sll $7, $8, 16
	; MIPS64-NEXT: sll $8, $11, 0			; MIPS64-NEXT: sll $8, $11, 0
	; MIPS64-NEXT: dsrl $9, $4, 40			; MIPS64-NEXT: srl $1, $1, 24
	; MIPS64-NEXT: sll $9, $9, 0			; MIPS64-NEXT: sll $11, $12, 0
	; MIPS64-NEXT: addu $8, $9, $8			; MIPS64-NEXT: dsrl $12, $5, 56
				; MIPS64-NEXT: dsrl $15, $5, 48
				; MIPS64-NEXT: andi $3, $3, 255
				; MIPS64-NEXT: dsrl $25, $6, 40
				; MIPS64-NEXT: sll $15, $15, 0
				; MIPS64-NEXT: srl $10, $10, 24
				; MIPS64-NEXT: sll $12, $12, 0
				; MIPS64-NEXT: or $13, $13, $14
				; MIPS64-NEXT: sll $14, $24, 0
				; MIPS64-NEXT: sll $2, $2, 24
				; MIPS64-NEXT: addu $11, $12, $11
				; MIPS64-NEXT: sll $9, $9, 16
				; MIPS64-NEXT: addu $1, $10, $1
				; MIPS64-NEXT: addu $8, $15, $8
				; MIPS64-NEXT: sll $10, $25, 0
				; MIPS64-NEXT: dsrl $12, $4, 40
				; MIPS64-NEXT: sll $12, $12, 0
				; MIPS64-NEXT: addu $10, $12, $10
				; MIPS64-NEXT: sll $3, $3, 16
				; MIPS64-NEXT: andi $8, $8, 255
				; MIPS64-NEXT: sll $1, $1, 24
				; MIPS64-NEXT: dsrl $12, $5, 40
				; MIPS64-NEXT: sll $12, $12, 0
	; MIPS64-NEXT: dsrl $6, $6, 32			; MIPS64-NEXT: dsrl $6, $6, 32
	; MIPS64-NEXT: srl $9, $3, 24			; MIPS64-NEXT: or $1, $1, $9
	; MIPS64-NEXT: sll $5, $5, 0			; MIPS64-NEXT: addu $9, $12, $14
	; MIPS64-NEXT: srl $11, $5, 24			; MIPS64-NEXT: sll $11, $11, 24
	; MIPS64-NEXT: or $7, $10, $7			; MIPS64-NEXT: sll $8, $8, 16
	; MIPS64-NEXT: addu $9, $11, $9			; MIPS64-NEXT: dsrl $12, $7, 32
	; MIPS64-NEXT: sll $10, $13, 8			; MIPS64-NEXT: andi $13, $13, 65535
	; MIPS64-NEXT: andi $11, $14, 255			; MIPS64-NEXT: or $2, $2, $3
	; MIPS64-NEXT: sll $2, $2, 16			; MIPS64-NEXT: sll $3, $10, 8
	; MIPS64-NEXT: sll $8, $8, 8
	; MIPS64-NEXT: sll $6, $6, 0			; MIPS64-NEXT: sll $6, $6, 0
	; MIPS64-NEXT: dsrl $4, $4, 32			; MIPS64-NEXT: dsrl $4, $4, 32
	; MIPS64-NEXT: sll $4, $4, 0			; MIPS64-NEXT: sll $4, $4, 0
	; MIPS64-NEXT: addu $4, $4, $6			; MIPS64-NEXT: addu $4, $4, $6
	; MIPS64-NEXT: andi $4, $4, 255			; MIPS64-NEXT: andi $4, $4, 255
	; MIPS64-NEXT: or $4, $4, $8			; MIPS64-NEXT: or $3, $4, $3
	; MIPS64-NEXT: andi $4, $4, 65535			; MIPS64-NEXT: andi $3, $3, 65535
	; MIPS64-NEXT: or $2, $4, $2			; MIPS64-NEXT: or $2, $3, $2
	; MIPS64-NEXT: or $4, $11, $10			; MIPS64-NEXT: or $1, $13, $1
	; MIPS64-NEXT: addu $6, $12, $1			; MIPS64-NEXT: or $3, $11, $8
	; MIPS64-NEXT: sll $8, $9, 8			; MIPS64-NEXT: sll $4, $9, 8
	; MIPS64-NEXT: srl $9, $3, 16			; MIPS64-NEXT: sll $6, $12, 0
	; MIPS64-NEXT: srl $10, $5, 16			; MIPS64-NEXT: dsrl $8, $5, 32
	; MIPS64-NEXT: addu $9, $10, $9			; MIPS64-NEXT: sll $8, $8, 0
	; MIPS64-NEXT: andi $9, $9, 255			; MIPS64-NEXT: addu $6, $8, $6
	; MIPS64-NEXT: or $8, $9, $8
	; MIPS64-NEXT: addu $9, $5, $3
	; MIPS64-NEXT: dsll $2, $2, 32
	; MIPS64-NEXT: sll $4, $4, 16
	; MIPS64-NEXT: andi $6, $6, 255			; MIPS64-NEXT: andi $6, $6, 255
	; MIPS64-NEXT: srl $1, $1, 8			; MIPS64-NEXT: or $4, $6, $4
	; MIPS64-NEXT: srl $10, $12, 8			; MIPS64-NEXT: andi $4, $4, 65535
	; MIPS64-NEXT: addu $1, $10, $1
	; MIPS64-NEXT: sll $1, $1, 8
	; MIPS64-NEXT: or $1, $6, $1
	; MIPS64-NEXT: andi $1, $1, 65535
	; MIPS64-NEXT: or $1, $1, $4
	; MIPS64-NEXT: dsll $1, $1, 32			; MIPS64-NEXT: dsll $1, $1, 32
				; MIPS64-NEXT: or $3, $4, $3
				; MIPS64-NEXT: sll $4, $7, 0
				; MIPS64-NEXT: srl $6, $4, 24
				; MIPS64-NEXT: sll $5, $5, 0
				; MIPS64-NEXT: srl $7, $5, 24
				; MIPS64-NEXT: addu $8, $5, $4
				; MIPS64-NEXT: dsll $2, $2, 32
	; MIPS64-NEXT: dsrl $1, $1, 32			; MIPS64-NEXT: dsrl $1, $1, 32
				; MIPS64-NEXT: addu $6, $7, $6
				; MIPS64-NEXT: sll $6, $6, 24
				; MIPS64-NEXT: srl $7, $4, 16
				; MIPS64-NEXT: srl $9, $5, 16
				; MIPS64-NEXT: addu $7, $9, $7
				; MIPS64-NEXT: andi $7, $7, 255
				; MIPS64-NEXT: sll $7, $7, 16
	; MIPS64-NEXT: or $2, $1, $2			; MIPS64-NEXT: or $2, $1, $2
	; MIPS64-NEXT: dsll $1, $7, 32			; MIPS64-NEXT: dsll $1, $3, 32
	; MIPS64-NEXT: sll $4, $8, 16			; MIPS64-NEXT: or $3, $6, $7
	; MIPS64-NEXT: andi $6, $9, 255			; MIPS64-NEXT: andi $6, $8, 255
	; MIPS64-NEXT: srl $3, $3, 8			; MIPS64-NEXT: srl $4, $4, 8
	; MIPS64-NEXT: srl $5, $5, 8			; MIPS64-NEXT: srl $5, $5, 8
	; MIPS64-NEXT: addu $3, $5, $3			; MIPS64-NEXT: addu $4, $5, $4
	; MIPS64-NEXT: sll $3, $3, 8			; MIPS64-NEXT: sll $4, $4, 8
	; MIPS64-NEXT: or $3, $6, $3			; MIPS64-NEXT: or $4, $6, $4
	; MIPS64-NEXT: andi $3, $3, 65535			; MIPS64-NEXT: andi $4, $4, 65535
	; MIPS64-NEXT: or $3, $3, $4			; MIPS64-NEXT: or $3, $4, $3
	; MIPS64-NEXT: dsll $3, $3, 32			; MIPS64-NEXT: dsll $3, $3, 32
	; MIPS64-NEXT: dsrl $3, $3, 32			; MIPS64-NEXT: dsrl $3, $3, 32
	; MIPS64-NEXT: or $3, $3, $1			; MIPS64-NEXT: or $3, $3, $1
	; MIPS64-NEXT: jr $ra			; MIPS64-NEXT: jr $ra
	; MIPS64-NEXT: nop			; MIPS64-NEXT: nop
	;			;
	; MIPS32R5EB-LABEL: i8_16:			; MIPS32R5EB-LABEL: i8_16:
	; MIPS32R5EB: # %bb.0:			; MIPS32R5EB: # %bb.0:
	[5,495 lines not shown]
	; MIPS32R5EB-NEXT: addiu $sp, $sp, 48			; MIPS32R5EB-NEXT: addiu $sp, $sp, 48
	; MIPS32R5EB-NEXT: jr $ra			; MIPS32R5EB-NEXT: jr $ra
	; MIPS32R5EB-NEXT: nop			; MIPS32R5EB-NEXT: nop
	;			;
	; MIPS64R5EB-LABEL: i24x2:			; MIPS64R5EB-LABEL: i24x2:
	; MIPS64R5EB: # %bb.0: # %Entry			; MIPS64R5EB: # %bb.0: # %Entry
	; MIPS64R5EB-NEXT: daddiu $sp, $sp, -32			; MIPS64R5EB-NEXT: daddiu $sp, $sp, -32
	; MIPS64R5EB-NEXT: .cfi_def_cfa_offset 32			; MIPS64R5EB-NEXT: .cfi_def_cfa_offset 32
				; MIPS64R5EB-NEXT: sh $5, 20($sp)
	; MIPS64R5EB-NEXT: dsrl $1, $5, 16			; MIPS64R5EB-NEXT: dsrl $1, $5, 16
	; MIPS64R5EB-NEXT: sw $1, 16($sp)			; MIPS64R5EB-NEXT: sw $1, 16($sp)
	; MIPS64R5EB-NEXT: sh $5, 20($sp)			; MIPS64R5EB-NEXT: sh $4, 28($sp)
	; MIPS64R5EB-NEXT: dsrl $1, $4, 16			; MIPS64R5EB-NEXT: dsrl $1, $4, 16
	; MIPS64R5EB-NEXT: sw $1, 24($sp)			; MIPS64R5EB-NEXT: sw $1, 24($sp)
	; MIPS64R5EB-NEXT: sh $4, 28($sp)			; MIPS64R5EB-NEXT: lbu $1, 20($sp)
	; MIPS64R5EB-NEXT: lb $1, 19($sp)
	; MIPS64R5EB-NEXT: dsll $1, $1, 8
	; MIPS64R5EB-NEXT: lbu $2, 20($sp)
	; MIPS64R5EB-NEXT: or $1, $1, $2
	; MIPS64R5EB-NEXT: dsll $1, $1, 8			; MIPS64R5EB-NEXT: dsll $1, $1, 8
	; MIPS64R5EB-NEXT: lb $2, 27($sp)			; MIPS64R5EB-NEXT: lb $2, 19($sp)
				; MIPS64R5EB-NEXT: dsll $2, $2, 16
				; MIPS64R5EB-NEXT: or $1, $2, $1
				; MIPS64R5EB-NEXT: lbu $2, 28($sp)
	; MIPS64R5EB-NEXT: dsll $2, $2, 8			; MIPS64R5EB-NEXT: dsll $2, $2, 8
	; MIPS64R5EB-NEXT: lbu $3, 28($sp)			; MIPS64R5EB-NEXT: lb $3, 27($sp)
	; MIPS64R5EB-NEXT: or $2, $2, $3			; MIPS64R5EB-NEXT: dsll $3, $3, 16
	; MIPS64R5EB-NEXT: lbu $3, 21($sp)			; MIPS64R5EB-NEXT: lbu $4, 21($sp)
	; MIPS64R5EB-NEXT: dsll $2, $2, 8			; MIPS64R5EB-NEXT: or $2, $3, $2
	; MIPS64R5EB-NEXT: or $1, $3, $1			; MIPS64R5EB-NEXT: or $1, $4, $1
	; MIPS64R5EB-NEXT: lh $3, 16($sp)			; MIPS64R5EB-NEXT: lh $3, 16($sp)
	; MIPS64R5EB-NEXT: dsll $3, $3, 8			; MIPS64R5EB-NEXT: dsll $3, $3, 8
	; MIPS64R5EB-NEXT: lbu $4, 18($sp)			; MIPS64R5EB-NEXT: lbu $4, 18($sp)
	; MIPS64R5EB-NEXT: or $3, $4, $3			; MIPS64R5EB-NEXT: or $3, $4, $3
	; MIPS64R5EB-NEXT: lbu $4, 29($sp)			; MIPS64R5EB-NEXT: lbu $4, 29($sp)
	; MIPS64R5EB-NEXT: insert.d $w0[0], $3			; MIPS64R5EB-NEXT: insert.d $w0[0], $3
	; MIPS64R5EB-NEXT: insert.d $w0[1], $1			; MIPS64R5EB-NEXT: insert.d $w0[1], $1
	; MIPS64R5EB-NEXT: or $1, $4, $2			; MIPS64R5EB-NEXT: or $1, $4, $2
	[365 lines not shown]

llvm/test/CodeGen/Mips/load-store-left-right.ll

	[971 lines not shown]
	; MIPS32-EB: # %bb.0: # %entry			; MIPS32-EB: # %bb.0: # %entry
	; MIPS32-EB-NEXT: lui $2, %hi(_gp_disp)			; MIPS32-EB-NEXT: lui $2, %hi(_gp_disp)
	; MIPS32-EB-NEXT: addiu $2, $2, %lo(_gp_disp)			; MIPS32-EB-NEXT: addiu $2, $2, %lo(_gp_disp)
	; MIPS32-EB-NEXT: addiu $sp, $sp, -24			; MIPS32-EB-NEXT: addiu $sp, $sp, -24
	; MIPS32-EB-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill			; MIPS32-EB-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
	; MIPS32-EB-NEXT: addu $gp, $2, $25			; MIPS32-EB-NEXT: addu $gp, $2, $25
	; MIPS32-EB-NEXT: lw $1, %got(arr)($gp)			; MIPS32-EB-NEXT: lw $1, %got(arr)($gp)
	; MIPS32-EB-NEXT: lwl $4, 0($1)			; MIPS32-EB-NEXT: lwl $4, 0($1)
	; MIPS32-EB-NEXT: lwr $4, 3($1)
	; MIPS32-EB-NEXT: lbu $2, 5($1)			; MIPS32-EB-NEXT: lbu $2, 5($1)
				; MIPS32-EB-NEXT: lwr $4, 3($1)
				; MIPS32-EB-NEXT: sll $2, $2, 16
	; MIPS32-EB-NEXT: lbu $3, 4($1)			; MIPS32-EB-NEXT: lbu $3, 4($1)
	; MIPS32-EB-NEXT: sll $3, $3, 8			; MIPS32-EB-NEXT: sll $3, $3, 24
	; MIPS32-EB-NEXT: or $2, $3, $2			; MIPS32-EB-NEXT: or $2, $3, $2
	; MIPS32-EB-NEXT: sll $2, $2, 16
	; MIPS32-EB-NEXT: lbu $1, 6($1)			; MIPS32-EB-NEXT: lbu $1, 6($1)
	; MIPS32-EB-NEXT: sll $1, $1, 8			; MIPS32-EB-NEXT: sll $1, $1, 8
	; MIPS32-EB-NEXT: lw $25, %call16(extern_func)($gp)			; MIPS32-EB-NEXT: lw $25, %call16(extern_func)($gp)
	; MIPS32-EB-NEXT: .reloc ($tmp0), R_MIPS_JALR, extern_func			; MIPS32-EB-NEXT: .reloc ($tmp0), R_MIPS_JALR, extern_func
	; MIPS32-EB-NEXT: $tmp0:			; MIPS32-EB-NEXT: $tmp0:
	; MIPS32-EB-NEXT: jalr $25			; MIPS32-EB-NEXT: jalr $25
	; MIPS32-EB-NEXT: or $5, $2, $1			; MIPS32-EB-NEXT: or $5, $2, $1
	; MIPS32-EB-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload			; MIPS32-EB-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
	[47 lines not shown]
	; MIPS64-EL: # %bb.0: # %entry			; MIPS64-EL: # %bb.0: # %entry
	; MIPS64-EL-NEXT: daddiu $sp, $sp, -16			; MIPS64-EL-NEXT: daddiu $sp, $sp, -16
	; MIPS64-EL-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill			; MIPS64-EL-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill
	; MIPS64-EL-NEXT: sd $gp, 0($sp) # 8-byte Folded Spill			; MIPS64-EL-NEXT: sd $gp, 0($sp) # 8-byte Folded Spill
	; MIPS64-EL-NEXT: lui $1, %hi(%neg(%gp_rel(pass_array_byval)))			; MIPS64-EL-NEXT: lui $1, %hi(%neg(%gp_rel(pass_array_byval)))
	; MIPS64-EL-NEXT: daddu $1, $1, $25			; MIPS64-EL-NEXT: daddu $1, $1, $25
	; MIPS64-EL-NEXT: daddiu $gp, $1, %lo(%neg(%gp_rel(pass_array_byval)))			; MIPS64-EL-NEXT: daddiu $gp, $1, %lo(%neg(%gp_rel(pass_array_byval)))
	; MIPS64-EL-NEXT: ld $1, %got_disp(arr)($gp)			; MIPS64-EL-NEXT: ld $1, %got_disp(arr)($gp)
	; MIPS64-EL-NEXT: lwl $2, 3($1)			; MIPS64-EL-NEXT: lbu $2, 4($1)
	; MIPS64-EL-NEXT: lwr $2, 0($1)			; MIPS64-EL-NEXT: dsll $2, $2, 32
	; MIPS64-EL-NEXT: daddiu $3, $zero, 1			; MIPS64-EL-NEXT: lbu $3, 5($1)
	; MIPS64-EL-NEXT: dsll $3, $3, 32			; MIPS64-EL-NEXT: dsll $3, $3, 40
	; MIPS64-EL-NEXT: daddiu $3, $3, -1			; MIPS64-EL-NEXT: or $2, $3, $2
	; MIPS64-EL-NEXT: and $2, $2, $3			; MIPS64-EL-NEXT: lwl $3, 3($1)
	; MIPS64-EL-NEXT: lbu $3, 4($1)			; MIPS64-EL-NEXT: lwr $3, 0($1)
	; MIPS64-EL-NEXT: lbu $4, 5($1)			; MIPS64-EL-NEXT: daddiu $4, $zero, 1
	; MIPS64-EL-NEXT: dsll $4, $4, 8			; MIPS64-EL-NEXT: dsll $4, $4, 32
	; MIPS64-EL-NEXT: or $3, $4, $3			; MIPS64-EL-NEXT: daddiu $4, $4, -1
	; MIPS64-EL-NEXT: dsll $3, $3, 32			; MIPS64-EL-NEXT: and $3, $3, $4
	; MIPS64-EL-NEXT: or $2, $2, $3			; MIPS64-EL-NEXT: or $2, $3, $2
	; MIPS64-EL-NEXT: lbu $1, 6($1)			; MIPS64-EL-NEXT: lbu $1, 6($1)
	; MIPS64-EL-NEXT: dsll $1, $1, 48			; MIPS64-EL-NEXT: dsll $1, $1, 48
	; MIPS64-EL-NEXT: ld $25, %call16(extern_func)($gp)			; MIPS64-EL-NEXT: ld $25, %call16(extern_func)($gp)
	; MIPS64-EL-NEXT: .reloc .Ltmp0, R_MIPS_JALR, extern_func			; MIPS64-EL-NEXT: .reloc .Ltmp0, R_MIPS_JALR, extern_func
	; MIPS64-EL-NEXT: .Ltmp0:			; MIPS64-EL-NEXT: .Ltmp0:
	; MIPS64-EL-NEXT: jalr $25			; MIPS64-EL-NEXT: jalr $25
	; MIPS64-EL-NEXT: or $4, $2, $1			; MIPS64-EL-NEXT: or $4, $2, $1
	; MIPS64-EL-NEXT: ld $gp, 0($sp) # 8-byte Folded Reload			; MIPS64-EL-NEXT: ld $gp, 0($sp) # 8-byte Folded Reload
	; MIPS64-EL-NEXT: ld $ra, 8($sp) # 8-byte Folded Reload			; MIPS64-EL-NEXT: ld $ra, 8($sp) # 8-byte Folded Reload
	; MIPS64-EL-NEXT: jr $ra			; MIPS64-EL-NEXT: jr $ra
	; MIPS64-EL-NEXT: daddiu $sp, $sp, 16			; MIPS64-EL-NEXT: daddiu $sp, $sp, 16
	;			;
	; MIPS64-EB-LABEL: pass_array_byval:			; MIPS64-EB-LABEL: pass_array_byval:
	; MIPS64-EB: # %bb.0: # %entry			; MIPS64-EB: # %bb.0: # %entry
	; MIPS64-EB-NEXT: daddiu $sp, $sp, -16			; MIPS64-EB-NEXT: daddiu $sp, $sp, -16
	; MIPS64-EB-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill			; MIPS64-EB-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill
	; MIPS64-EB-NEXT: sd $gp, 0($sp) # 8-byte Folded Spill			; MIPS64-EB-NEXT: sd $gp, 0($sp) # 8-byte Folded Spill
	; MIPS64-EB-NEXT: lui $1, %hi(%neg(%gp_rel(pass_array_byval)))			; MIPS64-EB-NEXT: lui $1, %hi(%neg(%gp_rel(pass_array_byval)))
	; MIPS64-EB-NEXT: daddu $1, $1, $25			; MIPS64-EB-NEXT: daddu $1, $1, $25
	; MIPS64-EB-NEXT: daddiu $gp, $1, %lo(%neg(%gp_rel(pass_array_byval)))			; MIPS64-EB-NEXT: daddiu $gp, $1, %lo(%neg(%gp_rel(pass_array_byval)))
	; MIPS64-EB-NEXT: ld $1, %got_disp(arr)($gp)			; MIPS64-EB-NEXT: ld $1, %got_disp(arr)($gp)
	; MIPS64-EB-NEXT: lwl $2, 0($1)			; MIPS64-EB-NEXT: lbu $2, 5($1)
	; MIPS64-EB-NEXT: lwr $2, 3($1)			; MIPS64-EB-NEXT: dsll $2, $2, 16
	; MIPS64-EB-NEXT: dsll $2, $2, 32			; MIPS64-EB-NEXT: lbu $3, 4($1)
	; MIPS64-EB-NEXT: lbu $3, 5($1)			; MIPS64-EB-NEXT: dsll $3, $3, 24
	; MIPS64-EB-NEXT: lbu $4, 4($1)			; MIPS64-EB-NEXT: or $2, $3, $2
	; MIPS64-EB-NEXT: dsll $4, $4, 8			; MIPS64-EB-NEXT: lwl $3, 0($1)
	; MIPS64-EB-NEXT: or $3, $4, $3			; MIPS64-EB-NEXT: lwr $3, 3($1)
	; MIPS64-EB-NEXT: dsll $3, $3, 16			; MIPS64-EB-NEXT: dsll $3, $3, 32
	; MIPS64-EB-NEXT: or $2, $2, $3			; MIPS64-EB-NEXT: or $2, $3, $2
	; MIPS64-EB-NEXT: lbu $1, 6($1)			; MIPS64-EB-NEXT: lbu $1, 6($1)
	; MIPS64-EB-NEXT: dsll $1, $1, 8			; MIPS64-EB-NEXT: dsll $1, $1, 8
	; MIPS64-EB-NEXT: ld $25, %call16(extern_func)($gp)			; MIPS64-EB-NEXT: ld $25, %call16(extern_func)($gp)
	; MIPS64-EB-NEXT: .reloc .Ltmp0, R_MIPS_JALR, extern_func			; MIPS64-EB-NEXT: .reloc .Ltmp0, R_MIPS_JALR, extern_func
	; MIPS64-EB-NEXT: .Ltmp0:			; MIPS64-EB-NEXT: .Ltmp0:
	; MIPS64-EB-NEXT: jalr $25			; MIPS64-EB-NEXT: jalr $25
	; MIPS64-EB-NEXT: or $4, $2, $1			; MIPS64-EB-NEXT: or $4, $2, $1
	; MIPS64-EB-NEXT: ld $gp, 0($sp) # 8-byte Folded Reload			; MIPS64-EB-NEXT: ld $gp, 0($sp) # 8-byte Folded Reload
	; MIPS64-EB-NEXT: ld $ra, 8($sp) # 8-byte Folded Reload			; MIPS64-EB-NEXT: ld $ra, 8($sp) # 8-byte Folded Reload
	; MIPS64-EB-NEXT: jr $ra			; MIPS64-EB-NEXT: jr $ra
	; MIPS64-EB-NEXT: daddiu $sp, $sp, 16			; MIPS64-EB-NEXT: daddiu $sp, $sp, 16
	;			;
	; MIPS64R2-EL-LABEL: pass_array_byval:			; MIPS64R2-EL-LABEL: pass_array_byval:
	; MIPS64R2-EL: # %bb.0: # %entry			; MIPS64R2-EL: # %bb.0: # %entry
	; MIPS64R2-EL-NEXT: daddiu $sp, $sp, -16			; MIPS64R2-EL-NEXT: daddiu $sp, $sp, -16
	; MIPS64R2-EL-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill			; MIPS64R2-EL-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill
	; MIPS64R2-EL-NEXT: sd $gp, 0($sp) # 8-byte Folded Spill			; MIPS64R2-EL-NEXT: sd $gp, 0($sp) # 8-byte Folded Spill
	; MIPS64R2-EL-NEXT: lui $1, %hi(%neg(%gp_rel(pass_array_byval)))			; MIPS64R2-EL-NEXT: lui $1, %hi(%neg(%gp_rel(pass_array_byval)))
	; MIPS64R2-EL-NEXT: daddu $1, $1, $25			; MIPS64R2-EL-NEXT: daddu $1, $1, $25
	; MIPS64R2-EL-NEXT: daddiu $gp, $1, %lo(%neg(%gp_rel(pass_array_byval)))			; MIPS64R2-EL-NEXT: daddiu $gp, $1, %lo(%neg(%gp_rel(pass_array_byval)))
	; MIPS64R2-EL-NEXT: ld $1, %got_disp(arr)($gp)			; MIPS64R2-EL-NEXT: ld $1, %got_disp(arr)($gp)
	; MIPS64R2-EL-NEXT: lwl $2, 3($1)			; MIPS64R2-EL-NEXT: lbu $2, 4($1)
	; MIPS64R2-EL-NEXT: lwr $2, 0($1)			; MIPS64R2-EL-NEXT: dsll $2, $2, 32
	; MIPS64R2-EL-NEXT: dext $2, $2, 0, 32			; MIPS64R2-EL-NEXT: lbu $3, 5($1)
	; MIPS64R2-EL-NEXT: lbu $3, 4($1)			; MIPS64R2-EL-NEXT: dsll $3, $3, 40
	; MIPS64R2-EL-NEXT: lbu $4, 5($1)			; MIPS64R2-EL-NEXT: or $2, $3, $2
	; MIPS64R2-EL-NEXT: dsll $4, $4, 8			; MIPS64R2-EL-NEXT: lwl $3, 3($1)
	; MIPS64R2-EL-NEXT: or $3, $4, $3			; MIPS64R2-EL-NEXT: lwr $3, 0($1)
	; MIPS64R2-EL-NEXT: dsll $3, $3, 32			; MIPS64R2-EL-NEXT: dext $3, $3, 0, 32
	; MIPS64R2-EL-NEXT: or $2, $2, $3			; MIPS64R2-EL-NEXT: or $2, $3, $2
	; MIPS64R2-EL-NEXT: lbu $1, 6($1)			; MIPS64R2-EL-NEXT: lbu $1, 6($1)
	; MIPS64R2-EL-NEXT: dsll $1, $1, 48			; MIPS64R2-EL-NEXT: dsll $1, $1, 48
	; MIPS64R2-EL-NEXT: ld $25, %call16(extern_func)($gp)			; MIPS64R2-EL-NEXT: ld $25, %call16(extern_func)($gp)
	; MIPS64R2-EL-NEXT: .reloc .Ltmp0, R_MIPS_JALR, extern_func			; MIPS64R2-EL-NEXT: .reloc .Ltmp0, R_MIPS_JALR, extern_func
	; MIPS64R2-EL-NEXT: .Ltmp0:			; MIPS64R2-EL-NEXT: .Ltmp0:
	; MIPS64R2-EL-NEXT: jalr $25			; MIPS64R2-EL-NEXT: jalr $25
	; MIPS64R2-EL-NEXT: or $4, $2, $1			; MIPS64R2-EL-NEXT: or $4, $2, $1
	; MIPS64R2-EL-NEXT: ld $gp, 0($sp) # 8-byte Folded Reload			; MIPS64R2-EL-NEXT: ld $gp, 0($sp) # 8-byte Folded Reload
	; MIPS64R2-EL-NEXT: ld $ra, 8($sp) # 8-byte Folded Reload			; MIPS64R2-EL-NEXT: ld $ra, 8($sp) # 8-byte Folded Reload
	; MIPS64R2-EL-NEXT: jr $ra			; MIPS64R2-EL-NEXT: jr $ra
	; MIPS64R2-EL-NEXT: daddiu $sp, $sp, 16			; MIPS64R2-EL-NEXT: daddiu $sp, $sp, 16
	;			;
	; MIPS64R2-EB-LABEL: pass_array_byval:			; MIPS64R2-EB-LABEL: pass_array_byval:
	; MIPS64R2-EB: # %bb.0: # %entry			; MIPS64R2-EB: # %bb.0: # %entry
	; MIPS64R2-EB-NEXT: daddiu $sp, $sp, -16			; MIPS64R2-EB-NEXT: daddiu $sp, $sp, -16
	; MIPS64R2-EB-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill			; MIPS64R2-EB-NEXT: sd $ra, 8($sp) # 8-byte Folded Spill
	; MIPS64R2-EB-NEXT: sd $gp, 0($sp) # 8-byte Folded Spill			; MIPS64R2-EB-NEXT: sd $gp, 0($sp) # 8-byte Folded Spill
	; MIPS64R2-EB-NEXT: lui $1, %hi(%neg(%gp_rel(pass_array_byval)))			; MIPS64R2-EB-NEXT: lui $1, %hi(%neg(%gp_rel(pass_array_byval)))
	; MIPS64R2-EB-NEXT: daddu $1, $1, $25			; MIPS64R2-EB-NEXT: daddu $1, $1, $25
	; MIPS64R2-EB-NEXT: daddiu $gp, $1, %lo(%neg(%gp_rel(pass_array_byval)))			; MIPS64R2-EB-NEXT: daddiu $gp, $1, %lo(%neg(%gp_rel(pass_array_byval)))
	; MIPS64R2-EB-NEXT: ld $1, %got_disp(arr)($gp)			; MIPS64R2-EB-NEXT: ld $1, %got_disp(arr)($gp)
	; MIPS64R2-EB-NEXT: lbu $2, 5($1)			; MIPS64R2-EB-NEXT: lbu $2, 5($1)
				; MIPS64R2-EB-NEXT: dsll $2, $2, 16
	; MIPS64R2-EB-NEXT: lbu $3, 4($1)			; MIPS64R2-EB-NEXT: lbu $3, 4($1)
	; MIPS64R2-EB-NEXT: dsll $3, $3, 8			; MIPS64R2-EB-NEXT: dsll $3, $3, 24
	; MIPS64R2-EB-NEXT: or $2, $3, $2			; MIPS64R2-EB-NEXT: or $2, $3, $2
	; MIPS64R2-EB-NEXT: dsll $2, $2, 16
	; MIPS64R2-EB-NEXT: lwl $3, 0($1)			; MIPS64R2-EB-NEXT: lwl $3, 0($1)
	; MIPS64R2-EB-NEXT: lwr $3, 3($1)			; MIPS64R2-EB-NEXT: lwr $3, 3($1)
	; MIPS64R2-EB-NEXT: dext $3, $3, 0, 32			; MIPS64R2-EB-NEXT: dext $3, $3, 0, 32
	; MIPS64R2-EB-NEXT: dsll $3, $3, 32			; MIPS64R2-EB-NEXT: dsll $3, $3, 32
	; MIPS64R2-EB-NEXT: or $2, $3, $2			; MIPS64R2-EB-NEXT: or $2, $3, $2
	; MIPS64R2-EB-NEXT: lbu $1, 6($1)			; MIPS64R2-EB-NEXT: lbu $1, 6($1)
	; MIPS64R2-EB-NEXT: dsll $1, $1, 8			; MIPS64R2-EB-NEXT: dsll $1, $1, 8
	; MIPS64R2-EB-NEXT: ld $25, %call16(extern_func)($gp)			; MIPS64R2-EB-NEXT: ld $25, %call16(extern_func)($gp)
	[14 lines not shown]

llvm/test/CodeGen/Mips/unalignedload.ll

	[37 lines not shown]
	; MIPS32-EB: # %bb.0: # %entry			; MIPS32-EB: # %bb.0: # %entry
	; MIPS32-EB-NEXT: lui $2, %hi(_gp_disp)			; MIPS32-EB-NEXT: lui $2, %hi(_gp_disp)
	; MIPS32-EB-NEXT: addiu $2, $2, %lo(_gp_disp)			; MIPS32-EB-NEXT: addiu $2, $2, %lo(_gp_disp)
	; MIPS32-EB-NEXT: addiu $sp, $sp, -24			; MIPS32-EB-NEXT: addiu $sp, $sp, -24
	; MIPS32-EB-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill			; MIPS32-EB-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
	; MIPS32-EB-NEXT: addu $gp, $2, $25			; MIPS32-EB-NEXT: addu $gp, $2, $25
	; MIPS32-EB-NEXT: lw $1, %got(s2)($gp)			; MIPS32-EB-NEXT: lw $1, %got(s2)($gp)
	; MIPS32-EB-NEXT: lbu $2, 3($1)			; MIPS32-EB-NEXT: lbu $2, 3($1)
				; MIPS32-EB-NEXT: sll $2, $2, 16
	; MIPS32-EB-NEXT: lbu $1, 2($1)			; MIPS32-EB-NEXT: lbu $1, 2($1)
	; MIPS32-EB-NEXT: sll $1, $1, 8			; MIPS32-EB-NEXT: sll $1, $1, 24
	; MIPS32-EB-NEXT: or $1, $1, $2
	; MIPS32-EB-NEXT: lw $25, %call16(foo2)($gp)			; MIPS32-EB-NEXT: lw $25, %call16(foo2)($gp)
	; MIPS32-EB-NEXT: .reloc ($tmp0), R_MIPS_JALR, foo2			; MIPS32-EB-NEXT: .reloc ($tmp0), R_MIPS_JALR, foo2
	; MIPS32-EB-NEXT: $tmp0:			; MIPS32-EB-NEXT: $tmp0:
	; MIPS32-EB-NEXT: jalr $25			; MIPS32-EB-NEXT: jalr $25
	; MIPS32-EB-NEXT: sll $4, $1, 16			; MIPS32-EB-NEXT: or $4, $1, $2
	; MIPS32-EB-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload			; MIPS32-EB-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
	; MIPS32-EB-NEXT: jr $ra			; MIPS32-EB-NEXT: jr $ra
	; MIPS32-EB-NEXT: addiu $sp, $sp, 24			; MIPS32-EB-NEXT: addiu $sp, $sp, 24
	;			;
	; MIPS32R6-EL-LABEL: bar1:			; MIPS32R6-EL-LABEL: bar1:
	; MIPS32R6-EL: # %bb.0: # %entry			; MIPS32R6-EL: # %bb.0: # %entry
	; MIPS32R6-EL-NEXT: lui $2, %hi(_gp_disp)			; MIPS32R6-EL-NEXT: lui $2, %hi(_gp_disp)
	; MIPS32R6-EL-NEXT: addiu $2, $2, %lo(_gp_disp)			; MIPS32R6-EL-NEXT: addiu $2, $2, %lo(_gp_disp)
	[63 lines not shown]
	; MIPS32-EB: # %bb.0: # %entry			; MIPS32-EB: # %bb.0: # %entry
	; MIPS32-EB-NEXT: lui $2, %hi(_gp_disp)			; MIPS32-EB-NEXT: lui $2, %hi(_gp_disp)
	; MIPS32-EB-NEXT: addiu $2, $2, %lo(_gp_disp)			; MIPS32-EB-NEXT: addiu $2, $2, %lo(_gp_disp)
	; MIPS32-EB-NEXT: addiu $sp, $sp, -24			; MIPS32-EB-NEXT: addiu $sp, $sp, -24
	; MIPS32-EB-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill			; MIPS32-EB-NEXT: sw $ra, 20($sp) # 4-byte Folded Spill
	; MIPS32-EB-NEXT: addu $gp, $2, $25			; MIPS32-EB-NEXT: addu $gp, $2, $25
	; MIPS32-EB-NEXT: lw $1, %got(s4)($gp)			; MIPS32-EB-NEXT: lw $1, %got(s4)($gp)
	; MIPS32-EB-NEXT: lwl $4, 0($1)			; MIPS32-EB-NEXT: lwl $4, 0($1)
	; MIPS32-EB-NEXT: lwr $4, 3($1)
	; MIPS32-EB-NEXT: lbu $2, 5($1)			; MIPS32-EB-NEXT: lbu $2, 5($1)
				; MIPS32-EB-NEXT: lwr $4, 3($1)
				; MIPS32-EB-NEXT: sll $2, $2, 16
	; MIPS32-EB-NEXT: lbu $3, 4($1)			; MIPS32-EB-NEXT: lbu $3, 4($1)
	; MIPS32-EB-NEXT: sll $3, $3, 8			; MIPS32-EB-NEXT: sll $3, $3, 24
	; MIPS32-EB-NEXT: or $2, $3, $2			; MIPS32-EB-NEXT: or $2, $3, $2
	; MIPS32-EB-NEXT: sll $2, $2, 16
	; MIPS32-EB-NEXT: lbu $1, 6($1)			; MIPS32-EB-NEXT: lbu $1, 6($1)
	; MIPS32-EB-NEXT: sll $1, $1, 8			; MIPS32-EB-NEXT: sll $1, $1, 8
	; MIPS32-EB-NEXT: lw $25, %call16(foo4)($gp)			; MIPS32-EB-NEXT: lw $25, %call16(foo4)($gp)
	; MIPS32-EB-NEXT: .reloc ($tmp1), R_MIPS_JALR, foo4			; MIPS32-EB-NEXT: .reloc ($tmp1), R_MIPS_JALR, foo4
	; MIPS32-EB-NEXT: $tmp1:			; MIPS32-EB-NEXT: $tmp1:
	; MIPS32-EB-NEXT: jalr $25			; MIPS32-EB-NEXT: jalr $25
	; MIPS32-EB-NEXT: or $5, $2, $1			; MIPS32-EB-NEXT: or $5, $2, $1
	; MIPS32-EB-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload			; MIPS32-EB-NEXT: lw $ra, 20($sp) # 4-byte Folded Reload
	[52 lines hidden]

llvm/test/CodeGen/RISCV/bswap-bitreverse.ll

[1,493 lines hidden]
; RV64ZBKB-NEXT: ret		; RV64ZBKB-NEXT: ret
%tmp = call i64 @llvm.bitreverse.i64(i64 %a)		%tmp = call i64 @llvm.bitreverse.i64(i64 %a)
%tmp2 = call i64 @llvm.bswap.i64(i64 %tmp)		%tmp2 = call i64 @llvm.bswap.i64(i64 %tmp)
ret i64 %tmp2		ret i64 %tmp2
}		}

define i32 @pr55484(i32 %0) {		define i32 @pr55484(i32 %0) {
; RV32I-LABEL: pr55484:		; RV32I-LABEL: pr55484:
; RV32I: # %bb.0:		; RV32I: # %bb.0:
; RV32I-NEXT: srli a1, a0, 8		; RV32I-NEXT: slli a1, a0, 8
; RV32I-NEXT: slli a0, a0, 8		; RV32I-NEXT: slli a0, a0, 24
; RV32I-NEXT: or a0, a1, a0		; RV32I-NEXT: or a0, a0, a1
; RV32I-NEXT: slli a0, a0, 16
; RV32I-NEXT: srai a0, a0, 16		; RV32I-NEXT: srai a0, a0, 16
; RV32I-NEXT: ret		; RV32I-NEXT: ret
;		;
; RV64I-LABEL: pr55484:		; RV64I-LABEL: pr55484:
; RV64I: # %bb.0:		; RV64I: # %bb.0:
; RV64I-NEXT: srli a1, a0, 8		; RV64I-NEXT: slli a1, a0, 40
; RV64I-NEXT: slli a0, a0, 8		; RV64I-NEXT: slli a0, a0, 56
; RV64I-NEXT: or a0, a1, a0		; RV64I-NEXT: or a0, a0, a1
; RV64I-NEXT: slli a0, a0, 48
; RV64I-NEXT: srai a0, a0, 48		; RV64I-NEXT: srai a0, a0, 48
; RV64I-NEXT: ret		; RV64I-NEXT: ret
;		;
; RV32ZBB-LABEL: pr55484:		; RV32ZBB-LABEL: pr55484:
; RV32ZBB: # %bb.0:		; RV32ZBB: # %bb.0:
; RV32ZBB-NEXT: srli a1, a0, 8		; RV32ZBB-NEXT: srli a1, a0, 8
; RV32ZBB-NEXT: slli a0, a0, 8		; RV32ZBB-NEXT: slli a0, a0, 8
; RV32ZBB-NEXT: or a0, a1, a0		; RV32ZBB-NEXT: or a0, a1, a0
; RV32ZBB-NEXT: sext.h a0, a0		; RV32ZBB-NEXT: sext.h a0, a0
; RV32ZBB-NEXT: ret		; RV32ZBB-NEXT: ret
;		;
; RV64ZBB-LABEL: pr55484:		; RV64ZBB-LABEL: pr55484:
; RV64ZBB: # %bb.0:		; RV64ZBB: # %bb.0:
; RV64ZBB-NEXT: srli a1, a0, 8		; RV64ZBB-NEXT: srli a1, a0, 8
; RV64ZBB-NEXT: slli a0, a0, 8		; RV64ZBB-NEXT: slli a0, a0, 8
; RV64ZBB-NEXT: or a0, a1, a0		; RV64ZBB-NEXT: or a0, a1, a0
; RV64ZBB-NEXT: sext.h a0, a0		; RV64ZBB-NEXT: sext.h a0, a0
; RV64ZBB-NEXT: ret		; RV64ZBB-NEXT: ret
;		;
; RV32ZBKB-LABEL: pr55484:		; RV32ZBKB-LABEL: pr55484:
; RV32ZBKB: # %bb.0:		; RV32ZBKB: # %bb.0:
; RV32ZBKB-NEXT: srli a1, a0, 8		; RV32ZBKB-NEXT: slli a1, a0, 8
; RV32ZBKB-NEXT: slli a0, a0, 8		; RV32ZBKB-NEXT: slli a0, a0, 24
; RV32ZBKB-NEXT: or a0, a1, a0		; RV32ZBKB-NEXT: or a0, a0, a1
; RV32ZBKB-NEXT: slli a0, a0, 16
; RV32ZBKB-NEXT: srai a0, a0, 16		; RV32ZBKB-NEXT: srai a0, a0, 16
; RV32ZBKB-NEXT: ret		; RV32ZBKB-NEXT: ret
;		;
; RV64ZBKB-LABEL: pr55484:		; RV64ZBKB-LABEL: pr55484:
; RV64ZBKB: # %bb.0:		; RV64ZBKB: # %bb.0:
; RV64ZBKB-NEXT: srli a1, a0, 8		; RV64ZBKB-NEXT: slli a1, a0, 40
; RV64ZBKB-NEXT: slli a0, a0, 8		; RV64ZBKB-NEXT: slli a0, a0, 56
; RV64ZBKB-NEXT: or a0, a1, a0		; RV64ZBKB-NEXT: or a0, a0, a1
; RV64ZBKB-NEXT: slli a0, a0, 48
; RV64ZBKB-NEXT: srai a0, a0, 48		; RV64ZBKB-NEXT: srai a0, a0, 48
; RV64ZBKB-NEXT: ret		; RV64ZBKB-NEXT: ret
%2 = lshr i32 %0, 8		%2 = lshr i32 %0, 8
%3 = shl i32 %0, 8		%3 = shl i32 %0, 8
%4 = or i32 %2, %3		%4 = or i32 %2, %3
%5 = trunc i32 %4 to i16		%5 = trunc i32 %4 to i16
%6 = sext i16 %5 to i32		%6 = sext i16 %5 to i32
ret i32 %6		ret i32 %6
}		}
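
The RV32I check lines for `pr55484` above drop the trailing `slli a0, a0, 16` by folding it into the two inner shifts. The following is a minimal sketch, not part of the patch, that emulates the old and new RV32I sequences on 32-bit values and checks that they agree. The helper names (`sra32`, `old_sequence`, `new_sequence`, `sequences_agree`) are hypothetical and chosen only for illustration. The two sequences produce different intermediate values; they agree only because the final `srai a0, a0, 16` keeps the upper 16 bits.

```python
import random

MASK32 = 0xFFFFFFFF  # model a 32-bit register


def sra32(value, amount):
    """Arithmetic right shift of a 32-bit register value."""
    if value & 0x80000000:
        value -= 1 << 32  # reinterpret as a negative Python int
    return (value >> amount) & MASK32


def old_sequence(x):
    """RV32I checks before the patch: srli, slli, or, slli 16, srai 16."""
    a1 = x >> 8                  # srli a1, a0, 8
    a0 = (x << 8) & MASK32       # slli a0, a0, 8
    a0 = a1 | a0                 # or   a0, a1, a0
    a0 = (a0 << 16) & MASK32     # slli a0, a0, 16
    return sra32(a0, 16)         # srai a0, a0, 16


def new_sequence(x):
    """RV32I checks after the patch: slli 8, slli 24, or, srai 16."""
    a1 = (x << 8) & MASK32       # slli a1, a0, 8
    a0 = (x << 24) & MASK32      # slli a0, a0, 24
    a0 = a0 | a1                 # or   a0, a0, a1
    return sra32(a0, 16)         # srai a0, a0, 16


def sequences_agree(samples=10000, seed=0):
    """Compare both sequences on edge cases plus random 32-bit inputs."""
    rng = random.Random(seed)
    values = [0, 1, 0xFF, 0x8000, 0x00FF00FF, MASK32]
    values += [rng.getrandbits(32) for _ in range(samples)]
    return all(old_sequence(v) == new_sequence(v) for v in values)
```

Random sampling is only a sanity check, not a proof of equivalence.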

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-unaligned.ll

	[415 lines hidden]
	; RV32-NEXT: vmseq.vi v8, v8, 0			; RV32-NEXT: vmseq.vi v8, v8, 0
	; RV32-NEXT: vsetvli zero, zero, e8, mf8, ta, ma			; RV32-NEXT: vsetvli zero, zero, e8, mf8, ta, ma
	; RV32-NEXT: vmv.x.s a2, v8			; RV32-NEXT: vmv.x.s a2, v8
	; RV32-NEXT: andi a3, a2, 1			; RV32-NEXT: andi a3, a2, 1
	; RV32-NEXT: beqz a3, .LBB8_2			; RV32-NEXT: beqz a3, .LBB8_2
	; RV32-NEXT: # %bb.1: # %cond.load			; RV32-NEXT: # %bb.1: # %cond.load
	; RV32-NEXT: lbu a3, 1(a0)			; RV32-NEXT: lbu a3, 1(a0)
	; RV32-NEXT: lbu a4, 0(a0)			; RV32-NEXT: lbu a4, 0(a0)
	; RV32-NEXT: lbu a5, 3(a0)			; RV32-NEXT: lbu a5, 2(a0)
	; RV32-NEXT: lbu a6, 2(a0)			; RV32-NEXT: lbu a6, 3(a0)
	; RV32-NEXT: slli a3, a3, 8			; RV32-NEXT: slli a3, a3, 8
	; RV32-NEXT: or a3, a3, a4			; RV32-NEXT: or a3, a3, a4
	; RV32-NEXT: slli a4, a5, 8			; RV32-NEXT: slli a4, a5, 16
	; RV32-NEXT: or a4, a4, a6			; RV32-NEXT: slli a5, a6, 24
	; RV32-NEXT: slli a4, a4, 16			; RV32-NEXT: or a4, a5, a4
	; RV32-NEXT: or a3, a4, a3			; RV32-NEXT: or a3, a4, a3
	; RV32-NEXT: vsetivli zero, 2, e32, mf2, ta, ma			; RV32-NEXT: vsetivli zero, 2, e32, mf2, ta, ma
	; RV32-NEXT: vmv.v.x v8, a3			; RV32-NEXT: vmv.v.x v8, a3
	; RV32-NEXT: andi a2, a2, 2			; RV32-NEXT: andi a2, a2, 2
	; RV32-NEXT: bnez a2, .LBB8_3			; RV32-NEXT: bnez a2, .LBB8_3
	; RV32-NEXT: j .LBB8_4			; RV32-NEXT: j .LBB8_4
	; RV32-NEXT: .LBB8_2:			; RV32-NEXT: .LBB8_2:
	; RV32-NEXT: vsetivli zero, 2, e32, mf2, ta, ma			; RV32-NEXT: vsetivli zero, 2, e32, mf2, ta, ma
	; RV32-NEXT: vmv.v.i v8, 0			; RV32-NEXT: vmv.v.i v8, 0
	; RV32-NEXT: andi a2, a2, 2			; RV32-NEXT: andi a2, a2, 2
	; RV32-NEXT: beqz a2, .LBB8_4			; RV32-NEXT: beqz a2, .LBB8_4
	; RV32-NEXT: .LBB8_3: # %cond.load1			; RV32-NEXT: .LBB8_3: # %cond.load1
	; RV32-NEXT: lbu a2, 5(a0)			; RV32-NEXT: lbu a2, 5(a0)
	; RV32-NEXT: lbu a3, 4(a0)			; RV32-NEXT: lbu a3, 4(a0)
	; RV32-NEXT: lbu a4, 7(a0)			; RV32-NEXT: lbu a4, 6(a0)
	; RV32-NEXT: lbu a0, 6(a0)			; RV32-NEXT: lbu a0, 7(a0)
	; RV32-NEXT: slli a2, a2, 8			; RV32-NEXT: slli a2, a2, 8
	; RV32-NEXT: or a2, a2, a3			; RV32-NEXT: or a2, a2, a3
	; RV32-NEXT: slli a3, a4, 8			; RV32-NEXT: slli a3, a4, 16
	; RV32-NEXT: or a0, a3, a0			; RV32-NEXT: slli a0, a0, 24
	; RV32-NEXT: slli a0, a0, 16			; RV32-NEXT: or a0, a0, a3
	; RV32-NEXT: or a0, a0, a2			; RV32-NEXT: or a0, a0, a2
	; RV32-NEXT: vmv.s.x v9, a0			; RV32-NEXT: vmv.s.x v9, a0
	; RV32-NEXT: vsetvli zero, zero, e32, mf2, tu, ma			; RV32-NEXT: vsetvli zero, zero, e32, mf2, tu, ma
	; RV32-NEXT: vslideup.vi v8, v9, 1			; RV32-NEXT: vslideup.vi v8, v9, 1
	; RV32-NEXT: .LBB8_4: # %else2			; RV32-NEXT: .LBB8_4: # %else2
	; RV32-NEXT: vsetvli zero, zero, e32, mf2, ta, ma			; RV32-NEXT: vsetvli zero, zero, e32, mf2, ta, ma
	; RV32-NEXT: vse32.v v8, (a1)			; RV32-NEXT: vse32.v v8, (a1)
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: masked_load_v2i32_align1:			; RV64-LABEL: masked_load_v2i32_align1:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: vsetivli zero, 2, e32, mf2, ta, ma			; RV64-NEXT: vsetivli zero, 2, e32, mf2, ta, ma
	; RV64-NEXT: vmseq.vi v8, v8, 0			; RV64-NEXT: vmseq.vi v8, v8, 0
	; RV64-NEXT: vsetvli zero, zero, e8, mf8, ta, ma			; RV64-NEXT: vsetvli zero, zero, e8, mf8, ta, ma
	; RV64-NEXT: vmv.x.s a2, v8			; RV64-NEXT: vmv.x.s a2, v8
	; RV64-NEXT: andi a3, a2, 1			; RV64-NEXT: andi a3, a2, 1
	; RV64-NEXT: beqz a3, .LBB8_2			; RV64-NEXT: beqz a3, .LBB8_2
	; RV64-NEXT: # %bb.1: # %cond.load			; RV64-NEXT: # %bb.1: # %cond.load
	; RV64-NEXT: lbu a3, 1(a0)			; RV64-NEXT: lbu a3, 1(a0)
	; RV64-NEXT: lbu a4, 0(a0)			; RV64-NEXT: lbu a4, 0(a0)
	; RV64-NEXT: lb a5, 3(a0)			; RV64-NEXT: lbu a5, 2(a0)
	; RV64-NEXT: lbu a6, 2(a0)			; RV64-NEXT: lb a6, 3(a0)
	; RV64-NEXT: slli a3, a3, 8			; RV64-NEXT: slli a3, a3, 8
	; RV64-NEXT: or a3, a3, a4			; RV64-NEXT: or a3, a3, a4
	; RV64-NEXT: slli a4, a5, 8			; RV64-NEXT: slli a4, a5, 16
	; RV64-NEXT: or a4, a4, a6			; RV64-NEXT: slli a5, a6, 24
	; RV64-NEXT: slli a4, a4, 16			; RV64-NEXT: or a4, a5, a4
	; RV64-NEXT: or a3, a4, a3			; RV64-NEXT: or a3, a4, a3
	; RV64-NEXT: vsetivli zero, 2, e32, mf2, ta, ma			; RV64-NEXT: vsetivli zero, 2, e32, mf2, ta, ma
	; RV64-NEXT: vmv.v.x v8, a3			; RV64-NEXT: vmv.v.x v8, a3
	; RV64-NEXT: andi a2, a2, 2			; RV64-NEXT: andi a2, a2, 2
	; RV64-NEXT: bnez a2, .LBB8_3			; RV64-NEXT: bnez a2, .LBB8_3
	; RV64-NEXT: j .LBB8_4			; RV64-NEXT: j .LBB8_4
	; RV64-NEXT: .LBB8_2:			; RV64-NEXT: .LBB8_2:
	; RV64-NEXT: vsetivli zero, 2, e32, mf2, ta, ma			; RV64-NEXT: vsetivli zero, 2, e32, mf2, ta, ma
	; RV64-NEXT: vmv.v.i v8, 0			; RV64-NEXT: vmv.v.i v8, 0
	; RV64-NEXT: andi a2, a2, 2			; RV64-NEXT: andi a2, a2, 2
	; RV64-NEXT: beqz a2, .LBB8_4			; RV64-NEXT: beqz a2, .LBB8_4
	; RV64-NEXT: .LBB8_3: # %cond.load1			; RV64-NEXT: .LBB8_3: # %cond.load1
	; RV64-NEXT: lbu a2, 5(a0)			; RV64-NEXT: lbu a2, 5(a0)
	; RV64-NEXT: lbu a3, 4(a0)			; RV64-NEXT: lbu a3, 4(a0)
	; RV64-NEXT: lb a4, 7(a0)			; RV64-NEXT: lbu a4, 6(a0)
	; RV64-NEXT: lbu a0, 6(a0)			; RV64-NEXT: lb a0, 7(a0)
	; RV64-NEXT: slli a2, a2, 8			; RV64-NEXT: slli a2, a2, 8
	; RV64-NEXT: or a2, a2, a3			; RV64-NEXT: or a2, a2, a3
	; RV64-NEXT: slli a3, a4, 8			; RV64-NEXT: slli a3, a4, 16
	; RV64-NEXT: or a0, a3, a0			; RV64-NEXT: slli a0, a0, 24
	; RV64-NEXT: slli a0, a0, 16			; RV64-NEXT: or a0, a0, a3
	; RV64-NEXT: or a0, a0, a2			; RV64-NEXT: or a0, a0, a2
	; RV64-NEXT: vmv.s.x v9, a0			; RV64-NEXT: vmv.s.x v9, a0
	; RV64-NEXT: vsetvli zero, zero, e32, mf2, tu, ma			; RV64-NEXT: vsetvli zero, zero, e32, mf2, tu, ma
	; RV64-NEXT: vslideup.vi v8, v9, 1			; RV64-NEXT: vslideup.vi v8, v9, 1
	; RV64-NEXT: .LBB8_4: # %else2			; RV64-NEXT: .LBB8_4: # %else2
	; RV64-NEXT: vsetvli zero, zero, e32, mf2, ta, ma			; RV64-NEXT: vsetvli zero, zero, e32, mf2, ta, ma
	; RV64-NEXT: vse32.v v8, (a1)			; RV64-NEXT: vse32.v v8, (a1)
	; RV64-NEXT: ret			; RV64-NEXT: ret
	[42 lines hidden]

llvm/test/CodeGen/RISCV/srem-seteq-illegal-types.ll

	[386 lines hidden]
	; RV64-NEXT: sd s0, 32(sp) # 8-byte Folded Spill			; RV64-NEXT: sd s0, 32(sp) # 8-byte Folded Spill
	; RV64-NEXT: sd s1, 24(sp) # 8-byte Folded Spill			; RV64-NEXT: sd s1, 24(sp) # 8-byte Folded Spill
	; RV64-NEXT: sd s2, 16(sp) # 8-byte Folded Spill			; RV64-NEXT: sd s2, 16(sp) # 8-byte Folded Spill
	; RV64-NEXT: sd s3, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: sd s3, 8(sp) # 8-byte Folded Spill
	; RV64-NEXT: mv s0, a0			; RV64-NEXT: mv s0, a0
	; RV64-NEXT: lb a0, 12(a0)			; RV64-NEXT: lb a0, 12(a0)
	; RV64-NEXT: lwu a1, 8(s0)			; RV64-NEXT: lwu a1, 8(s0)
	; RV64-NEXT: slli a0, a0, 32			; RV64-NEXT: slli a0, a0, 32
	; RV64-NEXT: or a0, a1, a0
	; RV64-NEXT: ld a2, 0(s0)			; RV64-NEXT: ld a2, 0(s0)
				; RV64-NEXT: or a0, a1, a0
	; RV64-NEXT: slli a0, a0, 29			; RV64-NEXT: slli a0, a0, 29
	; RV64-NEXT: srai s1, a0, 31			; RV64-NEXT: srai s1, a0, 31
	; RV64-NEXT: slli a0, a1, 31			; RV64-NEXT: srli a0, a2, 2
	; RV64-NEXT: srli a1, a2, 33			; RV64-NEXT: slli a1, a1, 62
	; RV64-NEXT: or a0, a1, a0			; RV64-NEXT: or a0, a1, a0
	; RV64-NEXT: slli a0, a0, 31
	; RV64-NEXT: srai a0, a0, 31			; RV64-NEXT: srai a0, a0, 31
	; RV64-NEXT: slli a1, a2, 31			; RV64-NEXT: slli a1, a2, 31
	; RV64-NEXT: srai s2, a1, 31			; RV64-NEXT: srai s2, a1, 31
	; RV64-NEXT: li a1, 7			; RV64-NEXT: li a1, 7
	; RV64-NEXT: call __moddi3@plt			; RV64-NEXT: call __moddi3@plt
	; RV64-NEXT: mv s3, a0			; RV64-NEXT: mv s3, a0
	; RV64-NEXT: li a1, -5			; RV64-NEXT: li a1, -5
	; RV64-NEXT: mv a0, s1			; RV64-NEXT: mv a0, s1
	[12 lines hidden]
	; RV64-NEXT: sltu a0, a1, a0			; RV64-NEXT: sltu a0, a1, a0
	; RV64-NEXT: addi a1, s1, -2			; RV64-NEXT: addi a1, s1, -2
	; RV64-NEXT: seqz a1, a1			; RV64-NEXT: seqz a1, a1
	; RV64-NEXT: addi a2, s3, -1			; RV64-NEXT: addi a2, s3, -1
	; RV64-NEXT: seqz a2, a2			; RV64-NEXT: seqz a2, a2
	; RV64-NEXT: neg a0, a0			; RV64-NEXT: neg a0, a0
	; RV64-NEXT: addi a2, a2, -1			; RV64-NEXT: addi a2, a2, -1
	; RV64-NEXT: addi a1, a1, -1			; RV64-NEXT: addi a1, a1, -1
	; RV64-NEXT: slli a3, a1, 29			; RV64-NEXT: slli a3, a1, 2
	; RV64-NEXT: srli a3, a3, 61			; RV64-NEXT: slli a4, a2, 31
	; RV64-NEXT: sb a3, 12(s0)			; RV64-NEXT: srli a4, a4, 62
	; RV64-NEXT: slli a1, a1, 2			; RV64-NEXT: or a3, a4, a3
	; RV64-NEXT: slli a3, a2, 31			; RV64-NEXT: sw a3, 8(s0)
	; RV64-NEXT: srli a3, a3, 62			; RV64-NEXT: slli a1, a1, 29
	; RV64-NEXT: or a1, a3, a1			; RV64-NEXT: srli a1, a1, 61
	; RV64-NEXT: sw a1, 8(s0)			; RV64-NEXT: sb a1, 12(s0)
	; RV64-NEXT: slli a0, a0, 31			; RV64-NEXT: slli a0, a0, 31
	; RV64-NEXT: srli a0, a0, 31			; RV64-NEXT: srli a0, a0, 31
	; RV64-NEXT: slli a1, a2, 33			; RV64-NEXT: slli a1, a2, 33
	; RV64-NEXT: or a0, a0, a1			; RV64-NEXT: or a0, a0, a1
	; RV64-NEXT: sd a0, 0(s0)			; RV64-NEXT: sd a0, 0(s0)
	; RV64-NEXT: ld ra, 40(sp) # 8-byte Folded Reload			; RV64-NEXT: ld ra, 40(sp) # 8-byte Folded Reload
	; RV64-NEXT: ld s0, 32(sp) # 8-byte Folded Reload			; RV64-NEXT: ld s0, 32(sp) # 8-byte Folded Reload
	; RV64-NEXT: ld s1, 24(sp) # 8-byte Folded Reload			; RV64-NEXT: ld s1, 24(sp) # 8-byte Folded Reload
	[81 lines hidden]
	; RV32M-NEXT: lw s4, 8(sp) # 4-byte Folded Reload			; RV32M-NEXT: lw s4, 8(sp) # 4-byte Folded Reload
	; RV32M-NEXT: lw s5, 4(sp) # 4-byte Folded Reload			; RV32M-NEXT: lw s5, 4(sp) # 4-byte Folded Reload
	; RV32M-NEXT: lw s6, 0(sp) # 4-byte Folded Reload			; RV32M-NEXT: lw s6, 0(sp) # 4-byte Folded Reload
	; RV32M-NEXT: addi sp, sp, 32			; RV32M-NEXT: addi sp, sp, 32
	; RV32M-NEXT: ret			; RV32M-NEXT: ret
	;			;
	; RV64M-LABEL: test_srem_vec:			; RV64M-LABEL: test_srem_vec:
	; RV64M: # %bb.0:			; RV64M: # %bb.0:
	; RV64M-NEXT: lb a1, 12(a0)			; RV64M-NEXT: ld a1, 0(a0)
	; RV64M-NEXT: lwu a2, 8(a0)			; RV64M-NEXT: lwu a2, 8(a0)
	; RV64M-NEXT: slli a1, a1, 32			; RV64M-NEXT: srli a3, a1, 2
	; RV64M-NEXT: or a1, a2, a1			; RV64M-NEXT: lb a4, 12(a0)
	; RV64M-NEXT: ld a3, 0(a0)			; RV64M-NEXT: slli a5, a2, 62
	; RV64M-NEXT: slli a1, a1, 29			; RV64M-NEXT: or a3, a5, a3
	; RV64M-NEXT: srai a1, a1, 31			; RV64M-NEXT: srai a3, a3, 31
	; RV64M-NEXT: slli a2, a2, 31			; RV64M-NEXT: slli a4, a4, 32
	; RV64M-NEXT: srli a4, a3, 33
	; RV64M-NEXT: lui a5, %hi(.LCPI3_0)			; RV64M-NEXT: lui a5, %hi(.LCPI3_0)
	; RV64M-NEXT: ld a5, %lo(.LCPI3_0)(a5)			; RV64M-NEXT: ld a5, %lo(.LCPI3_0)(a5)
	; RV64M-NEXT: or a2, a4, a2			; RV64M-NEXT: or a2, a2, a4
	; RV64M-NEXT: slli a2, a2, 31			; RV64M-NEXT: slli a2, a2, 29
	; RV64M-NEXT: srai a2, a2, 31			; RV64M-NEXT: srai a2, a2, 31
	; RV64M-NEXT: mulh a4, a2, a5			; RV64M-NEXT: mulh a4, a2, a5
	; RV64M-NEXT: srli a5, a4, 63			; RV64M-NEXT: srli a5, a4, 63
	; RV64M-NEXT: srai a4, a4, 1			; RV64M-NEXT: srai a4, a4, 1
	; RV64M-NEXT: add a4, a4, a5			; RV64M-NEXT: add a4, a4, a5
	; RV64M-NEXT: slli a5, a4, 3			; RV64M-NEXT: slli a5, a4, 2
	; RV64M-NEXT: sub a4, a4, a5			; RV64M-NEXT: add a4, a5, a4
	; RV64M-NEXT: lui a5, %hi(.LCPI3_1)			; RV64M-NEXT: lui a5, %hi(.LCPI3_1)
	; RV64M-NEXT: ld a5, %lo(.LCPI3_1)(a5)			; RV64M-NEXT: ld a5, %lo(.LCPI3_1)(a5)
	; RV64M-NEXT: slli a3, a3, 31			; RV64M-NEXT: slli a1, a1, 31
	; RV64M-NEXT: srai a3, a3, 31			; RV64M-NEXT: srai a1, a1, 31
	; RV64M-NEXT: add a2, a2, a4			; RV64M-NEXT: add a2, a2, a4
	; RV64M-NEXT: mulh a4, a1, a5			; RV64M-NEXT: mulh a4, a3, a5
	; RV64M-NEXT: srli a5, a4, 63			; RV64M-NEXT: srli a5, a4, 63
	; RV64M-NEXT: srai a4, a4, 1			; RV64M-NEXT: srai a4, a4, 1
	; RV64M-NEXT: add a4, a4, a5			; RV64M-NEXT: add a4, a4, a5
	; RV64M-NEXT: slli a5, a4, 2			; RV64M-NEXT: slli a5, a4, 3
	; RV64M-NEXT: add a4, a5, a4			; RV64M-NEXT: sub a4, a4, a5
	; RV64M-NEXT: add a1, a1, a4			; RV64M-NEXT: add a3, a3, a4
	; RV64M-NEXT: addi a1, a1, -2			; RV64M-NEXT: addi a3, a3, -1
	; RV64M-NEXT: seqz a1, a1			; RV64M-NEXT: seqz a3, a3
	; RV64M-NEXT: lui a4, %hi(.LCPI3_2)			; RV64M-NEXT: lui a4, %hi(.LCPI3_2)
	; RV64M-NEXT: ld a4, %lo(.LCPI3_2)(a4)			; RV64M-NEXT: ld a4, %lo(.LCPI3_2)(a4)
	; RV64M-NEXT: lui a5, %hi(.LCPI3_3)			; RV64M-NEXT: lui a5, %hi(.LCPI3_3)
	; RV64M-NEXT: ld a5, %lo(.LCPI3_3)(a5)			; RV64M-NEXT: ld a5, %lo(.LCPI3_3)(a5)
	; RV64M-NEXT: addi a2, a2, -1			; RV64M-NEXT: addi a2, a2, -2
	; RV64M-NEXT: seqz a2, a2			; RV64M-NEXT: seqz a2, a2
	; RV64M-NEXT: mul a3, a3, a4			; RV64M-NEXT: mul a1, a1, a4
	; RV64M-NEXT: add a3, a3, a5			; RV64M-NEXT: add a1, a1, a5
	; RV64M-NEXT: slli a4, a3, 63			; RV64M-NEXT: slli a4, a1, 63
	; RV64M-NEXT: srli a3, a3, 1			; RV64M-NEXT: srli a1, a1, 1
	; RV64M-NEXT: or a3, a3, a4			; RV64M-NEXT: or a1, a1, a4
	; RV64M-NEXT: sltu a3, a5, a3			; RV64M-NEXT: sltu a1, a5, a1
	; RV64M-NEXT: addi a2, a2, -1			; RV64M-NEXT: addi a2, a2, -1
	; RV64M-NEXT: addi a1, a1, -1			; RV64M-NEXT: addi a3, a3, -1
	; RV64M-NEXT: neg a3, a3			; RV64M-NEXT: neg a1, a1
	; RV64M-NEXT: slli a4, a1, 29			; RV64M-NEXT: slli a4, a3, 33
	; RV64M-NEXT: srli a4, a4, 61			; RV64M-NEXT: slli a1, a1, 31
	; RV64M-NEXT: sb a4, 12(a0)			; RV64M-NEXT: srli a1, a1, 31
	; RV64M-NEXT: slli a4, a2, 33			; RV64M-NEXT: or a1, a1, a4
				; RV64M-NEXT: sd a1, 0(a0)
				; RV64M-NEXT: slli a1, a2, 2
	; RV64M-NEXT: slli a3, a3, 31			; RV64M-NEXT: slli a3, a3, 31
	; RV64M-NEXT: srli a3, a3, 31			; RV64M-NEXT: srli a3, a3, 62
	; RV64M-NEXT: or a3, a3, a4			; RV64M-NEXT: or a1, a3, a1
	; RV64M-NEXT: sd a3, 0(a0)
	; RV64M-NEXT: slli a1, a1, 2
	; RV64M-NEXT: slli a2, a2, 31
	; RV64M-NEXT: srli a2, a2, 62
	; RV64M-NEXT: or a1, a2, a1
	; RV64M-NEXT: sw a1, 8(a0)			; RV64M-NEXT: sw a1, 8(a0)
				; RV64M-NEXT: slli a1, a2, 29
				; RV64M-NEXT: srli a1, a1, 61
				; RV64M-NEXT: sb a1, 12(a0)
	; RV64M-NEXT: ret			; RV64M-NEXT: ret
	;			;
	; RV32MV-LABEL: test_srem_vec:			; RV32MV-LABEL: test_srem_vec:
	; RV32MV: # %bb.0:			; RV32MV: # %bb.0:
	; RV32MV-NEXT: addi sp, sp, -64			; RV32MV-NEXT: addi sp, sp, -64
	; RV32MV-NEXT: sw ra, 60(sp) # 4-byte Folded Spill			; RV32MV-NEXT: sw ra, 60(sp) # 4-byte Folded Spill
	; RV32MV-NEXT: sw s0, 56(sp) # 4-byte Folded Spill			; RV32MV-NEXT: sw s0, 56(sp) # 4-byte Folded Spill
	; RV32MV-NEXT: sw s2, 52(sp) # 4-byte Folded Spill			; RV32MV-NEXT: sw s2, 52(sp) # 4-byte Folded Spill
	[103 lines hidden]
	;			;
	; RV64MV-LABEL: test_srem_vec:			; RV64MV-LABEL: test_srem_vec:
	; RV64MV: # %bb.0:			; RV64MV: # %bb.0:
	; RV64MV-NEXT: addi sp, sp, -64			; RV64MV-NEXT: addi sp, sp, -64
	; RV64MV-NEXT: sd ra, 56(sp) # 8-byte Folded Spill			; RV64MV-NEXT: sd ra, 56(sp) # 8-byte Folded Spill
	; RV64MV-NEXT: sd s0, 48(sp) # 8-byte Folded Spill			; RV64MV-NEXT: sd s0, 48(sp) # 8-byte Folded Spill
	; RV64MV-NEXT: addi s0, sp, 64			; RV64MV-NEXT: addi s0, sp, 64
	; RV64MV-NEXT: andi sp, sp, -32			; RV64MV-NEXT: andi sp, sp, -32
	; RV64MV-NEXT: lwu a1, 8(a0)			; RV64MV-NEXT: lb a1, 12(a0)
	; RV64MV-NEXT: ld a2, 0(a0)			; RV64MV-NEXT: lwu a2, 8(a0)
	; RV64MV-NEXT: slli a3, a1, 31			; RV64MV-NEXT: slli a1, a1, 32
	; RV64MV-NEXT: srli a4, a2, 33			; RV64MV-NEXT: ld a3, 0(a0)
	; RV64MV-NEXT: lb a5, 12(a0)			; RV64MV-NEXT: or a1, a2, a1
	; RV64MV-NEXT: or a3, a4, a3			; RV64MV-NEXT: slli a1, a1, 29
				; RV64MV-NEXT: srai a1, a1, 31
				; RV64MV-NEXT: srli a4, a3, 2
				; RV64MV-NEXT: slli a2, a2, 62
				; RV64MV-NEXT: lui a5, %hi(.LCPI3_0)
				; RV64MV-NEXT: ld a5, %lo(.LCPI3_0)(a5)
				; RV64MV-NEXT: or a2, a2, a4
	; RV64MV-NEXT: slli a3, a3, 31			; RV64MV-NEXT: slli a3, a3, 31
	; RV64MV-NEXT: srai a3, a3, 31			; RV64MV-NEXT: srai a3, a3, 31
	; RV64MV-NEXT: slli a4, a5, 32			; RV64MV-NEXT: mulh a4, a3, a5
	; RV64MV-NEXT: or a1, a1, a4
	; RV64MV-NEXT: lui a4, %hi(.LCPI3_0)
	; RV64MV-NEXT: ld a4, %lo(.LCPI3_0)(a4)
	; RV64MV-NEXT: slli a1, a1, 29
	; RV64MV-NEXT: slli a2, a2, 31
	; RV64MV-NEXT: srai a2, a2, 31
	; RV64MV-NEXT: mulh a4, a2, a4
	; RV64MV-NEXT: srli a5, a4, 63			; RV64MV-NEXT: srli a5, a4, 63
	; RV64MV-NEXT: add a4, a4, a5			; RV64MV-NEXT: add a4, a4, a5
	; RV64MV-NEXT: li a5, 6			; RV64MV-NEXT: li a5, 6
	; RV64MV-NEXT: mul a4, a4, a5			; RV64MV-NEXT: mul a4, a4, a5
	; RV64MV-NEXT: lui a5, %hi(.LCPI3_1)			; RV64MV-NEXT: lui a5, %hi(.LCPI3_1)
	; RV64MV-NEXT: ld a5, %lo(.LCPI3_1)(a5)			; RV64MV-NEXT: ld a5, %lo(.LCPI3_1)(a5)
	; RV64MV-NEXT: srai a1, a1, 31			; RV64MV-NEXT: srai a2, a2, 31
	; RV64MV-NEXT: sub a2, a2, a4			; RV64MV-NEXT: sub a3, a3, a4
	; RV64MV-NEXT: sd a2, 0(sp)			; RV64MV-NEXT: sd a3, 0(sp)
	; RV64MV-NEXT: mulh a2, a1, a5			; RV64MV-NEXT: mulh a3, a2, a5
	; RV64MV-NEXT: srli a4, a2, 63			; RV64MV-NEXT: srli a4, a3, 63
	; RV64MV-NEXT: srai a2, a2, 1			; RV64MV-NEXT: srai a3, a3, 1
	; RV64MV-NEXT: add a2, a2, a4			; RV64MV-NEXT: add a3, a3, a4
	; RV64MV-NEXT: slli a4, a2, 2			; RV64MV-NEXT: slli a4, a3, 3
	; RV64MV-NEXT: lui a5, %hi(.LCPI3_2)			; RV64MV-NEXT: lui a5, %hi(.LCPI3_2)
	; RV64MV-NEXT: ld a5, %lo(.LCPI3_2)(a5)			; RV64MV-NEXT: ld a5, %lo(.LCPI3_2)(a5)
	; RV64MV-NEXT: add a2, a4, a2			; RV64MV-NEXT: sub a3, a3, a4
				; RV64MV-NEXT: add a2, a2, a3
				; RV64MV-NEXT: sd a2, 8(sp)
				; RV64MV-NEXT: mulh a2, a1, a5
				; RV64MV-NEXT: srli a3, a2, 63
				; RV64MV-NEXT: srai a2, a2, 1
				; RV64MV-NEXT: add a2, a2, a3
				; RV64MV-NEXT: slli a3, a2, 2
				; RV64MV-NEXT: add a2, a3, a2
	; RV64MV-NEXT: add a1, a1, a2			; RV64MV-NEXT: add a1, a1, a2
	; RV64MV-NEXT: sd a1, 16(sp)			; RV64MV-NEXT: sd a1, 16(sp)
	; RV64MV-NEXT: mulh a1, a3, a5
	; RV64MV-NEXT: srli a2, a1, 63
	; RV64MV-NEXT: srai a1, a1, 1
	; RV64MV-NEXT: add a1, a1, a2
	; RV64MV-NEXT: slli a2, a1, 3
	; RV64MV-NEXT: sub a1, a1, a2
	; RV64MV-NEXT: add a1, a3, a1
	; RV64MV-NEXT: sd a1, 8(sp)
	; RV64MV-NEXT: mv a1, sp			; RV64MV-NEXT: mv a1, sp
	; RV64MV-NEXT: vsetivli zero, 4, e64, m2, ta, ma			; RV64MV-NEXT: vsetivli zero, 4, e64, m2, ta, ma
	; RV64MV-NEXT: vle64.v v8, (a1)			; RV64MV-NEXT: vle64.v v8, (a1)
	; RV64MV-NEXT: lui a1, %hi(.LCPI3_3)			; RV64MV-NEXT: lui a1, %hi(.LCPI3_3)
	; RV64MV-NEXT: addi a1, a1, %lo(.LCPI3_3)			; RV64MV-NEXT: addi a1, a1, %lo(.LCPI3_3)
	; RV64MV-NEXT: vle64.v v10, (a1)			; RV64MV-NEXT: vle64.v v10, (a1)
	; RV64MV-NEXT: li a1, -1			; RV64MV-NEXT: li a1, -1
	; RV64MV-NEXT: srli a1, a1, 31			; RV64MV-NEXT: srli a1, a1, 31
	[34 lines hidden]

llvm/test/CodeGen/RISCV/unaligned-load-store.ll

[58 lines hidden]
; MISALIGN-NEXT: ret		; MISALIGN-NEXT: ret
ret i24 %res		ret i24 %res
}		}

define i32 @load_i32(i32* %p) {		define i32 @load_i32(i32* %p) {
; RV32I-LABEL: load_i32:		; RV32I-LABEL: load_i32:
; RV32I: # %bb.0:		; RV32I: # %bb.0:
; RV32I-NEXT: lbu a1, 1(a0)		; RV32I-NEXT: lbu a1, 1(a0)
; RV32I-NEXT: lbu a2, 0(a0)		; RV32I-NEXT: lbu a2, 0(a0)
; RV32I-NEXT: lbu a3, 3(a0)		; RV32I-NEXT: lbu a3, 2(a0)
; RV32I-NEXT: lbu a0, 2(a0)		; RV32I-NEXT: lbu a0, 3(a0)
; RV32I-NEXT: slli a1, a1, 8		; RV32I-NEXT: slli a1, a1, 8
; RV32I-NEXT: or a1, a1, a2		; RV32I-NEXT: or a1, a1, a2
; RV32I-NEXT: slli a2, a3, 8		; RV32I-NEXT: slli a2, a3, 16
; RV32I-NEXT: or a0, a2, a0		; RV32I-NEXT: slli a0, a0, 24
; RV32I-NEXT: slli a0, a0, 16		; RV32I-NEXT: or a0, a0, a2
; RV32I-NEXT: or a0, a0, a1		; RV32I-NEXT: or a0, a0, a1
; RV32I-NEXT: ret		; RV32I-NEXT: ret
;		;
; RV64I-LABEL: load_i32:		; RV64I-LABEL: load_i32:
; RV64I: # %bb.0:		; RV64I: # %bb.0:
; RV64I-NEXT: lbu a1, 1(a0)		; RV64I-NEXT: lbu a1, 1(a0)
; RV64I-NEXT: lbu a2, 0(a0)		; RV64I-NEXT: lbu a2, 0(a0)
; RV64I-NEXT: lb a3, 3(a0)		; RV64I-NEXT: lbu a3, 2(a0)
; RV64I-NEXT: lbu a0, 2(a0)		; RV64I-NEXT: lb a0, 3(a0)
; RV64I-NEXT: slli a1, a1, 8		; RV64I-NEXT: slli a1, a1, 8
; RV64I-NEXT: or a1, a1, a2		; RV64I-NEXT: or a1, a1, a2
; RV64I-NEXT: slli a2, a3, 8		; RV64I-NEXT: slli a2, a3, 16
; RV64I-NEXT: or a0, a2, a0		; RV64I-NEXT: slli a0, a0, 24
; RV64I-NEXT: slli a0, a0, 16		; RV64I-NEXT: or a0, a0, a2
; RV64I-NEXT: or a0, a0, a1		; RV64I-NEXT: or a0, a0, a1
; RV64I-NEXT: ret		; RV64I-NEXT: ret
;		;
; MISALIGN-LABEL: load_i32:		; MISALIGN-LABEL: load_i32:
; MISALIGN: # %bb.0:		; MISALIGN: # %bb.0:
; MISALIGN-NEXT: lw a0, 0(a0)		; MISALIGN-NEXT: lw a0, 0(a0)
; MISALIGN-NEXT: ret		; MISALIGN-NEXT: ret
%res = load i32, i32* %p, align 1		%res = load i32, i32* %p, align 1
ret i32 %res		ret i32 %res
}		}
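
The `load_i32` checks above show the fold in its simplest form: the upper half-word used to be assembled as `((b3 << 8) | b2) << 16`, and is now assembled as `(b3 << 24) | (b2 << 16)`. The sketch below is not taken from the patch; it models the four `lbu` results as byte values and checks that both RV32I sequences build the same 32-bit word. The function names are hypothetical.

```python
import itertools

MASK32 = 0xFFFFFFFF  # model a 32-bit register


def old_load_i32(b0, b1, b2, b3):
    """Byte assembly before the patch (shift of a shifted OR)."""
    low = (b1 << 8) | b0             # slli a1, a1, 8 ; or a1, a1, a2
    high = (b3 << 8) | b2            # slli a2, a3, 8 ; or a0, a2, a0
    high = (high << 16) & MASK32     # slli a0, a0, 16
    return high | low                # or   a0, a0, a1


def new_load_i32(b0, b1, b2, b3):
    """Byte assembly after the patch (outer shift folded into both operands)."""
    low = (b1 << 8) | b0             # slli a1, a1, 8 ; or a1, a1, a2
    mid = b2 << 16                   # slli a2, a3, 16
    top = (b3 << 24) & MASK32        # slli a0, a0, 24
    return (top | mid) | low         # or a0, a0, a2 ; or a0, a0, a1


def load_sequences_agree():
    """Check both sequences on all combinations of a few boundary bytes."""
    interesting = [0x00, 0x01, 0x7F, 0x80, 0xFF]
    return all(
        old_load_i32(*bs) == new_load_i32(*bs)
        for bs in itertools.product(interesting, repeat=4)
    )
```

The instruction count is unchanged in this case; the two shifts of the upper bytes no longer depend on each other, which is the visible effect in the updated check lines.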

define i64 @load_i64(i64* %p) {		define i64 @load_i64(i64* %p) {
; RV32I-LABEL: load_i64:		; RV32I-LABEL: load_i64:
; RV32I: # %bb.0:		; RV32I: # %bb.0:
; RV32I-NEXT: lbu a1, 1(a0)		; RV32I-NEXT: lbu a1, 1(a0)
; RV32I-NEXT: lbu a2, 0(a0)		; RV32I-NEXT: lbu a2, 0(a0)
; RV32I-NEXT: lbu a3, 3(a0)		; RV32I-NEXT: lbu a3, 2(a0)
; RV32I-NEXT: lbu a4, 2(a0)		; RV32I-NEXT: lbu a4, 3(a0)
; RV32I-NEXT: slli a1, a1, 8		; RV32I-NEXT: slli a1, a1, 8
; RV32I-NEXT: or a1, a1, a2		; RV32I-NEXT: or a1, a1, a2
; RV32I-NEXT: slli a2, a3, 8		; RV32I-NEXT: slli a2, a3, 16
; RV32I-NEXT: or a2, a2, a4		; RV32I-NEXT: slli a3, a4, 24
; RV32I-NEXT: slli a2, a2, 16		; RV32I-NEXT: or a2, a3, a2
; RV32I-NEXT: or a2, a2, a1		; RV32I-NEXT: or a2, a2, a1
; RV32I-NEXT: lbu a1, 5(a0)		; RV32I-NEXT: lbu a1, 5(a0)
; RV32I-NEXT: lbu a3, 4(a0)		; RV32I-NEXT: lbu a3, 4(a0)
; RV32I-NEXT: lbu a4, 7(a0)		; RV32I-NEXT: lbu a4, 6(a0)
; RV32I-NEXT: lbu a0, 6(a0)		; RV32I-NEXT: lbu a0, 7(a0)
; RV32I-NEXT: slli a1, a1, 8		; RV32I-NEXT: slli a1, a1, 8
; RV32I-NEXT: or a1, a1, a3		; RV32I-NEXT: or a1, a1, a3
; RV32I-NEXT: slli a3, a4, 8		; RV32I-NEXT: slli a3, a4, 16
; RV32I-NEXT: or a0, a3, a0		; RV32I-NEXT: slli a0, a0, 24
; RV32I-NEXT: slli a0, a0, 16		; RV32I-NEXT: or a0, a0, a3
; RV32I-NEXT: or a1, a0, a1		; RV32I-NEXT: or a1, a0, a1
; RV32I-NEXT: mv a0, a2		; RV32I-NEXT: mv a0, a2
; RV32I-NEXT: ret		; RV32I-NEXT: ret
;		;
; RV64I-LABEL: load_i64:		; RV64I-LABEL: load_i64:
; RV64I: # %bb.0:		; RV64I: # %bb.0:
; RV64I-NEXT: lbu a1, 1(a0)		; RV64I-NEXT: lbu a1, 1(a0)
; RV64I-NEXT: lbu a2, 0(a0)		; RV64I-NEXT: lbu a2, 0(a0)
; RV64I-NEXT: lbu a3, 3(a0)		; RV64I-NEXT: lbu a3, 2(a0)
; RV64I-NEXT: lbu a4, 2(a0)		; RV64I-NEXT: lbu a4, 3(a0)
; RV64I-NEXT: slli a1, a1, 8		; RV64I-NEXT: slli a1, a1, 8
; RV64I-NEXT: or a1, a1, a2		; RV64I-NEXT: or a1, a1, a2
; RV64I-NEXT: slli a2, a3, 8		; RV64I-NEXT: slli a2, a3, 16
; RV64I-NEXT: or a2, a2, a4		; RV64I-NEXT: slli a3, a4, 24
; RV64I-NEXT: slli a2, a2, 16		; RV64I-NEXT: or a2, a3, a2
; RV64I-NEXT: or a1, a2, a1		; RV64I-NEXT: or a1, a2, a1
; RV64I-NEXT: lbu a2, 5(a0)		; RV64I-NEXT: lbu a2, 5(a0)
; RV64I-NEXT: lbu a3, 4(a0)		; RV64I-NEXT: lbu a3, 4(a0)
; RV64I-NEXT: lbu a4, 7(a0)		; RV64I-NEXT: lbu a4, 6(a0)
; RV64I-NEXT: lbu a0, 6(a0)		; RV64I-NEXT: lbu a0, 7(a0)
; RV64I-NEXT: slli a2, a2, 8		; RV64I-NEXT: slli a2, a2, 8
; RV64I-NEXT: or a2, a2, a3		; RV64I-NEXT: or a2, a2, a3
; RV64I-NEXT: slli a3, a4, 8		; RV64I-NEXT: slli a3, a4, 16
; RV64I-NEXT: or a0, a3, a0		; RV64I-NEXT: slli a0, a0, 24
; RV64I-NEXT: slli a0, a0, 16		; RV64I-NEXT: or a0, a0, a3
; RV64I-NEXT: or a0, a0, a2		; RV64I-NEXT: or a0, a0, a2
; RV64I-NEXT: slli a0, a0, 32		; RV64I-NEXT: slli a0, a0, 32
; RV64I-NEXT: or a0, a0, a1		; RV64I-NEXT: or a0, a0, a1
; RV64I-NEXT: ret		; RV64I-NEXT: ret
;		;
; MISALIGN-RV32I-LABEL: load_i64:		; MISALIGN-RV32I-LABEL: load_i64:
; MISALIGN-RV32I: # %bb.0:		; MISALIGN-RV32I: # %bb.0:
; MISALIGN-RV32I-NEXT: lw a2, 0(a0)		; MISALIGN-RV32I-NEXT: lw a2, 0(a0)
[130 lines hidden]

llvm/test/CodeGen/SystemZ/store_nonbytesized_vecs.ll

[67 lines hidden]
; CHECK-NEXT: br %r14		; CHECK-NEXT: br %r14
%res = bitcast <16 x i1> %src to i16		%res = bitcast <16 x i1> %src to i16
ret i16 %res		ret i16 %res
}		}

; Truncate a <8 x i32> vector to <8 x i31> and store it (test splitting).		; Truncate a <8 x i32> vector to <8 x i31> and store it (test splitting).
define void @fun2(<8 x i32> %src, ptr %p)		define void @fun2(<8 x i32> %src, ptr %p)
; CHECK-LABEL: fun2:		; CHECK-LABEL: fun2:
; CHECK: # %bb.0:		; CHECK: # %bb.0:
; CHECK-NEXT: stmg %r14, %r15, 112(%r15)
; CHECK-NEXT: .cfi_offset %r14, -48
; CHECK-NEXT: .cfi_offset %r15, -40
; CHECK-NEXT: vlgvf %r1, %v26, 3		; CHECK-NEXT: vlgvf %r1, %v26, 3
; CHECK-NEXT: vlgvf %r0, %v26, 2		; CHECK-NEXT: vlgvf %r5, %v24, 0
		; CHECK-NEXT: vlgvf %r3, %v24, 1
		; CHECK-NEXT: srlk %r0, %r1, 8
		; CHECK-NEXT: sllg %r5, %r5, 33
		; CHECK-NEXT: sth %r0, 28(%r2)
		; CHECK-NEXT: rosbg %r5, %r3, 31, 55, 2
		; CHECK-NEXT: vlgvf %r0, %v24, 2
		; CHECK-NEXT: sllg %r4, %r3, 58
		; CHECK-NEXT: vlgvf %r3, %v26, 2
; CHECK-NEXT: stc %r1, 30(%r2)		; CHECK-NEXT: stc %r1, 30(%r2)
; CHECK-NEXT: srlk %r3, %r1, 8		; CHECK-NEXT: rosbg %r4, %r0, 6, 36, 27
; CHECK-NEXT: risbgn %r1, %r1, 33, 167, 0		; CHECK-NEXT: risbgn %r1, %r1, 33, 167, 0
; CHECK-NEXT: vlgvf %r5, %v24, 2		; CHECK-NEXT: rosbg %r1, %r3, 2, 32, 31
; CHECK-NEXT: rosbg %r1, %r0, 2, 32, 31
; CHECK-NEXT: sth %r3, 28(%r2)
; CHECK-NEXT: srlg %r1, %r1, 24		; CHECK-NEXT: srlg %r1, %r1, 24
; CHECK-NEXT: vlgvf %r3, %v24, 3		; CHECK-NEXT: rosbg %r5, %r4, 56, 63, 8
		; CHECK-NEXT: vlgvf %r4, %v24, 3
; CHECK-NEXT: st %r1, 24(%r2)		; CHECK-NEXT: st %r1, 24(%r2)
; CHECK-NEXT: vlgvf %r1, %v26, 0		; CHECK-NEXT: vlgvf %r1, %v26, 0
; CHECK-NEXT: risbgn %r14, %r5, 6, 164, 27		; CHECK-NEXT: risbgn %r0, %r0, 6, 164, 27
; CHECK-NEXT: sllg %r4, %r3, 60		; CHECK-NEXT: rosbg %r0, %r4, 37, 63, 60
; CHECK-NEXT: rosbg %r14, %r3, 37, 63, 60		; CHECK-NEXT: stg %r5, 0(%r2)
; CHECK-NEXT: sllg %r3, %r14, 8		; CHECK-NEXT: sllg %r5, %r4, 60
; CHECK-NEXT: rosbg %r4, %r1, 4, 34, 29
; CHECK-NEXT: rosbg %r3, %r4, 56, 63, 8
; CHECK-NEXT: stg %r3, 8(%r2)
; CHECK-NEXT: vlgvf %r3, %v24, 1
; CHECK-NEXT: sllg %r4, %r3, 58
; CHECK-NEXT: rosbg %r4, %r5, 6, 36, 27
; CHECK-NEXT: vlgvf %r5, %v24, 0
; CHECK-NEXT: sllg %r5, %r5, 25
; CHECK-NEXT: rosbg %r5, %r3, 39, 63, 58
; CHECK-NEXT: sllg %r3, %r5, 8
; CHECK-NEXT: rosbg %r3, %r4, 56, 63, 8
; CHECK-NEXT: stg %r3, 0(%r2)
; CHECK-NEXT: vlgvf %r3, %v26, 1
; CHECK-NEXT: sllg %r4, %r3, 62
; CHECK-NEXT: rosbg %r4, %r0, 2, 32, 31
; CHECK-NEXT: risbgn %r0, %r1, 4, 162, 29
; CHECK-NEXT: rosbg %r0, %r3, 35, 63, 62
; CHECK-NEXT: sllg %r0, %r0, 8		; CHECK-NEXT: sllg %r0, %r0, 8
		; CHECK-NEXT: rosbg %r5, %r1, 4, 34, 29
		; CHECK-NEXT: risbgn %r1, %r1, 4, 162, 29
		; CHECK-NEXT: rosbg %r0, %r5, 56, 63, 8
		; CHECK-NEXT: stg %r0, 8(%r2)
		; CHECK-NEXT: vlgvf %r0, %v26, 1
		; CHECK-NEXT: sllg %r4, %r0, 62
		; CHECK-NEXT: rosbg %r1, %r0, 35, 63, 62
		; CHECK-NEXT: sllg %r0, %r1, 8
		; CHECK-NEXT: rosbg %r4, %r3, 2, 32, 31
; CHECK-NEXT: rosbg %r0, %r4, 56, 63, 8		; CHECK-NEXT: rosbg %r0, %r4, 56, 63, 8
; CHECK-NEXT: stg %r0, 16(%r2)		; CHECK-NEXT: stg %r0, 16(%r2)
; CHECK-NEXT: lmg %r14, %r15, 112(%r15)
; CHECK-NEXT: br %r14		; CHECK-NEXT: br %r14
{		{
%tmp = trunc <8 x i32> %src to <8 x i31>		%tmp = trunc <8 x i32> %src to <8 x i31>
store <8 x i31> %tmp, ptr %p		store <8 x i31> %tmp, ptr %p
ret void		ret void
}		}

; Load and store a <3 x i31> vector (test widening).		; Load and store a <3 x i31> vector (test widening).
[13 lines hidden]

llvm/test/CodeGen/Thumb/urem-seteq-illegal-types.ll

[25 lines hidden]
; CHECK-NEXT: .long 859308032 @ 0x33380000		; CHECK-NEXT: .long 859308032 @ 0x33380000
ret i1 %cmp		ret i1 %cmp
}		}

define i1 @test_urem_even(i27 %X) nounwind {		define i1 @test_urem_even(i27 %X) nounwind {
; CHECK-LABEL: test_urem_even:		; CHECK-LABEL: test_urem_even:
; CHECK: @ %bb.0:		; CHECK: @ %bb.0:
; CHECK-NEXT: ldr r1, .LCPI1_0		; CHECK-NEXT: ldr r1, .LCPI1_0
; CHECK-NEXT: muls r1, r0, r1		; CHECK-NEXT: muls r1, r0, r1
; CHECK-NEXT: lsls r0, r1, #26		; CHECK-NEXT: lsls r0, r1, #31
; CHECK-NEXT: ldr r2, .LCPI1_1		; CHECK-NEXT: ldr r2, .LCPI1_1
; CHECK-NEXT: ands r2, r1		; CHECK-NEXT: ands r2, r1
; CHECK-NEXT: lsrs r1, r2, #1		; CHECK-NEXT: lsrs r1, r2, #1
; CHECK-NEXT: adds r0, r1, r0		; CHECK-NEXT: lsls r1, r1, #5
; CHECK-NEXT: lsls r0, r0, #5		; CHECK-NEXT: adds r0, r0, r1
; CHECK-NEXT: ldr r1, .LCPI1_2		; CHECK-NEXT: ldr r1, .LCPI1_2
; CHECK-NEXT: cmp r0, r1		; CHECK-NEXT: cmp r0, r1
; CHECK-NEXT: blo .LBB1_2		; CHECK-NEXT: blo .LBB1_2
; CHECK-NEXT: @ %bb.1:		; CHECK-NEXT: @ %bb.1:
; CHECK-NEXT: movs r0, #0		; CHECK-NEXT: movs r0, #0
; CHECK-NEXT: bx lr		; CHECK-NEXT: bx lr
; CHECK-NEXT: .LBB1_2:		; CHECK-NEXT: .LBB1_2:
; CHECK-NEXT: movs r0, #1		; CHECK-NEXT: movs r0, #1
[127 lines hidden]

llvm/test/CodeGen/X86/bool-vector.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=-sse2 | FileCheck %s --check-prefix=X86			; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=-sse2 | FileCheck %s --check-prefix=X86
	; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-sse2 | FileCheck %s --check-prefix=X64			; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-sse2 | FileCheck %s --check-prefix=X64
	; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+sse2 | FileCheck %s --check-prefix=SSE2			; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+sse2 | FileCheck %s --check-prefix=SSE2
	; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+sse2 | FileCheck %s --check-prefix=SSE2			; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+sse2 | FileCheck %s --check-prefix=SSE2
	; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+avx2 | FileCheck %s --check-prefix=AVX2			; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+avx2 | FileCheck %s --check-prefix=AVX2
	; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx2 | FileCheck %s --check-prefix=AVX2			; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx2 | FileCheck %s --check-prefix=AVX2

	define i32 @PR15215_bad(<4 x i32> %input) {			define i32 @PR15215_bad(<4 x i32> %input) {
	; X86-LABEL: PR15215_bad:			; X86-LABEL: PR15215_bad:
	; X86: # %bb.0: # %entry			; X86: # %bb.0: # %entry
	; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movzbl {{[0-9]+}}(%esp), %edx			; X86-NEXT: movzbl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movb {{[0-9]+}}(%esp), %ah			; X86-NEXT: movb {{[0-9]+}}(%esp), %ah
	; X86-NEXT: addb %ah, %ah			; X86-NEXT: shlb $3, %ah
	; X86-NEXT: andb $1, %cl			; X86-NEXT: andb $1, %cl
	; X86-NEXT: orb %ah, %cl
	; X86-NEXT: shlb $2, %cl			; X86-NEXT: shlb $2, %cl
				; X86-NEXT: orb %ah, %cl
	; X86-NEXT: addb %dl, %dl			; X86-NEXT: addb %dl, %dl
	; X86-NEXT: andb $1, %al			; X86-NEXT: andb $1, %al
	; X86-NEXT: orb %dl, %al			; X86-NEXT: orb %dl, %al
	; X86-NEXT: andb $3, %al			; X86-NEXT: andb $3, %al
	; X86-NEXT: orb %cl, %al			; X86-NEXT: orb %cl, %al
	; X86-NEXT: movzbl %al, %eax			; X86-NEXT: movzbl %al, %eax
	; X86-NEXT: andl $15, %eax			; X86-NEXT: andl $15, %eax
	; X86-NEXT: retl			; X86-NEXT: retl
	;			;
	; X64-LABEL: PR15215_bad:			; X64-LABEL: PR15215_bad:
	; X64: # %bb.0: # %entry			; X64: # %bb.0: # %entry
	; X64-NEXT: addb %cl, %cl			; X64-NEXT: shlb $3, %cl
	; X64-NEXT: andb $1, %dl			; X64-NEXT: andb $1, %dl
	; X64-NEXT: orb %cl, %dl
	; X64-NEXT: shlb $2, %dl			; X64-NEXT: shlb $2, %dl
				; X64-NEXT: orb %cl, %dl
	; X64-NEXT: addb %sil, %sil			; X64-NEXT: addb %sil, %sil
	; X64-NEXT: andb $1, %dil			; X64-NEXT: andb $1, %dil
	; X64-NEXT: orb %sil, %dil			; X64-NEXT: orb %sil, %dil
	; X64-NEXT: andb $3, %dil			; X64-NEXT: andb $3, %dil
	; X64-NEXT: orb %dl, %dil			; X64-NEXT: orb %dl, %dil
	; X64-NEXT: movzbl %dil, %eax			; X64-NEXT: movzbl %dil, %eax
	; X64-NEXT: andl $15, %eax			; X64-NEXT: andl $15, %eax
	; X64-NEXT: retq			; X64-NEXT: retq
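
The X86 and X64 changes to PR15215_bad above look consistent with the fold named in the revision title: a shift of an OR whose operand is itself a shifted value is rewritten as an OR of two independent shifts. A minimal C++ sketch of that identity on 8-bit values follows. The function names are hypothetical and are not taken from DAGCombiner.cpp, and the sketch assumes the combined shift amount stays below the bit width.

```cpp
#include <cassert>
#include <cstdint>

// Old shape: shift x by c0, OR with y, then shift the whole result by c1.
// Assumption for this sketch: c0 + c1 < 8 (the width of uint8_t).
constexpr uint8_t shiftOfShiftedOr(uint8_t x, uint8_t y, unsigned c0, unsigned c1) {
  return static_cast<uint8_t>((static_cast<uint8_t>(x << c0) | y) << c1);
}

// New shape: shift each operand on its own, then OR the two results.
constexpr uint8_t orOfShifts(uint8_t x, uint8_t y, unsigned c0, unsigned c1) {
  return static_cast<uint8_t>(static_cast<uint8_t>(x << (c0 + c1)) |
                              static_cast<uint8_t>(y << c1));
}

// Exhaustively compare both shapes over every pair of 8-bit inputs.
bool formsAgree(unsigned c0, unsigned c1) {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y)
      if (shiftOfShiftedOr(static_cast<uint8_t>(x), static_cast<uint8_t>(y), c0, c1) !=
          orOfShifts(static_cast<uint8_t>(x), static_cast<uint8_t>(y), c0, c1))
        return false;
  return true;
}

// Compile-time spot check with the shift amounts seen in the hunk (1, then 2).
static_assert(shiftOfShiftedOr(1, 1, 1, 2) == orOfShifts(1, 1, 1, 2),
              "both shapes should agree for c0 = 1, c1 = 2");
```

With c0 = 1 and c1 = 2, the first function mirrors the old `addb`, `orb`, `shlb $2` sequence and the second mirrors the new `shlb $3`, `shlb $2`, `orb` sequence.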

llvm/test/CodeGen/X86/combine-bitreverse.ll

	; X86-NEXT: shll $4, %ecx			; X86-NEXT: shll $4, %ecx
	; X86-NEXT: shrl $4, %eax			; X86-NEXT: shrl $4, %eax
	; X86-NEXT: andl $252645135, %eax # imm = 0xF0F0F0F			; X86-NEXT: andl $252645135, %eax # imm = 0xF0F0F0F
	; X86-NEXT: orl %ecx, %eax			; X86-NEXT: orl %ecx, %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: andl $858993459, %ecx # imm = 0x33333333			; X86-NEXT: andl $858993459, %ecx # imm = 0x33333333
	; X86-NEXT: shrl $2, %eax			; X86-NEXT: shrl $2, %eax
	; X86-NEXT: andl $858993459, %eax # imm = 0x33333333			; X86-NEXT: andl $858993459, %eax # imm = 0x33333333
	; X86-NEXT: leal (%eax,%ecx,4), %eax			; X86-NEXT: leal (%eax,%ecx,4), %ecx
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %ecx, %eax
	; X86-NEXT: andl $5592405, %ecx # imm = 0x555555			; X86-NEXT: andl $5592405, %eax # imm = 0x555555
	; X86-NEXT: shrl %eax			; X86-NEXT: shll $6, %ecx
	; X86-NEXT: andl $22369621, %eax # imm = 0x1555555			; X86-NEXT: andl $-1431655808, %ecx # imm = 0xAAAAAA80
	; X86-NEXT: leal (%eax,%ecx,2), %eax			; X86-NEXT: shll $8, %eax
	; X86-NEXT: shll $7, %eax			; X86-NEXT: orl %ecx, %eax
	; X86-NEXT: bswapl %eax			; X86-NEXT: bswapl %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: andl $986895, %ecx # imm = 0xF0F0F			; X86-NEXT: andl $986895, %ecx # imm = 0xF0F0F
	; X86-NEXT: shll $4, %ecx			; X86-NEXT: shll $4, %ecx
	; X86-NEXT: shrl $4, %eax			; X86-NEXT: shrl $4, %eax
	; X86-NEXT: andl $135204623, %eax # imm = 0x80F0F0F			; X86-NEXT: andl $135204623, %eax # imm = 0x80F0F0F
	; X86-NEXT: orl %ecx, %eax			; X86-NEXT: orl %ecx, %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	[20 lines not shown]
	; X64-NEXT: orl %eax, %edi			; X64-NEXT: orl %eax, %edi
	; X64-NEXT: movl %edi, %eax			; X64-NEXT: movl %edi, %eax
	; X64-NEXT: andl $858993459, %eax # imm = 0x33333333			; X64-NEXT: andl $858993459, %eax # imm = 0x33333333
	; X64-NEXT: shrl $2, %edi			; X64-NEXT: shrl $2, %edi
	; X64-NEXT: andl $858993459, %edi # imm = 0x33333333			; X64-NEXT: andl $858993459, %edi # imm = 0x33333333
	; X64-NEXT: leal (%rdi,%rax,4), %eax			; X64-NEXT: leal (%rdi,%rax,4), %eax
	; X64-NEXT: movl %eax, %ecx			; X64-NEXT: movl %eax, %ecx
	; X64-NEXT: andl $5592405, %ecx # imm = 0x555555			; X64-NEXT: andl $5592405, %ecx # imm = 0x555555
	; X64-NEXT: shrl %eax			; X64-NEXT: shll $6, %eax
	; X64-NEXT: andl $22369621, %eax # imm = 0x1555555			; X64-NEXT: andl $-1431655808, %eax # imm = 0xAAAAAA80
	; X64-NEXT: leal (%rax,%rcx,2), %eax			; X64-NEXT: shll $8, %ecx
	; X64-NEXT: shll $7, %eax			; X64-NEXT: orl %eax, %ecx
	; X64-NEXT: bswapl %eax			; X64-NEXT: bswapl %ecx
	; X64-NEXT: movl %eax, %ecx			; X64-NEXT: movl %ecx, %eax
	; X64-NEXT: andl $986895, %ecx # imm = 0xF0F0F			; X64-NEXT: andl $986895, %eax # imm = 0xF0F0F
	; X64-NEXT: shll $4, %ecx			; X64-NEXT: shll $4, %eax
	; X64-NEXT: shrl $4, %eax			; X64-NEXT: shrl $4, %ecx
	; X64-NEXT: andl $135204623, %eax # imm = 0x80F0F0F			; X64-NEXT: andl $135204623, %ecx # imm = 0x80F0F0F
	; X64-NEXT: orl %ecx, %eax			; X64-NEXT: orl %eax, %ecx
	; X64-NEXT: movl %eax, %ecx			; X64-NEXT: movl %ecx, %eax
	; X64-NEXT: andl $3355443, %ecx # imm = 0x333333			; X64-NEXT: andl $3355443, %eax # imm = 0x333333
	; X64-NEXT: shrl $2, %eax			; X64-NEXT: shrl $2, %ecx
	; X64-NEXT: andl $36909875, %eax # imm = 0x2333333			; X64-NEXT: andl $36909875, %ecx # imm = 0x2333333
	; X64-NEXT: leal (%rax,%rcx,4), %eax			; X64-NEXT: leal (%rcx,%rax,4), %eax
	; X64-NEXT: movl %eax, %ecx			; X64-NEXT: movl %eax, %ecx
	; X64-NEXT: andl $1431655765, %ecx # imm = 0x55555555			; X64-NEXT: andl $1431655765, %ecx # imm = 0x55555555
	; X64-NEXT: shrl %eax			; X64-NEXT: shrl %eax
	; X64-NEXT: andl $1431655765, %eax # imm = 0x55555555			; X64-NEXT: andl $1431655765, %eax # imm = 0x55555555
	; X64-NEXT: leal (%rax,%rcx,2), %eax			; X64-NEXT: leal (%rax,%rcx,2), %eax
	; X64-NEXT: retq			; X64-NEXT: retq
	%b = call i32 @llvm.bitreverse.i32(i32 %a0)			%b = call i32 @llvm.bitreverse.i32(i32 %a0)
	%c = shl i32 %b, 7			%c = shl i32 %b, 7
	[14 lines not shown]
	; X86-NEXT: orl %ecx, %eax			; X86-NEXT: orl %ecx, %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: andl $858993459, %ecx # imm = 0x33333333			; X86-NEXT: andl $858993459, %ecx # imm = 0x33333333
	; X86-NEXT: shrl $2, %eax			; X86-NEXT: shrl $2, %eax
	; X86-NEXT: andl $858993459, %eax # imm = 0x33333333			; X86-NEXT: andl $858993459, %eax # imm = 0x33333333
	; X86-NEXT: leal (%eax,%ecx,4), %eax			; X86-NEXT: leal (%eax,%ecx,4), %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: andl $357913941, %ecx # imm = 0x15555555			; X86-NEXT: andl $357913941, %ecx # imm = 0x15555555
	; X86-NEXT: shrl %eax			; X86-NEXT: andl $-1431655766, %eax # imm = 0xAAAAAAAA
	; X86-NEXT: andl $1431655765, %eax # imm = 0x55555555			; X86-NEXT: leal (%eax,%ecx,4), %eax
	; X86-NEXT: leal (%eax,%ecx,2), %eax
	; X86-NEXT: addl %eax, %eax
	; X86-NEXT: bswapl %eax			; X86-NEXT: bswapl %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: andl $235867919, %ecx # imm = 0xE0F0F0F			; X86-NEXT: andl $235867919, %ecx # imm = 0xE0F0F0F
	; X86-NEXT: shll $4, %ecx			; X86-NEXT: shll $4, %ecx
	; X86-NEXT: shrl $4, %eax			; X86-NEXT: shrl $4, %eax
	; X86-NEXT: andl $252645135, %eax # imm = 0xF0F0F0F			; X86-NEXT: andl $252645135, %eax # imm = 0xF0F0F0F
	; X86-NEXT: orl %ecx, %eax			; X86-NEXT: orl %ecx, %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx

llvm/test/CodeGen/X86/is_fpclass.ll

	; CHECK-32-NEXT: flds {{[0-9]+}}(%esp)			; CHECK-32-NEXT: flds {{[0-9]+}}(%esp)
	; CHECK-32-NEXT: flds {{[0-9]+}}(%esp)			; CHECK-32-NEXT: flds {{[0-9]+}}(%esp)
	; CHECK-32-NEXT: flds {{[0-9]+}}(%esp)			; CHECK-32-NEXT: flds {{[0-9]+}}(%esp)
	; CHECK-32-NEXT: fucomp %st(0)			; CHECK-32-NEXT: fucomp %st(0)
	; CHECK-32-NEXT: fnstsw %ax			; CHECK-32-NEXT: fnstsw %ax
	; CHECK-32-NEXT: # kill: def $ah killed $ah killed $ax			; CHECK-32-NEXT: # kill: def $ah killed $ah killed $ax
	; CHECK-32-NEXT: sahf			; CHECK-32-NEXT: sahf
	; CHECK-32-NEXT: setp %dh			; CHECK-32-NEXT: setp %dh
				; CHECK-32-NEXT: shlb $2, %dh
	; CHECK-32-NEXT: fucomp %st(0)			; CHECK-32-NEXT: fucomp %st(0)
	; CHECK-32-NEXT: fnstsw %ax			; CHECK-32-NEXT: fnstsw %ax
	; CHECK-32-NEXT: # kill: def $ah killed $ah killed $ax			; CHECK-32-NEXT: # kill: def $ah killed $ah killed $ax
	; CHECK-32-NEXT: sahf			; CHECK-32-NEXT: sahf
	; CHECK-32-NEXT: setp %dl			; CHECK-32-NEXT: setp %dl
	; CHECK-32-NEXT: addb %dl, %dl			; CHECK-32-NEXT: shlb $3, %dl
	; CHECK-32-NEXT: orb %dh, %dl			; CHECK-32-NEXT: orb %dh, %dl
	; CHECK-32-NEXT: fucomp %st(0)			; CHECK-32-NEXT: fucomp %st(0)
	; CHECK-32-NEXT: fnstsw %ax			; CHECK-32-NEXT: fnstsw %ax
	; CHECK-32-NEXT: # kill: def $ah killed $ah killed $ax			; CHECK-32-NEXT: # kill: def $ah killed $ah killed $ax
	; CHECK-32-NEXT: sahf			; CHECK-32-NEXT: sahf
	; CHECK-32-NEXT: setp %dh			; CHECK-32-NEXT: setp %dh
	; CHECK-32-NEXT: fucomp %st(0)			; CHECK-32-NEXT: fucomp %st(0)
	; CHECK-32-NEXT: fnstsw %ax			; CHECK-32-NEXT: fnstsw %ax
	; CHECK-32-NEXT: # kill: def $ah killed $ah killed $ax			; CHECK-32-NEXT: # kill: def $ah killed $ah killed $ax
	; CHECK-32-NEXT: sahf			; CHECK-32-NEXT: sahf
	; CHECK-32-NEXT: setp %al			; CHECK-32-NEXT: setp %al
	; CHECK-32-NEXT: addb %al, %al			; CHECK-32-NEXT: addb %al, %al
	; CHECK-32-NEXT: orb %dh, %al			; CHECK-32-NEXT: orb %dh, %al
	; CHECK-32-NEXT: shlb $2, %al
	; CHECK-32-NEXT: orb %dl, %al			; CHECK-32-NEXT: orb %dl, %al
	; CHECK-32-NEXT: movb %al, (%ecx)			; CHECK-32-NEXT: movb %al, (%ecx)
	; CHECK-32-NEXT: movl %ecx, %eax			; CHECK-32-NEXT: movl %ecx, %eax
	; CHECK-32-NEXT: retl $4			; CHECK-32-NEXT: retl $4
	;			;
	; CHECK-64-LABEL: isnan_v4f:			; CHECK-64-LABEL: isnan_v4f:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-NEXT: cmpunordps %xmm0, %xmm0			; CHECK-64-NEXT: cmpunordps %xmm0, %xmm0
	[10 lines not shown]
	; CHECK-32-NEXT: .cfi_def_cfa_offset 8			; CHECK-32-NEXT: .cfi_def_cfa_offset 8
	; CHECK-32-NEXT: .cfi_offset %esi, -8			; CHECK-32-NEXT: .cfi_offset %esi, -8
	; CHECK-32-NEXT: movl {{[0-9]+}}(%esp), %eax			; CHECK-32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; CHECK-32-NEXT: movl $2147483647, %ecx # imm = 0x7FFFFFFF			; CHECK-32-NEXT: movl $2147483647, %ecx # imm = 0x7FFFFFFF
	; CHECK-32-NEXT: movl {{[0-9]+}}(%esp), %edx			; CHECK-32-NEXT: movl {{[0-9]+}}(%esp), %edx
	; CHECK-32-NEXT: andl %ecx, %edx			; CHECK-32-NEXT: andl %ecx, %edx
	; CHECK-32-NEXT: cmpl $2139095041, %edx # imm = 0x7F800001			; CHECK-32-NEXT: cmpl $2139095041, %edx # imm = 0x7F800001
	; CHECK-32-NEXT: setge %dh			; CHECK-32-NEXT: setge %dh
				; CHECK-32-NEXT: shlb $2, %dh
	; CHECK-32-NEXT: movl {{[0-9]+}}(%esp), %esi			; CHECK-32-NEXT: movl {{[0-9]+}}(%esp), %esi
	; CHECK-32-NEXT: andl %ecx, %esi			; CHECK-32-NEXT: andl %ecx, %esi
	; CHECK-32-NEXT: cmpl $2139095041, %esi # imm = 0x7F800001			; CHECK-32-NEXT: cmpl $2139095041, %esi # imm = 0x7F800001
	; CHECK-32-NEXT: setge %dl			; CHECK-32-NEXT: setge %dl
	; CHECK-32-NEXT: addb %dl, %dl			; CHECK-32-NEXT: shlb $3, %dl
	; CHECK-32-NEXT: orb %dh, %dl			; CHECK-32-NEXT: orb %dh, %dl
	; CHECK-32-NEXT: movl {{[0-9]+}}(%esp), %esi			; CHECK-32-NEXT: movl {{[0-9]+}}(%esp), %esi
	; CHECK-32-NEXT: andl %ecx, %esi			; CHECK-32-NEXT: andl %ecx, %esi
	; CHECK-32-NEXT: cmpl $2139095041, %esi # imm = 0x7F800001			; CHECK-32-NEXT: cmpl $2139095041, %esi # imm = 0x7F800001
	; CHECK-32-NEXT: setge %dh			; CHECK-32-NEXT: setge %dh
	; CHECK-32-NEXT: andl {{[0-9]+}}(%esp), %ecx			; CHECK-32-NEXT: andl {{[0-9]+}}(%esp), %ecx
	; CHECK-32-NEXT: cmpl $2139095041, %ecx # imm = 0x7F800001			; CHECK-32-NEXT: cmpl $2139095041, %ecx # imm = 0x7F800001
	; CHECK-32-NEXT: setge %cl			; CHECK-32-NEXT: setge %cl
	; CHECK-32-NEXT: addb %cl, %cl			; CHECK-32-NEXT: addb %cl, %cl
	; CHECK-32-NEXT: orb %dh, %cl			; CHECK-32-NEXT: orb %dh, %cl
	; CHECK-32-NEXT: shlb $2, %cl
	; CHECK-32-NEXT: orb %dl, %cl			; CHECK-32-NEXT: orb %dl, %cl
	; CHECK-32-NEXT: movb %cl, (%eax)			; CHECK-32-NEXT: movb %cl, (%eax)
	; CHECK-32-NEXT: popl %esi			; CHECK-32-NEXT: popl %esi
	; CHECK-32-NEXT: .cfi_def_cfa_offset 4			; CHECK-32-NEXT: .cfi_def_cfa_offset 4
	; CHECK-32-NEXT: retl $4			; CHECK-32-NEXT: retl $4
	;			;
	; CHECK-64-LABEL: isnan_v4f_strictfp:			; CHECK-64-LABEL: isnan_v4f_strictfp:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry

llvm/test/CodeGen/X86/vector-sext.ll

	; SSE2-NEXT: movd %ecx, %xmm0			; SSE2-NEXT: movd %ecx, %xmm0
	; SSE2-NEXT: movq %rax, %rcx			; SSE2-NEXT: movq %rax, %rcx
	; SSE2-NEXT: shrq $17, %rcx			; SSE2-NEXT: shrq $17, %rcx
	; SSE2-NEXT: shll $15, %ecx			; SSE2-NEXT: shll $15, %ecx
	; SSE2-NEXT: sarl $15, %ecx			; SSE2-NEXT: sarl $15, %ecx
	; SSE2-NEXT: movd %ecx, %xmm1			; SSE2-NEXT: movd %ecx, %xmm1
	; SSE2-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]			; SSE2-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	; SSE2-NEXT: movl 8(%rdi), %ecx			; SSE2-NEXT: movl 8(%rdi), %ecx
	; SSE2-NEXT: shll $13, %ecx			; SSE2-NEXT: shll $28, %ecx
	; SSE2-NEXT: movq %rax, %rdx			; SSE2-NEXT: movq %rax, %rdx
	; SSE2-NEXT: shrq $51, %rdx			; SSE2-NEXT: shrq $51, %rdx
	; SSE2-NEXT: orl %ecx, %edx
	; SSE2-NEXT: shll $15, %edx			; SSE2-NEXT: shll $15, %edx
				; SSE2-NEXT: orl %ecx, %edx
	; SSE2-NEXT: sarl $15, %edx			; SSE2-NEXT: sarl $15, %edx
	; SSE2-NEXT: movd %edx, %xmm1			; SSE2-NEXT: movd %edx, %xmm1
	; SSE2-NEXT: shrq $34, %rax			; SSE2-NEXT: shrq $34, %rax
	; SSE2-NEXT: shll $15, %eax			; SSE2-NEXT: shll $15, %eax
	; SSE2-NEXT: sarl $15, %eax			; SSE2-NEXT: sarl $15, %eax
	; SSE2-NEXT: movd %eax, %xmm2			; SSE2-NEXT: movd %eax, %xmm2
	; SSE2-NEXT: punpckldq {{.*#+}} xmm2 = xmm2[0],xmm1[0],xmm2[1],xmm1[1]			; SSE2-NEXT: punpckldq {{.*#+}} xmm2 = xmm2[0],xmm1[0],xmm2[1],xmm1[1]
	; SSE2-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm2[0]			; SSE2-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm2[0]
	; SSE2-NEXT: retq			; SSE2-NEXT: retq
	;			;
	; SSSE3-LABEL: sext_4i17_to_4i32:			; SSSE3-LABEL: sext_4i17_to_4i32:
	; SSSE3: # %bb.0:			; SSSE3: # %bb.0:
	; SSSE3-NEXT: movq (%rdi), %rax			; SSSE3-NEXT: movq (%rdi), %rax
	; SSSE3-NEXT: movl %eax, %ecx			; SSSE3-NEXT: movl %eax, %ecx
	; SSSE3-NEXT: shll $15, %ecx			; SSSE3-NEXT: shll $15, %ecx
	; SSSE3-NEXT: sarl $15, %ecx			; SSSE3-NEXT: sarl $15, %ecx
	; SSSE3-NEXT: movd %ecx, %xmm0			; SSSE3-NEXT: movd %ecx, %xmm0
	; SSSE3-NEXT: movq %rax, %rcx			; SSSE3-NEXT: movq %rax, %rcx
	; SSSE3-NEXT: shrq $17, %rcx			; SSSE3-NEXT: shrq $17, %rcx
	; SSSE3-NEXT: shll $15, %ecx			; SSSE3-NEXT: shll $15, %ecx
	; SSSE3-NEXT: sarl $15, %ecx			; SSSE3-NEXT: sarl $15, %ecx
	; SSSE3-NEXT: movd %ecx, %xmm1			; SSSE3-NEXT: movd %ecx, %xmm1
	; SSSE3-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]			; SSSE3-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	; SSSE3-NEXT: movl 8(%rdi), %ecx			; SSSE3-NEXT: movl 8(%rdi), %ecx
	; SSSE3-NEXT: shll $13, %ecx			; SSSE3-NEXT: shll $28, %ecx
	; SSSE3-NEXT: movq %rax, %rdx			; SSSE3-NEXT: movq %rax, %rdx
	; SSSE3-NEXT: shrq $51, %rdx			; SSSE3-NEXT: shrq $51, %rdx
	; SSSE3-NEXT: orl %ecx, %edx
	; SSSE3-NEXT: shll $15, %edx			; SSSE3-NEXT: shll $15, %edx
				; SSSE3-NEXT: orl %ecx, %edx
	; SSSE3-NEXT: sarl $15, %edx			; SSSE3-NEXT: sarl $15, %edx
	; SSSE3-NEXT: movd %edx, %xmm1			; SSSE3-NEXT: movd %edx, %xmm1
	; SSSE3-NEXT: shrq $34, %rax			; SSSE3-NEXT: shrq $34, %rax
	; SSSE3-NEXT: shll $15, %eax			; SSSE3-NEXT: shll $15, %eax
	; SSSE3-NEXT: sarl $15, %eax			; SSSE3-NEXT: sarl $15, %eax
	; SSSE3-NEXT: movd %eax, %xmm2			; SSSE3-NEXT: movd %eax, %xmm2
	; SSSE3-NEXT: punpckldq {{.*#+}} xmm2 = xmm2[0],xmm1[0],xmm2[1],xmm1[1]			; SSSE3-NEXT: punpckldq {{.*#+}} xmm2 = xmm2[0],xmm1[0],xmm2[1],xmm1[1]
	; SSSE3-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm2[0]			; SSSE3-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm2[0]
	[12 lines not shown]
	; SSE41-NEXT: movd %edx, %xmm0			; SSE41-NEXT: movd %edx, %xmm0
	; SSE41-NEXT: pinsrd $1, %ecx, %xmm0			; SSE41-NEXT: pinsrd $1, %ecx, %xmm0
	; SSE41-NEXT: movq %rax, %rcx			; SSE41-NEXT: movq %rax, %rcx
	; SSE41-NEXT: shrq $34, %rcx			; SSE41-NEXT: shrq $34, %rcx
	; SSE41-NEXT: shll $15, %ecx			; SSE41-NEXT: shll $15, %ecx
	; SSE41-NEXT: sarl $15, %ecx			; SSE41-NEXT: sarl $15, %ecx
	; SSE41-NEXT: pinsrd $2, %ecx, %xmm0			; SSE41-NEXT: pinsrd $2, %ecx, %xmm0
	; SSE41-NEXT: movl 8(%rdi), %ecx			; SSE41-NEXT: movl 8(%rdi), %ecx
	; SSE41-NEXT: shll $13, %ecx			; SSE41-NEXT: shll $28, %ecx
	; SSE41-NEXT: shrq $51, %rax			; SSE41-NEXT: shrq $51, %rax
	; SSE41-NEXT: orl %ecx, %eax
	; SSE41-NEXT: shll $15, %eax			; SSE41-NEXT: shll $15, %eax
				; SSE41-NEXT: orl %ecx, %eax
	; SSE41-NEXT: sarl $15, %eax			; SSE41-NEXT: sarl $15, %eax
	; SSE41-NEXT: pinsrd $3, %eax, %xmm0			; SSE41-NEXT: pinsrd $3, %eax, %xmm0
	; SSE41-NEXT: retq			; SSE41-NEXT: retq
	;			;
	; AVX-LABEL: sext_4i17_to_4i32:			; AVX-LABEL: sext_4i17_to_4i32:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: movq (%rdi), %rax			; AVX-NEXT: movq (%rdi), %rax
	; AVX-NEXT: movq %rax, %rcx			; AVX-NEXT: movq %rax, %rcx
	; AVX-NEXT: shrq $17, %rcx			; AVX-NEXT: shrq $17, %rcx
	; AVX-NEXT: shll $15, %ecx			; AVX-NEXT: shll $15, %ecx
	; AVX-NEXT: sarl $15, %ecx			; AVX-NEXT: sarl $15, %ecx
	; AVX-NEXT: movl %eax, %edx			; AVX-NEXT: movl %eax, %edx
	; AVX-NEXT: shll $15, %edx			; AVX-NEXT: shll $15, %edx
	; AVX-NEXT: sarl $15, %edx			; AVX-NEXT: sarl $15, %edx
	; AVX-NEXT: vmovd %edx, %xmm0			; AVX-NEXT: vmovd %edx, %xmm0
	; AVX-NEXT: vpinsrd $1, %ecx, %xmm0, %xmm0			; AVX-NEXT: vpinsrd $1, %ecx, %xmm0, %xmm0
	; AVX-NEXT: movq %rax, %rcx			; AVX-NEXT: movq %rax, %rcx
	; AVX-NEXT: shrq $34, %rcx			; AVX-NEXT: shrq $34, %rcx
	; AVX-NEXT: shll $15, %ecx			; AVX-NEXT: shll $15, %ecx
	; AVX-NEXT: sarl $15, %ecx			; AVX-NEXT: sarl $15, %ecx
	; AVX-NEXT: vpinsrd $2, %ecx, %xmm0, %xmm0			; AVX-NEXT: vpinsrd $2, %ecx, %xmm0, %xmm0
	; AVX-NEXT: movl 8(%rdi), %ecx			; AVX-NEXT: movl 8(%rdi), %ecx
	; AVX-NEXT: shll $13, %ecx			; AVX-NEXT: shll $28, %ecx
	; AVX-NEXT: shrq $51, %rax			; AVX-NEXT: shrq $51, %rax
	; AVX-NEXT: orl %ecx, %eax
	; AVX-NEXT: shll $15, %eax			; AVX-NEXT: shll $15, %eax
				; AVX-NEXT: orl %ecx, %eax
	; AVX-NEXT: sarl $15, %eax			; AVX-NEXT: sarl $15, %eax
	; AVX-NEXT: vpinsrd $3, %eax, %xmm0, %xmm0			; AVX-NEXT: vpinsrd $3, %eax, %xmm0, %xmm0
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; X86-SSE2-LABEL: sext_4i17_to_4i32:			; X86-SSE2-LABEL: sext_4i17_to_4i32:
	; X86-SSE2: # %bb.0:			; X86-SSE2: # %bb.0:
	; X86-SSE2-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-SSE2-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-SSE2-NEXT: movl (%edx), %ecx			; X86-SSE2-NEXT: movl (%edx), %ecx

Revision Contents

Diff 468922
