This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Translate shufflevector
Closed · Public

Authored by volkan on Mar 14 2017, 5:17 PM.

Details

Reviewers

qcolombet
dsanders
t.p.northover
ab
javed.absar
aditya_nandakumar

Commits

rG75bdc7690e61: [GlobalISel] Translate shufflevector
rL298347: [GlobalISel] Translate shufflevector

Diff Detail

Event Timeline

volkan created this revision. Mar 14 2017, 5:17 PM

Herald added a reviewer: javed.absar. Mar 14 2017, 5:17 PM

Herald added subscribers: kristof.beyls, rovka, dberris.

Thanks Volkan.
Adding a zip to the test above might make the testing stronger, i.e., something like shufflevector <8 x i8> %tmp1, <8 x i8> %tmp2, <16 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11, i32 4, i32 12, i32 5, i32 13, i32 6, i32 14, i32 7, i32 15>

Also, would you consider adding a test for ARM (32-bit) as well? It need not be a big one.
Best Regards.
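For readers less familiar with the zip mask suggested above, here is a minimal sketch of what such a shuffle computes. It models constant-mask shufflevector semantics on plain integer vectors; the helper names `shuffleVector` and `makeZipMask` are hypothetical and not part of LLVM, and undef mask elements are not modelled.

```cpp
#include <cstddef>
#include <vector>

// Reference semantics of a shufflevector with a constant mask: a mask index
// Idx in [0, N) selects V1[Idx], and Idx in [N, 2N) selects V2[Idx - N],
// where N is the number of elements in each input vector.
std::vector<int> shuffleVector(const std::vector<int> &V1,
                               const std::vector<int> &V2,
                               const std::vector<unsigned> &Mask) {
  const std::size_t N = V1.size();
  std::vector<int> Result;
  Result.reserve(Mask.size());
  for (unsigned Idx : Mask)
    Result.push_back(Idx < N ? V1[Idx] : V2[Idx - N]);
  return Result;
}

// Builds the interleaving ("zip") mask <0, N, 1, N+1, ...> for two
// N-element inputs (hypothetical helper for illustration only).
std::vector<unsigned> makeZipMask(unsigned N) {
  std::vector<unsigned> Mask;
  Mask.reserve(2 * N);
  for (unsigned I = 0; I != N; ++I) {
    Mask.push_back(I);
    Mask.push_back(N + I);
  }
  return Mask;
}
```

With N = 8, `makeZipMask` yields the 16-element mask from the comment above, so the result alternates elements of the first and second input.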

Added more tests.

Thanks Volkan. LGTM.

This revision is now accepted and ready to land. Mar 18 2017, 9:09 AM

We don't need to do this now, but should we encode the mask in the G_SHUFFLE_VECTOR itself? Either as multiple Imm MOs, or a single new 'VectorMask' operand of some sort?

test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
Line 1423: Should we turn this into an extractelt in the IRTranslator? Seems like having vector ops 0,1,2 is a nice invariant. (which would be nice to check in the verifier, or even in the MIR parser, if we ever add type constraints in GenericOpcodes.td)
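One way to picture this suggestion is the sketch below, which decides whether a shuffle with a one-element mask is equivalent to extracting a single element from one of its inputs. This is only an illustration of the idea, not the IRTranslator change itself; `ExtractPlan` and `asExtractElement` are hypothetical names, and the mask is modelled as a plain vector of indices.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical description of an extract that could replace a shuffle.
struct ExtractPlan {
  unsigned SourceOperand; // 0 selects the first input vector, 1 the second.
  unsigned Index;         // Element index within that input.
};

// A shuffle whose mask has exactly one element picks a single element from
// the concatenation of its two inputs, so it can be described as an
// extractelement. Any other mask size is left alone (std::nullopt).
std::optional<ExtractPlan>
asExtractElement(std::size_t NumSrcElts, const std::vector<unsigned> &Mask) {
  if (Mask.size() != 1)
    return std::nullopt;
  const unsigned Idx = Mask[0];
  if (Idx < NumSrcElts)
    return ExtractPlan{0, Idx};
  return ExtractPlan{1, static_cast<unsigned>(Idx - NumSrcElts)};
}
```

For the test under discussion (two-element inputs, mask `<i32 1>`), this reports element 1 of the first input.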

In D30962#704916, @ab wrote:

We don't need to do this now, but should we encode the mask in the G_SHUFFLE_VECTOR itself? Either as multiple Imm MOs, or a single new 'VectorMask' operand of some sort?

I think we should probably stick with a single operand to represent the mask since some targets can use vector registers to specify the shuffle mask (e.g. VSHF.df on Mips), but I do agree that a single operand containing a whole constant mask would be nicer to match than the current G_MERGE_VALUES of G_CONSTANTS since there's lots of ways of writing the same constant right now. For example:

(G_MERGE_VALUES (G_CONSTANT i32 0), (G_CONSTANT i32 1), (G_CONSTANT i32 2), (G_CONSTANT i32 3))

and:

(G_MERGE_VALUES (G_MERGE_VALUES (G_CONSTANT i32 0), (G_CONSTANT i32 1)), (G_MERGE_VALUES (G_CONSTANT i32 2), (G_CONSTANT i32 3)))

are the same. The snag is that we would need to limit any flattening to the supportable masks (or be able to revert the transformation).
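To make the equivalence concrete, here is a small sketch that flattens both spellings into the same list of mask indices. It uses a toy tree type rather than real MachineInstrs; `MaskNode`, `constant`, `merge`, and `flattenMask` are hypothetical names, and the sketch does not address the snag about restricting flattening to supportable masks.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Toy expression tree: a node is either a constant leaf or a merge of
// sub-nodes. It only mimics the shape of G_CONSTANT / G_MERGE_VALUES.
struct MaskNode {
  bool IsConstant;
  std::int64_t Value;             // Meaningful only when IsConstant is true.
  std::vector<MaskNode> Operands; // Meaningful only when IsConstant is false.
};

MaskNode constant(std::int64_t V) { return MaskNode{true, V, {}}; }

MaskNode merge(std::vector<MaskNode> Ops) {
  return MaskNode{false, 0, std::move(Ops)};
}

// Appends the constants reachable from N to Out in left-to-right order.
void flattenInto(const MaskNode &N, std::vector<std::int64_t> &Out) {
  if (N.IsConstant) {
    Out.push_back(N.Value);
    return;
  }
  for (const MaskNode &Op : N.Operands)
    flattenInto(Op, Out);
}

// Returns the flat list of mask indices described by a (possibly nested) merge.
std::vector<std::int64_t> flattenMask(const MaskNode &N) {
  std::vector<std::int64_t> Out;
  flattenInto(N, Out);
  return Out;
}
```

Both the flat four-operand merge and the nested two-by-two merge flatten to the indices 0, 1, 2, 3, which is why a single canonical mask operand would be easier to match.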

While I'm thinking about legalizing shuffles, it would be nice in some ways if LegalizeAction were an object so we could write something like:

extern bool isReversedElements(const MachineOperand &);
extern bool isInterleavedElements(const MachineOperand &);

for (const auto &Ty : {v2s64, v4s32, v8s16, v16s8}) {
  setAction({G_SHUFFLE_VECTOR, Ty}, ContentSensitiveLegalizer(/* Default case */ Lower));
  auto &Legalizer = getAction({G_SHUFFLE_VECTOR, Ty});
  if (hasFoo()) {
    Legalizer.addCase(isReversedElements, Legal);
    Legalizer.addCase(isInterleavedElements, Legal);
  }
  if (hasBar()) {
    Legalizer.addCase([](const MachineOperand &Op) { ... each 4-elements sub-sequence only shuffles within itself ... }, Legal);
    // Register-based shuffles handle everything.
    Legalizer.setDefaultCase(Legal);
  }
}

We can achieve the same effect with Custom but it seems nice to keep the high-level conditions together in AArch64LegalizerInfo::AArch64LegalizerInfo() instead of distributing them across various callbacks and re-checking them for every MachineInstr.
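As a rough illustration of how such an object could behave, here is a self-contained sketch. It is not an existing LLVM API: `ContentSensitiveLegalizer`, `Action`, and the predicates below are toy stand-ins, the predicates inspect a plain vector of mask indices instead of a MachineOperand, and the exact meaning of "reversed" and "interleaved" is an assumption (result and inputs are taken to have the same element count).

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

enum class Action { Legal, Lower };
using Mask = std::vector<unsigned>;
using MaskPredicate = std::function<bool(const Mask &)>;

// Hypothetical content-sensitive action: the first matching case wins,
// otherwise the default action applies.
class ContentSensitiveLegalizer {
public:
  explicit ContentSensitiveLegalizer(Action Default) : DefaultAction(Default) {}

  void addCase(MaskPredicate Pred, Action A) {
    Cases.emplace_back(std::move(Pred), A);
  }
  void setDefaultCase(Action A) { DefaultAction = A; }

  Action getAction(const Mask &M) const {
    for (const auto &Case : Cases)
      if (Case.first(M))
        return Case.second;
    return DefaultAction;
  }

private:
  std::vector<std::pair<MaskPredicate, Action>> Cases;
  Action DefaultAction;
};

// Assumed meaning: the mask is <N-1, ..., 1, 0> for an N-element mask.
bool isReversedElements(const Mask &M) {
  for (std::size_t I = 0; I != M.size(); ++I)
    if (M[I] != M.size() - 1 - I)
      return false;
  return true;
}

// Assumed meaning: the mask is <0, N, 1, N+1, ...> truncated to N elements,
// i.e. it interleaves the low halves of two N-element inputs.
bool isInterleavedElements(const Mask &M) {
  const std::size_t N = M.size();
  for (std::size_t I = 0; I != N; ++I)
    if (M[I] != (I % 2 == 0 ? I / 2 : N + I / 2))
      return false;
  return true;
}
```

Registering `isReversedElements` and `isInterleavedElements` as Legal cases with a Lower default mirrors the `hasFoo()` branch above; calling `setDefaultCase(Legal)` afterwards mirrors the register-based fallback in the `hasBar()` branch.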

In D30962#705203, @dsanders wrote:

In D30962#704916, @ab wrote:

We don't need to do this now, but should we encode the mask in the G_SHUFFLE_VECTOR itself? Either as multiple Imm MOs, or a single new 'VectorMask' operand of some sort?

I think we should probably stick with a single operand to represent the mask since some targets can use vector registers to specify the shuffle mask (e.g. VSHF.df on Mips), but I do agree that a single operand containing a whole constant mask would be nicer to match than the current G_MERGE_VALUES of G_CONSTANTS since there's lots of ways of writing the same constant right now. For example:

I agree, a single operand would be better. In this way, we can simplify constant vectors as well and this problem would be fixed automatically as the mask is a constant vector.

volkan closed this revision. Mar 21 2017, 1:56 AM

Revision Contents

Path                                                    Size
include/llvm/CodeGen/GlobalISel/IRTranslator.h          5 lines
include/llvm/Target/GenericOpcodes.td                   7 lines
include/llvm/Target/TargetOpcodes.def                   5 lines
lib/CodeGen/GlobalISel/IRTranslator.cpp                 10 lines
test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll   108 lines
test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll         80 lines

Diff 92187

include/llvm/CodeGen/GlobalISel/IRTranslator.h

private:
}		}

bool translateVAArg(const User &U, MachineIRBuilder &MIRBuilder);		bool translateVAArg(const User &U, MachineIRBuilder &MIRBuilder);

bool translateInsertElement(const User &U, MachineIRBuilder &MIRBuilder);		bool translateInsertElement(const User &U, MachineIRBuilder &MIRBuilder);

bool translateExtractElement(const User &U, MachineIRBuilder &MIRBuilder);		bool translateExtractElement(const User &U, MachineIRBuilder &MIRBuilder);

		bool translateShuffleVector(const User &U, MachineIRBuilder &MIRBuilder);

// Stubs to keep the compiler happy while we implement the rest of the		// Stubs to keep the compiler happy while we implement the rest of the
// translation.		// translation.
bool translateResume(const User &U, MachineIRBuilder &MIRBuilder) {		bool translateResume(const User &U, MachineIRBuilder &MIRBuilder) {
return false;		return false;
}		}
bool translateCleanupRet(const User &U, MachineIRBuilder &MIRBuilder) {		bool translateCleanupRet(const User &U, MachineIRBuilder &MIRBuilder) {
return false;		return false;
}		}
bool translateCatchPad(const User &U, MachineIRBuilder &MIRBuilder) {
return false;		return false;
}		}
bool translateUserOp1(const User &U, MachineIRBuilder &MIRBuilder) {		bool translateUserOp1(const User &U, MachineIRBuilder &MIRBuilder) {
return false;		return false;
}		}
bool translateUserOp2(const User &U, MachineIRBuilder &MIRBuilder) {		bool translateUserOp2(const User &U, MachineIRBuilder &MIRBuilder) {
return false;		return false;
}		}
bool translateShuffleVector(const User &U, MachineIRBuilder &MIRBuilder) {
return false;
}

/// @}		/// @}

// Builder for machine instruction a la IRBuilder.		// Builder for machine instruction a la IRBuilder.
// I.e., compared to regular MIBuilder, this one also inserts the instruction		// I.e., compared to regular MIBuilder, this one also inserts the instruction
// in the current block, it can creates block, etc., basically a kind of		// in the current block, it can creates block, etc., basically a kind of
// IRBuilder, but for Machine IR.		// IRBuilder, but for Machine IR.
MachineIRBuilder CurBuilder;		MachineIRBuilder CurBuilder;

include/llvm/Target/GenericOpcodes.td

	// Generic extractelement.			// Generic extractelement.
	def G_EXTRACT_VECTOR_ELT : Instruction {			def G_EXTRACT_VECTOR_ELT : Instruction {
	let OutOperandList = (outs type0:$dst);			let OutOperandList = (outs type0:$dst);
	let InOperandList = (ins type1:$src, type2:$idx);			let InOperandList = (ins type1:$src, type2:$idx);
	let hasSideEffects = 0;			let hasSideEffects = 0;
	}			}

				// Generic shufflevector.
				def G_SHUFFLE_VECTOR: Instruction {
				let OutOperandList = (outs type0:$dst);
				let InOperandList = (ins type1:$v1, type1:$v2, type2:$mask);
				let hasSideEffects = 0;
				}

	// TODO: Add the other generic opcodes.			// TODO: Add the other generic opcodes.

include/llvm/Target/TargetOpcodes.def

	HANDLE_TARGET_OPCODE(G_BR)			HANDLE_TARGET_OPCODE(G_BR)

	/// Generic insertelement.			/// Generic insertelement.
	HANDLE_TARGET_OPCODE(G_INSERT_VECTOR_ELT)			HANDLE_TARGET_OPCODE(G_INSERT_VECTOR_ELT)

	/// Generic extractelement.			/// Generic extractelement.
	HANDLE_TARGET_OPCODE(G_EXTRACT_VECTOR_ELT)			HANDLE_TARGET_OPCODE(G_EXTRACT_VECTOR_ELT)

				/// Generic shufflevector.
				HANDLE_TARGET_OPCODE(G_SHUFFLE_VECTOR)

	// TODO: Add more generic opcodes as we move along.			// TODO: Add more generic opcodes as we move along.

	/// Marker for the end of the generic opcode.			/// Marker for the end of the generic opcode.
	/// This is used to check if an opcode is in the range of the			/// This is used to check if an opcode is in the range of the
	/// generic opcodes.			/// generic opcodes.
	HANDLE_TARGET_OPCODE_MARKER(PRE_ISEL_GENERIC_OPCODE_END, G_EXTRACT_VECTOR_ELT)			HANDLE_TARGET_OPCODE_MARKER(PRE_ISEL_GENERIC_OPCODE_END, G_SHUFFLE_VECTOR)

	/// BUILTIN_OP_END - This must be the last enum value in this list.			/// BUILTIN_OP_END - This must be the last enum value in this list.
	/// The target-specific post-isel opcode values start here.			/// The target-specific post-isel opcode values start here.
	HANDLE_TARGET_OPCODE_MARKER(GENERIC_OP_END, PRE_ISEL_GENERIC_OPCODE_END)			HANDLE_TARGET_OPCODE_MARKER(GENERIC_OP_END, PRE_ISEL_GENERIC_OPCODE_END)

lib/CodeGen/GlobalISel/IRTranslator.cpp

if (U.getOperand(0)->getType()->getVectorNumElements() == 1) {
return true;		return true;
}		}
MIRBuilder.buildExtractVectorElement(getOrCreateVReg(U),		MIRBuilder.buildExtractVectorElement(getOrCreateVReg(U),
getOrCreateVReg(*U.getOperand(0)),		getOrCreateVReg(*U.getOperand(0)),
getOrCreateVReg(*U.getOperand(1)));		getOrCreateVReg(*U.getOperand(1)));
return true;		return true;
}		}

		bool IRTranslator::translateShuffleVector(const User &U,
		MachineIRBuilder &MIRBuilder) {
		MIRBuilder.buildInstr(TargetOpcode::G_SHUFFLE_VECTOR)
		.addDef(getOrCreateVReg(U))
		.addUse(getOrCreateVReg(*U.getOperand(0)))
		.addUse(getOrCreateVReg(*U.getOperand(1)))
		.addUse(getOrCreateVReg(*U.getOperand(2)));
		return true;
		}

bool IRTranslator::translatePHI(const User &U, MachineIRBuilder &MIRBuilder) {		bool IRTranslator::translatePHI(const User &U, MachineIRBuilder &MIRBuilder) {
const PHINode &PI = cast<PHINode>(U);		const PHINode &PI = cast<PHINode>(U);
auto MIB = MIRBuilder.buildInstr(TargetOpcode::PHI);		auto MIB = MIRBuilder.buildInstr(TargetOpcode::PHI);
MIB.addDef(getOrCreateVReg(PI));		MIB.addDef(getOrCreateVReg(PI));

PendingPHIs.emplace_back(&PI, MIB.getInstr());		PendingPHIs.emplace_back(&PI, MIB.getInstr());
return true;		return true;
}		}

test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll

	; CHECK: [[C1:%[0-9]+]](s32) = G_CONSTANT i32 1			; CHECK: [[C1:%[0-9]+]](s32) = G_CONSTANT i32 1
	; CHECK-NOT: G_MERGE_VALUES			; CHECK-NOT: G_MERGE_VALUES
	; CHECK: G_ADD [[ARG]], [[C1]]			; CHECK: G_ADD [[ARG]], [[C1]]
	%vec = insertelement <1 x i32> undef, i32 %arg, i32 0			%vec = insertelement <1 x i32> undef, i32 %arg, i32 0
	%add = add <1 x i32> %vec, <i32 1>			%add = add <1 x i32> %vec, <i32 1>
	%res = extractelement <1 x i32> %add, i32 0			%res = extractelement <1 x i32> %add, i32 0
	ret i32 %res			ret i32 %res
	}			}

				define <2 x i32> @test_shufflevector_s32_v2s32(i32 %arg) {
				; CHECK-LABEL: name: test_shufflevector_s32_v2s32
				; CHECK: [[ARG:%[0-9]+]](s32) = COPY %w0
				; CHECK: [[UNDEF:%[0-9]+]](s32) = IMPLICIT_DEF
				; CHECK: [[C0:%[0-9]+]](s32) = G_CONSTANT i32 0
				; CHECK: [[MASK:%[0-9]+]](<2 x s32>) = G_MERGE_VALUES [[C0]](s32), [[C0]](s32)
				; CHECK: [[VEC:%[0-9]+]](<2 x s32>) = G_SHUFFLE_VECTOR [[ARG]](s32), [[UNDEF]], [[MASK]](<2 x s32>)
				; CHECK: %d0 = COPY [[VEC]](<2 x s32>)
				%vec = insertelement <1 x i32> undef, i32 %arg, i32 0
				%res = shufflevector <1 x i32> %vec, <1 x i32> undef, <2 x i32> zeroinitializer
				ret <2 x i32> %res
				}

				define i32 @test_shufflevector_v2s32_s32(<2 x i32> %arg) {
				; CHECK-LABEL: name: test_shufflevector_v2s32_s32
				; CHECK: [[ARG:%[0-9]+]](<2 x s32>) = COPY %d0
				; CHECK: [[UNDEF:%[0-9]+]](<2 x s32>) = IMPLICIT_DEF
				; CHECK: [[C1:%[0-9]+]](s32) = G_CONSTANT i32 1
				; CHECK: [[RES:%[0-9]+]](s32) = G_SHUFFLE_VECTOR [[ARG]](<2 x s32>), [[UNDEF]], [[C1]](s32)
				; CHECK: %w0 = COPY [[RES]](s32)
				%vec = shufflevector <2 x i32> %arg, <2 x i32> undef, <1 x i32> <i32 1>
				%res = extractelement <1 x i32> %vec, i32 0
				ret i32 %res
				}

				define <2 x i32> @test_shufflevector_v2s32_v2s32(<2 x i32> %arg) {
				; CHECK-LABEL: name: test_shufflevector_v2s32_v2s32
				; CHECK: [[ARG:%[0-9]+]](<2 x s32>) = COPY %d0
				; CHECK: [[UNDEF:%[0-9]+]](<2 x s32>) = IMPLICIT_DEF
				; CHECK: [[C1:%[0-9]+]](s32) = G_CONSTANT i32 1
				; CHECK: [[C0:%[0-9]+]](s32) = G_CONSTANT i32 0
				; CHECK: [[MASK:%[0-9]+]](<2 x s32>) = G_MERGE_VALUES [[C1]](s32), [[C0]](s32)
				; CHECK: [[VEC:%[0-9]+]](<2 x s32>) = G_SHUFFLE_VECTOR [[ARG]](<2 x s32>), [[UNDEF]], [[MASK]](<2 x s32>)
				; CHECK: %d0 = COPY [[VEC]](<2 x s32>)
				%res = shufflevector <2 x i32> %arg, <2 x i32> undef, <2 x i32> <i32 1, i32 0>
				ret <2 x i32> %res
				}

				define i32 @test_shufflevector_v2s32_v3s32(<2 x i32> %arg) {
				; CHECK-LABEL: name: test_shufflevector_v2s32_v3s32
				; CHECK: [[ARG:%[0-9]+]](<2 x s32>) = COPY %d0
				; CHECK: [[UNDEF:%[0-9]+]](<2 x s32>) = IMPLICIT_DEF
				; CHECK: [[C1:%[0-9]+]](s32) = G_CONSTANT i32 1
				; CHECK: [[C0:%[0-9]+]](s32) = G_CONSTANT i32 0
				; CHECK: [[MASK:%[0-9]+]](<3 x s32>) = G_MERGE_VALUES [[C1]](s32), [[C0]](s32), [[C1]](s32)
				; CHECK: [[VEC:%[0-9]+]](<3 x s32>) = G_SHUFFLE_VECTOR [[ARG]](<2 x s32>), [[UNDEF]], [[MASK]](<3 x s32>)
				; CHECK: G_EXTRACT_VECTOR_ELT [[VEC]](<3 x s32>)
				%vec = shufflevector <2 x i32> %arg, <2 x i32> undef, <3 x i32> <i32 1, i32 0, i32 1>
				%res = extractelement <3 x i32> %vec, i32 0
				ret i32 %res
				}

				define <4 x i32> @test_shufflevector_v2s32_v4s32(<2 x i32> %arg1, <2 x i32> %arg2) {
				; CHECK-LABEL: name: test_shufflevector_v2s32_v4s32
				; CHECK: [[ARG1:%[0-9]+]](<2 x s32>) = COPY %d0
				; CHECK: [[ARG2:%[0-9]+]](<2 x s32>) = COPY %d1
				; CHECK: [[C0:%[0-9]+]](s32) = G_CONSTANT i32 0
				; CHECK: [[C1:%[0-9]+]](s32) = G_CONSTANT i32 1
				; CHECK: [[C2:%[0-9]+]](s32) = G_CONSTANT i32 2
				; CHECK: [[C3:%[0-9]+]](s32) = G_CONSTANT i32 3
				; CHECK: [[MASK:%[0-9]+]](<4 x s32>) = G_MERGE_VALUES [[C0]](s32), [[C1]](s32), [[C2]](s32), [[C3]](s32)
				; CHECK: [[VEC:%[0-9]+]](<4 x s32>) = G_SHUFFLE_VECTOR [[ARG1]](<2 x s32>), [[ARG2]], [[MASK]](<4 x s32>)
				; CHECK: %q0 = COPY [[VEC]](<4 x s32>)
				%res = shufflevector <2 x i32> %arg1, <2 x i32> %arg2, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
				ret <4 x i32> %res
				}

				define <2 x i32> @test_shufflevector_v4s32_v2s32(<4 x i32> %arg) {
				; CHECK-LABEL: name: test_shufflevector_v4s32_v2s32
				; CHECK: [[ARG:%[0-9]+]](<4 x s32>) = COPY %q0
				; CHECK: [[UNDEF:%[0-9]+]](<4 x s32>) = IMPLICIT_DEF
				; CHECK: [[C1:%[0-9]+]](s32) = G_CONSTANT i32 1
				; CHECK: [[C3:%[0-9]+]](s32) = G_CONSTANT i32 3
				; CHECK: [[MASK:%[0-9]+]](<2 x s32>) = G_MERGE_VALUES [[C1]](s32), [[C3]](s32)
				; CHECK: [[VEC:%[0-9]+]](<2 x s32>) = G_SHUFFLE_VECTOR [[ARG]](<4 x s32>), [[UNDEF]], [[MASK]](<2 x s32>)
				; CHECK: %d0 = COPY [[VEC]](<2 x s32>)
				%res = shufflevector <4 x i32> %arg, <4 x i32> undef, <2 x i32> <i32 1, i32 3>
				ret <2 x i32> %res
				}


				define <16 x i8> @test_shufflevector_v8s8_v16s8(<8 x i8> %arg1, <8 x i8> %arg2) {
				; CHECK-LABEL: name: test_shufflevector_v8s8_v16s8
				; CHECK: [[ARG1:%[0-9]+]](<8 x s8>) = COPY %d0
				; CHECK: [[ARG2:%[0-9]+]](<8 x s8>) = COPY %d1
				; CHECK: [[C0:%[0-9]+]](s32) = G_CONSTANT i32 0
				; CHECK: [[C8:%[0-9]+]](s32) = G_CONSTANT i32 8
				; CHECK: [[C1:%[0-9]+]](s32) = G_CONSTANT i32 1
				; CHECK: [[C9:%[0-9]+]](s32) = G_CONSTANT i32 9
				; CHECK: [[C2:%[0-9]+]](s32) = G_CONSTANT i32 2
				; CHECK: [[C10:%[0-9]+]](s32) = G_CONSTANT i32 10
				; CHECK: [[C3:%[0-9]+]](s32) = G_CONSTANT i32 3
				; CHECK: [[C11:%[0-9]+]](s32) = G_CONSTANT i32 11
				; CHECK: [[C4:%[0-9]+]](s32) = G_CONSTANT i32 4
				; CHECK: [[C12:%[0-9]+]](s32) = G_CONSTANT i32 12
				; CHECK: [[C5:%[0-9]+]](s32) = G_CONSTANT i32 5
				; CHECK: [[C13:%[0-9]+]](s32) = G_CONSTANT i32 13
				; CHECK: [[C6:%[0-9]+]](s32) = G_CONSTANT i32 6
				; CHECK: [[C14:%[0-9]+]](s32) = G_CONSTANT i32 14
				; CHECK: [[C7:%[0-9]+]](s32) = G_CONSTANT i32 7
				; CHECK: [[C15:%[0-9]+]](s32) = G_CONSTANT i32 15
				; CHECK: [[MASK:%[0-9]+]](<16 x s32>) = G_MERGE_VALUES [[C0]](s32), [[C8]](s32), [[C1]](s32), [[C9]](s32), [[C2]](s32), [[C10]](s32), [[C3]](s32), [[C11]](s32), [[C4]](s32), [[C12]](s32), [[C5]](s32), [[C13]](s32), [[C6]](s32), [[C14]](s32), [[C7]](s32), [[C15]](s32)
				; CHECK: [[VEC:%[0-9]+]](<16 x s8>) = G_SHUFFLE_VECTOR [[ARG1]](<8 x s8>), [[ARG2]], [[MASK]](<16 x s32>)
				; CHECK: %q0 = COPY [[VEC]](<16 x s8>)
				%res = shufflevector <8 x i8> %arg1, <8 x i8> %arg2, <16 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11, i32 4, i32 12, i32 5, i32 13, i32 6, i32 14, i32 7, i32 15>
				ret <16 x i8> %res
				}

test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll

	; LITTLE-DAG: %r1 = COPY [[R2]]			; LITTLE-DAG: %r1 = COPY [[R2]]
	; BIG-DAG: %r0 = COPY [[R2]]			; BIG-DAG: %r0 = COPY [[R2]]
	; BIG-DAG: %r1 = COPY [[R1]]			; BIG-DAG: %r1 = COPY [[R1]]
	; CHECK: BX_RET 14, _, implicit %r0, implicit %r1			; CHECK: BX_RET 14, _, implicit %r0, implicit %r1
	entry:			entry:
	%r = notail call arm_aapcscc double @aapcscc_fp_target(float %b, double %a, float %b, double %a)			%r = notail call arm_aapcscc double @aapcscc_fp_target(float %b, double %a, float %b, double %a)
	ret double %r			ret double %r
	}			}

				define i32 @test_shufflevector_s32_v2s32(i32 %arg) {
				; CHECK-LABEL: name: test_shufflevector_s32_v2s32
				; CHECK: [[ARG:%[0-9]+]](s32) = COPY %r0
				; CHECK: [[UNDEF:%[0-9]+]](s32) = IMPLICIT_DEF
				; CHECK: [[C0:%[0-9]+]](s32) = G_CONSTANT i32 0
				; CHECK: [[MASK:%[0-9]+]](<2 x s32>) = G_MERGE_VALUES [[C0]](s32), [[C0]](s32)
				; CHECK: [[VEC:%[0-9]+]](<2 x s32>) = G_SHUFFLE_VECTOR [[ARG]](s32), [[UNDEF]], [[MASK]](<2 x s32>)
				; CHECK: G_EXTRACT_VECTOR_ELT [[VEC]](<2 x s32>)
				%vec = insertelement <1 x i32> undef, i32 %arg, i32 0
				%shuffle = shufflevector <1 x i32> %vec, <1 x i32> undef, <2 x i32> zeroinitializer
				%res = extractelement <2 x i32> %shuffle, i32 0
				ret i32 %res
				}

				define i32 @test_shufflevector_v2s32_v3s32(i32 %arg1, i32 %arg2) {
				; CHECK-LABEL: name: test_shufflevector_v2s32_v3s32
				; CHECK: [[ARG1:%[0-9]+]](s32) = COPY %r0
				; CHECK: [[ARG2:%[0-9]+]](s32) = COPY %r1
				; CHECK: [[UNDEF:%[0-9]+]](<2 x s32>) = IMPLICIT_DEF
				; CHECK: [[C0:%[0-9]+]](s32) = G_CONSTANT i32 0
				; CHECK: [[C1:%[0-9]+]](s32) = G_CONSTANT i32 1
				; CHECK: [[MASK:%[0-9]+]](<3 x s32>) = G_MERGE_VALUES [[C1]](s32), [[C0]](s32), [[C1]](s32)
				; CHECK: [[V1:%[0-9]+]](<2 x s32>) = G_INSERT_VECTOR_ELT [[UNDEF]], [[ARG1]](s32), [[C0]](s32)
				; CHECK: [[V2:%[0-9]+]](<2 x s32>) = G_INSERT_VECTOR_ELT [[V1]], [[ARG2]](s32), [[C1]](s32)
				; CHECK: [[VEC:%[0-9]+]](<3 x s32>) = G_SHUFFLE_VECTOR [[V2]](<2 x s32>), [[UNDEF]], [[MASK]](<3 x s32>)
				; CHECK: G_EXTRACT_VECTOR_ELT [[VEC]](<3 x s32>)
				%v1 = insertelement <2 x i32> undef, i32 %arg1, i32 0
				%v2 = insertelement <2 x i32> %v1, i32 %arg2, i32 1
				%shuffle = shufflevector <2 x i32> %v2, <2 x i32> undef, <3 x i32> <i32 1, i32 0, i32 1>
				%res = extractelement <3 x i32> %shuffle, i32 0
				ret i32 %res
				}


				define i32 @test_shufflevector_v2s32_v4s32(i32 %arg1, i32 %arg2) {
				; CHECK-LABEL: name: test_shufflevector_v2s32_v4s32
				; CHECK: [[ARG1:%[0-9]+]](s32) = COPY %r0
				; CHECK: [[ARG2:%[0-9]+]](s32) = COPY %r1
				; CHECK: [[UNDEF:%[0-9]+]](<2 x s32>) = IMPLICIT_DEF
				; CHECK: [[C0:%[0-9]+]](s32) = G_CONSTANT i32 0
				; CHECK: [[C1:%[0-9]+]](s32) = G_CONSTANT i32 1
				; CHECK: [[MASK:%[0-9]+]](<4 x s32>) = G_MERGE_VALUES [[C0]](s32), [[C0]](s32), [[C0]](s32), [[C0]](s32)
				; CHECK: [[V1:%[0-9]+]](<2 x s32>) = G_INSERT_VECTOR_ELT [[UNDEF]], [[ARG1]](s32), [[C0]](s32)
				; CHECK: [[V2:%[0-9]+]](<2 x s32>) = G_INSERT_VECTOR_ELT [[V1]], [[ARG2]](s32), [[C1]](s32)
				; CHECK: [[VEC:%[0-9]+]](<4 x s32>) = G_SHUFFLE_VECTOR [[V2]](<2 x s32>), [[UNDEF]], [[MASK]](<4 x s32>)
				; CHECK: G_EXTRACT_VECTOR_ELT [[VEC]](<4 x s32>)
				%v1 = insertelement <2 x i32> undef, i32 %arg1, i32 0
				%v2 = insertelement <2 x i32> %v1, i32 %arg2, i32 1
				%shuffle = shufflevector <2 x i32> %v2, <2 x i32> undef, <4 x i32> zeroinitializer
				%res = extractelement <4 x i32> %shuffle, i32 0
				ret i32 %res
				}

				define i32 @test_shufflevector_v4s32_v2s32(i32 %arg1, i32 %arg2, i32 %arg3, i32 %arg4) {
				; CHECK-LABEL: name: test_shufflevector_v4s32_v2s32
				; CHECK: [[ARG1:%[0-9]+]](s32) = COPY %r0
				; CHECK: [[ARG2:%[0-9]+]](s32) = COPY %r1
				; CHECK: [[ARG3:%[0-9]+]](s32) = COPY %r2
				; CHECK: [[ARG4:%[0-9]+]](s32) = COPY %r3
				; CHECK: [[UNDEF:%[0-9]+]](<4 x s32>) = IMPLICIT_DEF
				; CHECK: [[C0:%[0-9]+]](s32) = G_CONSTANT i32 0
				; CHECK: [[C1:%[0-9]+]](s32) = G_CONSTANT i32 1
				; CHECK: [[C2:%[0-9]+]](s32) = G_CONSTANT i32 2
				; CHECK: [[C3:%[0-9]+]](s32) = G_CONSTANT i32 3
				; CHECK: [[MASK:%[0-9]+]](<2 x s32>) = G_MERGE_VALUES [[C1]](s32), [[C3]](s32)
				; CHECK: [[V1:%[0-9]+]](<4 x s32>) = G_INSERT_VECTOR_ELT [[UNDEF]], [[ARG1]](s32), [[C0]](s32)
				; CHECK: [[V2:%[0-9]+]](<4 x s32>) = G_INSERT_VECTOR_ELT [[V1]], [[ARG2]](s32), [[C1]](s32)
				; CHECK: [[V3:%[0-9]+]](<4 x s32>) = G_INSERT_VECTOR_ELT [[V2]], [[ARG3]](s32), [[C2]](s32)
				; CHECK: [[V4:%[0-9]+]](<4 x s32>) = G_INSERT_VECTOR_ELT [[V3]], [[ARG4]](s32), [[C3]](s32)
				; CHECK: [[VEC:%[0-9]+]](<2 x s32>) = G_SHUFFLE_VECTOR [[V4]](<4 x s32>), [[UNDEF]], [[MASK]](<2 x s32>)
				; CHECK: G_EXTRACT_VECTOR_ELT [[VEC]](<2 x s32>)
				%v1 = insertelement <4 x i32> undef, i32 %arg1, i32 0
				%v2 = insertelement <4 x i32> %v1, i32 %arg2, i32 1
				%v3 = insertelement <4 x i32> %v2, i32 %arg3, i32 2
				%v4 = insertelement <4 x i32> %v3, i32 %arg4, i32 3
				%shuffle = shufflevector <4 x i32> %v4, <4 x i32> undef, <2 x i32> <i32 1, i32 3>
				%res = extractelement <2 x i32> %shuffle, i32 0
				ret i32 %res
				}
