This is an archive of the discontinued LLVM Phabricator instance.

[X86] Add X86ISD::VSHLV and X86ISD::VSRLV nodes for psllv and psrlv
Closed · Public

Authored by craig.topper on Jan 14 2019, 6:46 PM.

Details

Reviewers

LuoYuanke
annita.zhang
spatel
RKSimon
zhutianyang

Commits

rG6f533811b5a4: Merging r351381: --------------------------------------------------------------…
rG59abdf5f3fea: [X86] Add X86ISD::VSHLV and X86ISD::VSRLV nodes for psllv and psrlv
rL351444: Merging r351381:
rL351381: [X86] Add X86ISD::VSHLV and X86ISD::VSRLV nodes for psllv and psrlv

Summary

Previously we used ISD::SHL and ISD::SRL to represent these in SelectionDAG. ISD::SHL/SRL interpret an out-of-range shift amount as undefined behavior and will constant fold to undef, whereas the intrinsics are defined to return 0 for out-of-range shift amounts. A previous patch added a special node for VPSRAV, which produces all sign bits in that case.
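
As a rough illustration of the semantics described above, the following C++ sketch models a single 32-bit lane of each variable shift. The names (`shlvLane`, `srlvLane`, `sravLane`, `mapLanes`, `checkLaneModel`) are invented for this example and are not LLVM APIs, and the 32-bit element width is an assumption made for brevity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<uint32_t, 4>;

// One lane of a variable logical left shift: counts >= 32 give 0.
inline uint32_t shlvLane(uint32_t v, uint32_t n) { return n < 32 ? v << n : 0u; }

// One lane of a variable logical right shift: counts >= 32 give 0.
inline uint32_t srlvLane(uint32_t v, uint32_t n) { return n < 32 ? v >> n : 0u; }

// One lane of a variable arithmetic right shift: counts >= 32 fill the
// whole lane with copies of the sign bit.
inline uint32_t sravLane(uint32_t v, uint32_t n) {
  const bool negative = (v >> 31) != 0;
  if (n >= 32)
    return negative ? 0xFFFFFFFFu : 0u;
  // Shift the complement so the vacated high bits end up set for negatives.
  return negative ? ~(~v >> n) : v >> n;
}

// Apply a lane function across all four lanes.
template <typename LaneFn>
Vec4 mapLanes(const Vec4 &v, const Vec4 &n, LaneFn fn) {
  Vec4 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = fn(v[i], n[i]);
  return out;
}

// Small self-check: lanes 2 and 3 use out-of-range counts.
inline void checkLaneModel() {
  const Vec4 ones = {1, 1, 1, 1};
  const Vec4 counts = {0, 31, 32, 100};
  const Vec4 shl = mapLanes(ones, counts, shlvLane);
  assert(shl[0] == 1u && shl[1] == 0x80000000u);
  assert(shl[2] == 0u && shl[3] == 0u);
}
```

The point of the model is that every lane has a defined result for every count, which is what a generic shift node with undefined out-of-range behavior cannot express.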

This was previously believed safe because undefs frequently get turned into 0, either from the constant pool or from a desire to avoid a false register dependency. But undef is treated specially in some optimizations. For example, it's ignored in the detection of vector splats. So if the ISD::SHL/SRL can be constant folded and all of the elements with in-bounds shift amounts are the same, we might fold it to a single-element broadcast from the constant pool. This would not put 0s in the elements with out-of-bounds shift amounts.
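
The failure mode in the paragraph above can be sketched with a toy model in C++. Everything here is hypothetical and simplified: `Lane`, `foldGenericShl`, `findSplatIgnoringUndef`, `materialize`, and `intrinsicShl` are names made up for this example, not SelectionDAG code, and the rule that undef lanes otherwise become 0 is an assumption taken from the summary's wording.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Lane {
  bool isUndef;
  uint32_t value;
};
using LaneVec = std::array<Lane, 4>;
using Bits = std::array<uint32_t, 4>;

// Constant fold with generic-shift semantics: out-of-range counts give undef.
inline LaneVec foldGenericShl(const Bits &v, const Bits &n) {
  LaneVec r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = n[i] < 32 ? Lane{false, v[i] << n[i]} : Lane{true, 0u};
  return r;
}

// Splat detection that skips undef lanes, as the summary describes.
inline bool findSplatIgnoringUndef(const LaneVec &v, uint32_t &splat) {
  bool seen = false;
  for (const Lane &l : v) {
    if (l.isUndef)
      continue;
    if (seen && l.value != splat)
      return false;
    splat = l.value;
    seen = true;
  }
  return seen;
}

// Toy lowering: a splat becomes a broadcast of one element, so undef lanes
// receive the splat value; otherwise undef lanes are assumed to become 0.
inline Bits materialize(const LaneVec &v) {
  uint32_t splat = 0;
  const bool isSplat = findSplatIgnoringUndef(v, splat);
  Bits out{};
  for (std::size_t i = 0; i < v.size(); ++i)
    out[i] = isSplat ? splat : (v[i].isUndef ? 0u : v[i].value);
  return out;
}

// What the intrinsic requires: out-of-range counts give 0.
inline Bits intrinsicShl(const Bits &v, const Bits &n) {
  Bits out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = n[i] < 32 ? v[i] << n[i] : 0u;
  return out;
}

// Lane 2 has an out-of-range count; the broadcast puts 16 there, not 0.
inline void checkSplatMismatch() {
  const Bits v = {1, 1, 1, 1};
  const Bits n = {4, 4, 40, 4};
  const Bits got = materialize(foldGenericShl(v, n));
  const Bits want = intrinsicShl(v, n);
  assert(got[2] == 16u && want[2] == 0u);
}
```

With a dedicated node whose out-of-range lanes are defined as 0, the folded constant would have no undef lanes, so splat detection would see the differing element.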

Diff Detail

Event Timeline

zhutianyang created this revision. Jan 14 2019, 6:46 PM

Herald added a subscriber: llvm-commits. Jan 14 2019, 6:46 PM

zhutianyang added reviewers: LuoYuanke, annita.zhang. Jan 14 2019, 6:49 PM

craig.topper added reviewers: spatel, RKSimon. Jan 14 2019, 7:04 PM

Diffusion mentioned this in rL351162: [X86] Add test cases for D56695. NFC. Jan 14 2019, 10:43 PM

I've committed the tests along with some formatting fixes, removal of unused operands, and a fix to minimize the diffs in avx-intrinsics-x86.ll. Can you rebase this patch and regenerate the test checks?

RKSimon added inline comments. Jan 15 2019, 1:25 AM

lib/Target/X86/X86InstrAVX512.td
6517	Duplication - wrap in defm macro?
lib/Target/X86/X86InstrSSE.td
8402	Worth wrapping these in a defm macro to reduce duplication?

In D56695#1357444, @craig.topper wrote:

I've committed the tests along with some formatting fixes, removal of unused operands, and a fix to minimize the diffs in avx-intrinsics-x86.ll. Can you rebase this patch and regenerate the test checks?

Ok, I will

lib/Target/X86/X86InstrAVX512.td
6517	Could you share a demo to explain how to wrap it? thanks

RKSimon added inline comments. Jan 15 2019, 3:42 AM

lib/Target/X86/X86InstrAVX512.td

6517

Something like (sorry not fully tested):

multiclass avx512_var_shift_int<string InstrStr, SDNode OpNode> {
  defm : avx512_var_shift_int_lowering<InstrStr#"W", OpNode, v8i16x_info, [HasVLX, HasBWI]>;
  defm : avx512_var_shift_int_lowering<InstrStr#"W", OpNode, v16i16x_info, [HasVLX, HasBWI]>;
  defm : avx512_var_shift_int_lowering<InstrStr#"W", OpNode, v32i16_info, [HasBWI]>;
  defm : avx512_var_shift_int_lowering_mb<InstrStr#"D", OpNode, v4i32x_info, [HasVLX]>;
  defm : avx512_var_shift_int_lowering_mb<InstrStr#"D", OpNode, v8i32x_info, [HasVLX]>;
  defm : avx512_var_shift_int_lowering_mb<InstrStr#"D", OpNode, v16i32_info, [HasAVX512]>;
  defm : avx512_var_shift_int_lowering_mb<InstrStr#"Q", OpNode, v2i64x_info, [HasVLX]>;
  defm : avx512_var_shift_int_lowering_mb<InstrStr#"Q", OpNode, v4i64x_info, [HasVLX]>;
  defm : avx512_var_shift_int_lowering_mb<InstrStr#"Q", OpNode, v8i64_info, [HasAVX512]>;
}
defm : avx512_var_shift_int<"VPSRAV", X86vsrav>;

zhutianyang marked an inline comment as done and an inline comment as not done. Jan 15 2019, 4:22 AM

zhutianyang added inline comments.

lib/Target/X86/X86InstrAVX512.td
6517	Thanks a lot. It is a clean method.

RKSimon added inline comments. Jan 15 2019, 5:12 AM

lib/Target/X86/X86InstrSSE.td
8402	We could even move the patterns inside avx2_var_shift now that all users want them.

zhutianyang added inline comments. Jan 15 2019, 5:23 AM

lib/Target/X86/X86InstrSSE.td
8402	I think you have some good ideas. Could you share a demo again? To be frank, I am not familiar with td programming. Thanks.

RKSimon added inline comments. Jan 15 2019, 7:30 AM

lib/Target/X86/X86InstrSSE.td
8402	Actually this isn't going to work as VPSRAVQ doesn't exist on AVX2 - let's leave it as this.

Commandeering so I can rebase the tests in the interest of getting this through review quicker.

Rebase tests.

Move patterns into avx2_var_shift. Add multiclass in AVX512 to wrap the 3 vector lengths.

LGTM - this should also be merged into the 8.00 release branch, please create a merge request bug and block PR40331

This revision is now accepted and ready to land. Jan 16 2019, 1:19 PM

Closed by commit rL351381: [X86] Add X86ISD::VSHLV and X86ISD::VSRLV nodes for psllv and psrlv (authored by ctopper). Jan 16 2019, 1:50 PM

This revision was automatically updated to reflect the committed changes.

Merge request PR40343

Revision Contents

Path                                        Size
lib/Target/X86/X86ISelLowering.h            6 lines
lib/Target/X86/X86ISelLowering.cpp          2 lines
lib/Target/X86/X86InstrAVX512.td            67 lines
lib/Target/X86/X86InstrFragmentsSIMD.td     2 lines
lib/Target/X86/X86InstrSSE.td               36 lines
lib/Target/X86/X86IntrinsicsInfo.h          36 lines
test/CodeGen/X86/avx2-intrinsics-x86.ll     482 lines
test/CodeGen/X86/avx512-intrinsics.ll       64 lines
test/CodeGen/X86/avx512bw-intrinsics.ll     53 lines
test/CodeGen/X86/avx512bwvl-intrinsics.ll   108 lines

Diff 181696

lib/Target/X86/X86ISelLowering.h

@@ ... @@ enum NodeType : unsigned {
 VFPROUND, VFPROUND_RND, VFPROUNDS_RND,

 // 128-bit vector logical left / right shift
 VSHLDQ, VSRLDQ,

 // Vector shift elements
 VSHL, VSRL, VSRA,

-// Vector variable shift right arithmetic.
-// Unlike ISD::SRA, in case shift count greater then element size
-// use sign bit to fill destination data element.
-VSRAV,
+// Vector variable shift
+VSHLV, VSRLV, VSRAV,

 // Vector shift elements by immediate
 VSHLI, VSRLI, VSRAI,

 // Shifts of mask registers.
 KSHIFTL, KSHIFTR,

 // Bit rotate by immediate

lib/Target/X86/X86ISelLowering.cpp

@@ ... @@ const char *X86TargetLowering::getTargetNodeName(unsigned Opcode) const {
 case X86ISD::VSHLDQ: return "X86ISD::VSHLDQ";
 case X86ISD::VSRLDQ: return "X86ISD::VSRLDQ";
 case X86ISD::VSHL: return "X86ISD::VSHL";
 case X86ISD::VSRL: return "X86ISD::VSRL";
 case X86ISD::VSRA: return "X86ISD::VSRA";
 case X86ISD::VSHLI: return "X86ISD::VSHLI";
 case X86ISD::VSRLI: return "X86ISD::VSRLI";
 case X86ISD::VSRAI: return "X86ISD::VSRAI";
+case X86ISD::VSHLV: return "X86ISD::VSHLV";
+case X86ISD::VSRLV: return "X86ISD::VSRLV";
 case X86ISD::VSRAV: return "X86ISD::VSRAV";
 case X86ISD::VROTLI: return "X86ISD::VROTLI";
 case X86ISD::VROTRI: return "X86ISD::VROTRI";
 case X86ISD::VPPERM: return "X86ISD::VPPERM";
 case X86ISD::CMPP: return "X86ISD::CMPP";
 case X86ISD::PCMPEQ: return "X86ISD::PCMPEQ";
 case X86ISD::PCMPGT: return "X86ISD::PCMPGT";
 case X86ISD::PHMINPOS: return "X86ISD::PHMINPOS";

lib/Target/X86/X86InstrAVX512.td

@@ ... @@
 defm VPROLV : avx512_var_shift_types<0x15, "vprolv", rotl, SchedWriteVarVecShift>;

 defm : avx512_var_shift_lowering<avx512vl_i64_info, "VPSRAVQ", sra, [HasAVX512, NoVLX]>;
 defm : avx512_var_shift_lowering<avx512vl_i16_info, "VPSLLVW", shl, [HasBWI, NoVLX]>;
 defm : avx512_var_shift_lowering<avx512vl_i16_info, "VPSRAVW", sra, [HasBWI, NoVLX]>;
 defm : avx512_var_shift_lowering<avx512vl_i16_info, "VPSRLVW", srl, [HasBWI, NoVLX]>;

 // Special handing for handling VPSRAV intrinsics.
-multiclass avx512_var_shift_int_lowering<string InstrStr, X86VectorVTInfo _,
-list<Predicate> p> {
+multiclass avx512_var_shift_int_lowering<string InstrStr, SDNode OpNode,
+X86VectorVTInfo _, list<Predicate> p> {
 let Predicates = p in {
-def : Pat<(_.VT (X86vsrav _.RC:$src1, _.RC:$src2)),
+def : Pat<(_.VT (OpNode _.RC:$src1, _.RC:$src2)),
 (!cast<Instruction>(InstrStr#_.ZSuffix#rr) _.RC:$src1,
 _.RC:$src2)>;
-def : Pat<(_.VT (X86vsrav _.RC:$src1, (_.LdFrag addr:$src2))),
+def : Pat<(_.VT (OpNode _.RC:$src1, (_.LdFrag addr:$src2))),
 (!cast<Instruction>(InstrStr#_.ZSuffix##rm)
 _.RC:$src1, addr:$src2)>;
 def : Pat<(_.VT (vselect _.KRCWM:$mask,
-(X86vsrav _.RC:$src1, _.RC:$src2), _.RC:$src0)),
+(OpNode _.RC:$src1, _.RC:$src2), _.RC:$src0)),
 (!cast<Instruction>(InstrStr#_.ZSuffix#rrk) _.RC:$src0,
 _.KRC:$mask, _.RC:$src1, _.RC:$src2)>;
 def : Pat<(_.VT (vselect _.KRCWM:$mask,
-(X86vsrav _.RC:$src1, (_.LdFrag addr:$src2)),
+(OpNode _.RC:$src1, (_.LdFrag addr:$src2)),
 _.RC:$src0)),
 (!cast<Instruction>(InstrStr#_.ZSuffix##rmk) _.RC:$src0,
 _.KRC:$mask, _.RC:$src1, addr:$src2)>;
 def : Pat<(_.VT (vselect _.KRCWM:$mask,
-(X86vsrav _.RC:$src1, _.RC:$src2), _.ImmAllZerosV)),
+(OpNode _.RC:$src1, _.RC:$src2), _.ImmAllZerosV)),
 (!cast<Instruction>(InstrStr#_.ZSuffix#rrkz) _.KRC:$mask,
 _.RC:$src1, _.RC:$src2)>;
 def : Pat<(_.VT (vselect _.KRCWM:$mask,
-(X86vsrav _.RC:$src1, (_.LdFrag addr:$src2)),
+(OpNode _.RC:$src1, (_.LdFrag addr:$src2)),
 _.ImmAllZerosV)),
 (!cast<Instruction>(InstrStr#_.ZSuffix##rmkz) _.KRC:$mask,
 _.RC:$src1, addr:$src2)>;
 }
 }

-multiclass avx512_var_shift_int_lowering_mb<string InstrStr, X86VectorVTInfo _,
+multiclass avx512_var_shift_int_lowering_mb<string InstrStr, SDNode OpNode,
+X86VectorVTInfo _,
 list<Predicate> p> :
-avx512_var_shift_int_lowering<InstrStr, _, p> {
+avx512_var_shift_int_lowering<InstrStr, OpNode, _, p> {
 let Predicates = p in {
-def : Pat<(_.VT (X86vsrav _.RC:$src1,
+def : Pat<(_.VT (OpNode _.RC:$src1,
 (X86VBroadcast (_.ScalarLdFrag addr:$src2)))),
 (!cast<Instruction>(InstrStr#_.ZSuffix##rmb)
 _.RC:$src1, addr:$src2)>;
 def : Pat<(_.VT (vselect _.KRCWM:$mask,
-(X86vsrav _.RC:$src1,
+(OpNode _.RC:$src1,
 (X86VBroadcast (_.ScalarLdFrag addr:$src2))),
 _.RC:$src0)),
 (!cast<Instruction>(InstrStr#_.ZSuffix##rmbk) _.RC:$src0,
 _.KRC:$mask, _.RC:$src1, addr:$src2)>;
 def : Pat<(_.VT (vselect _.KRCWM:$mask,
-(X86vsrav _.RC:$src1,
+(OpNode _.RC:$src1,
 (X86VBroadcast (_.ScalarLdFrag addr:$src2))),
 _.ImmAllZerosV)),
 (!cast<Instruction>(InstrStr#_.ZSuffix##rmbkz) _.KRC:$mask,
 _.RC:$src1, addr:$src2)>;
 }
 }

-defm : avx512_var_shift_int_lowering<"VPSRAVW", v8i16x_info, [HasVLX, HasBWI]>;
-defm : avx512_var_shift_int_lowering<"VPSRAVW", v16i16x_info, [HasVLX, HasBWI]>;
-defm : avx512_var_shift_int_lowering<"VPSRAVW", v32i16_info, [HasBWI]>;
-defm : avx512_var_shift_int_lowering_mb<"VPSRAVD", v4i32x_info, [HasVLX]>;
-defm : avx512_var_shift_int_lowering_mb<"VPSRAVD", v8i32x_info, [HasVLX]>;
-defm : avx512_var_shift_int_lowering_mb<"VPSRAVD", v16i32_info, [HasAVX512]>;
-defm : avx512_var_shift_int_lowering_mb<"VPSRAVQ", v2i64x_info, [HasVLX]>;
-defm : avx512_var_shift_int_lowering_mb<"VPSRAVQ", v4i64x_info, [HasVLX]>;
-defm : avx512_var_shift_int_lowering_mb<"VPSRAVQ", v8i64_info, [HasAVX512]>;
+defm : avx512_var_shift_int_lowering<"VPSRAVW", X86vsrav, v8i16x_info, [HasVLX, HasBWI]>;
+defm : avx512_var_shift_int_lowering<"VPSRAVW", X86vsrav, v16i16x_info, [HasVLX, HasBWI]>;
+defm : avx512_var_shift_int_lowering<"VPSRAVW", X86vsrav, v32i16_info, [HasBWI]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSRAVD", X86vsrav, v4i32x_info, [HasVLX]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSRAVD", X86vsrav, v8i32x_info, [HasVLX]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSRAVD", X86vsrav, v16i32_info, [HasAVX512]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSRAVQ", X86vsrav, v2i64x_info, [HasVLX]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSRAVQ", X86vsrav, v4i64x_info, [HasVLX]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSRAVQ", X86vsrav, v8i64_info, [HasAVX512]>;
+
+defm : avx512_var_shift_int_lowering<"VPSRLVW", X86vsrlv, v8i16x_info, [HasVLX, HasBWI]>;
+defm : avx512_var_shift_int_lowering<"VPSRLVW", X86vsrlv, v16i16x_info, [HasVLX, HasBWI]>;
+defm : avx512_var_shift_int_lowering<"VPSRLVW", X86vsrlv, v32i16_info, [HasBWI]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSRLVD", X86vsrlv, v4i32x_info, [HasVLX]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSRLVD", X86vsrlv, v8i32x_info, [HasVLX]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSRLVD", X86vsrlv, v16i32_info, [HasAVX512]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSRLVQ", X86vsrlv, v2i64x_info, [HasVLX]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSRLVQ", X86vsrlv, v4i64x_info, [HasVLX]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSRLVQ", X86vsrlv, v8i64_info, [HasAVX512]>;
+
+defm : avx512_var_shift_int_lowering<"VPSLLVW", X86vshlv, v8i16x_info, [HasVLX, HasBWI]>;
+defm : avx512_var_shift_int_lowering<"VPSLLVW", X86vshlv, v16i16x_info, [HasVLX, HasBWI]>;
+defm : avx512_var_shift_int_lowering<"VPSLLVW", X86vshlv, v32i16_info, [HasBWI]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSLLVD", X86vshlv, v4i32x_info, [HasVLX]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSLLVD", X86vshlv, v8i32x_info, [HasVLX]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSLLVD", X86vshlv, v16i32_info, [HasAVX512]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSLLVQ", X86vshlv, v2i64x_info, [HasVLX]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSLLVQ", X86vshlv, v4i64x_info, [HasVLX]>;
+defm : avx512_var_shift_int_lowering_mb<"VPSLLVQ", X86vshlv, v8i64_info, [HasAVX512]>;

 // Use 512bit VPROL/VPROLI version to implement v2i64/v4i64 + v4i32/v8i32 in case NoVLX.
 let Predicates = [HasAVX512, NoVLX] in {
 def : Pat<(v2i64 (rotl (v2i64 VR128X:$src1), (v2i64 VR128X:$src2))),
 (EXTRACT_SUBREG (v8i64
 (VPROLVQZrr
 (v8i64 (INSERT_SUBREG (IMPLICIT_DEF), VR128X:$src1, sub_xmm)),
 (v8i64 (INSERT_SUBREG (IMPLICIT_DEF), VR128X:$src2, sub_xmm)))),

lib/Target/X86/X86InstrFragmentsSIMD.td

@@ ... @@

 def X86vshl : SDNode<"X86ISD::VSHL", X86vshiftuniform>;
 def X86vsrl : SDNode<"X86ISD::VSRL", X86vshiftuniform>;
 def X86vsra : SDNode<"X86ISD::VSRA", X86vshiftuniform>;

 def X86vshiftvariable : SDTypeProfile<1, 2, [SDTCisVec<0>, SDTCisSameAs<0,1>,
 SDTCisSameAs<0,2>, SDTCisInt<0>]>;

+def X86vshlv : SDNode<"X86ISD::VSHLV", X86vshiftvariable>;
+def X86vsrlv : SDNode<"X86ISD::VSRLV", X86vshiftvariable>;
 def X86vsrav : SDNode<"X86ISD::VSRAV", X86vshiftvariable>;

 def X86vshli : SDNode<"X86ISD::VSHLI", X86vshiftimm>;
 def X86vsrli : SDNode<"X86ISD::VSRLI", X86vshiftimm>;
 def X86vsrai : SDNode<"X86ISD::VSRAI", X86vshiftimm>;

 def X86kshiftl : SDNode<"X86ISD::KSHIFTL",
 SDTypeProfile<1, 2, [SDTCVecEltisVT<0, i1>,

lib/Target/X86/X86InstrSSE.td

@@ ... @@

 let Predicates = [HasAVX2, NoVLX] in {
 defm VPSLLVD : avx2_var_shift<0x47, "vpsllvd", shl, v4i32, v8i32>;
 defm VPSLLVQ : avx2_var_shift<0x47, "vpsllvq", shl, v2i64, v4i64>, VEX_W;
 defm VPSRLVD : avx2_var_shift<0x45, "vpsrlvd", srl, v4i32, v8i32>;
 defm VPSRLVQ : avx2_var_shift<0x45, "vpsrlvq", srl, v2i64, v4i64>, VEX_W;
 defm VPSRAVD : avx2_var_shift<0x46, "vpsravd", sra, v4i32, v8i32>;

+def : Pat<(v4i32 (X86vshlv VR128:$src1, VR128:$src2)),
+(VPSLLVDrr VR128:$src1, VR128:$src2)>;
+def : Pat<(v4i32 (X86vshlv VR128:$src1, (load addr:$src2))),
+(VPSLLVDrm VR128:$src1, addr:$src2)>;
+def : Pat<(v8i32 (X86vshlv VR256:$src1, VR256:$src2)),
+(VPSLLVDYrr VR256:$src1, VR256:$src2)>;
+def : Pat<(v8i32 (X86vshlv VR256:$src1, (load addr:$src2))),
+(VPSLLVDYrm VR256:$src1, addr:$src2)>;
+
+def : Pat<(v2i64 (X86vshlv VR128:$src1, VR128:$src2)),
+(VPSLLVQrr VR128:$src1, VR128:$src2)>;
+def : Pat<(v2i64 (X86vshlv VR128:$src1, (load addr:$src2))),
+(VPSLLVQrm VR128:$src1, addr:$src2)>;
+def : Pat<(v4i64 (X86vshlv VR256:$src1, VR256:$src2)),
+(VPSLLVQYrr VR256:$src1, VR256:$src2)>;
+def : Pat<(v4i64 (X86vshlv VR256:$src1, (load addr:$src2))),
+(VPSLLVQYrm VR256:$src1, addr:$src2)>;
+
+def : Pat<(v4i32 (X86vsrlv VR128:$src1, VR128:$src2)),
+(VPSRLVDrr VR128:$src1, VR128:$src2)>;
+def : Pat<(v4i32 (X86vsrlv VR128:$src1, (load addr:$src2))),
+(VPSRLVDrm VR128:$src1, addr:$src2)>;
+def : Pat<(v8i32 (X86vsrlv VR256:$src1, VR256:$src2)),
+(VPSRLVDYrr VR256:$src1, VR256:$src2)>;
+def : Pat<(v8i32 (X86vsrlv VR256:$src1, (load addr:$src2))),
+(VPSRLVDYrm VR256:$src1, addr:$src2)>;
+
+def : Pat<(v2i64 (X86vsrlv VR128:$src1, VR128:$src2)),
+(VPSRLVQrr VR128:$src1, VR128:$src2)>;
+def : Pat<(v2i64 (X86vsrlv VR128:$src1, (load addr:$src2))),
+(VPSRLVQrm VR128:$src1, addr:$src2)>;
+def : Pat<(v4i64 (X86vsrlv VR256:$src1, VR256:$src2)),
+(VPSRLVQYrr VR256:$src1, VR256:$src2)>;
+def : Pat<(v4i64 (X86vsrlv VR256:$src1, (load addr:$src2))),
+(VPSRLVQYrm VR256:$src1, addr:$src2)>;
+
 def : Pat<(v4i32 (X86vsrav VR128:$src1, VR128:$src2)),
 (VPSRAVDrr VR128:$src1, VR128:$src2)>;
 def : Pat<(v4i32 (X86vsrav VR128:$src1, (load addr:$src2))),
 (VPSRAVDrm VR128:$src1, addr:$src2)>;
 def : Pat<(v8i32 (X86vsrav VR256:$src1, VR256:$src2)),
 (VPSRAVDYrr VR256:$src1, VR256:$src2)>;
 def : Pat<(v8i32 (X86vsrav VR256:$src1, (load addr:$src2))),
 (VPSRAVDYrm VR256:$src1, addr:$src2)>;
 }

 //===----------------------------------------------------------------------===//
 // VGATHER - GATHER Operations

 // FIXME: Improve scheduling of gather instructions.
 multiclass avx2_gather<bits<8> opc, string OpcodeStr, ValueType VTx,
 ValueType VTy, PatFrag GatherNode128,

lib/Target/X86/X86IntrinsicsInfo.h

@@ ... @@ static const IntrinsicData IntrinsicsWithoutChain[] = {
 X86_INTRINSIC_DATA(avx2_psad_bw, INTR_TYPE_2OP, X86ISD::PSADBW, 0),
 X86_INTRINSIC_DATA(avx2_pshuf_b, INTR_TYPE_2OP, X86ISD::PSHUFB, 0),
 X86_INTRINSIC_DATA(avx2_psll_d, INTR_TYPE_2OP, X86ISD::VSHL, 0),
 X86_INTRINSIC_DATA(avx2_psll_q, INTR_TYPE_2OP, X86ISD::VSHL, 0),
 X86_INTRINSIC_DATA(avx2_psll_w, INTR_TYPE_2OP, X86ISD::VSHL, 0),
 X86_INTRINSIC_DATA(avx2_pslli_d, VSHIFT, X86ISD::VSHLI, 0),
 X86_INTRINSIC_DATA(avx2_pslli_q, VSHIFT, X86ISD::VSHLI, 0),
 X86_INTRINSIC_DATA(avx2_pslli_w, VSHIFT, X86ISD::VSHLI, 0),
-X86_INTRINSIC_DATA(avx2_psllv_d, INTR_TYPE_2OP, ISD::SHL, 0),
-X86_INTRINSIC_DATA(avx2_psllv_d_256, INTR_TYPE_2OP, ISD::SHL, 0),
-X86_INTRINSIC_DATA(avx2_psllv_q, INTR_TYPE_2OP, ISD::SHL, 0),
-X86_INTRINSIC_DATA(avx2_psllv_q_256, INTR_TYPE_2OP, ISD::SHL, 0),
+X86_INTRINSIC_DATA(avx2_psllv_d, INTR_TYPE_2OP, X86ISD::VSHLV, 0),
+X86_INTRINSIC_DATA(avx2_psllv_d_256, INTR_TYPE_2OP, X86ISD::VSHLV, 0),
+X86_INTRINSIC_DATA(avx2_psllv_q, INTR_TYPE_2OP, X86ISD::VSHLV, 0),
+X86_INTRINSIC_DATA(avx2_psllv_q_256, INTR_TYPE_2OP, X86ISD::VSHLV, 0),
 X86_INTRINSIC_DATA(avx2_psra_d, INTR_TYPE_2OP, X86ISD::VSRA, 0),
 X86_INTRINSIC_DATA(avx2_psra_w, INTR_TYPE_2OP, X86ISD::VSRA, 0),
 X86_INTRINSIC_DATA(avx2_psrai_d, VSHIFT, X86ISD::VSRAI, 0),
 X86_INTRINSIC_DATA(avx2_psrai_w, VSHIFT, X86ISD::VSRAI, 0),
 X86_INTRINSIC_DATA(avx2_psrav_d, INTR_TYPE_2OP, X86ISD::VSRAV, 0),
 X86_INTRINSIC_DATA(avx2_psrav_d_256, INTR_TYPE_2OP, X86ISD::VSRAV, 0),
 X86_INTRINSIC_DATA(avx2_psrl_d, INTR_TYPE_2OP, X86ISD::VSRL, 0),
 X86_INTRINSIC_DATA(avx2_psrl_q, INTR_TYPE_2OP, X86ISD::VSRL, 0),
 X86_INTRINSIC_DATA(avx2_psrl_w, INTR_TYPE_2OP, X86ISD::VSRL, 0),
 X86_INTRINSIC_DATA(avx2_psrli_d, VSHIFT, X86ISD::VSRLI, 0),
 X86_INTRINSIC_DATA(avx2_psrli_q, VSHIFT, X86ISD::VSRLI, 0),
 X86_INTRINSIC_DATA(avx2_psrli_w, VSHIFT, X86ISD::VSRLI, 0),
-X86_INTRINSIC_DATA(avx2_psrlv_d, INTR_TYPE_2OP, ISD::SRL, 0),
-X86_INTRINSIC_DATA(avx2_psrlv_d_256, INTR_TYPE_2OP, ISD::SRL, 0),
-X86_INTRINSIC_DATA(avx2_psrlv_q, INTR_TYPE_2OP, ISD::SRL, 0),
-X86_INTRINSIC_DATA(avx2_psrlv_q_256, INTR_TYPE_2OP, ISD::SRL, 0),
+X86_INTRINSIC_DATA(avx2_psrlv_d, INTR_TYPE_2OP, X86ISD::VSRLV, 0),
+X86_INTRINSIC_DATA(avx2_psrlv_d_256, INTR_TYPE_2OP, X86ISD::VSRLV, 0),
+X86_INTRINSIC_DATA(avx2_psrlv_q, INTR_TYPE_2OP, X86ISD::VSRLV, 0),
+X86_INTRINSIC_DATA(avx2_psrlv_q_256, INTR_TYPE_2OP, X86ISD::VSRLV, 0),
 X86_INTRINSIC_DATA(avx512_add_pd_512, INTR_TYPE_2OP, ISD::FADD, X86ISD::FADD_RND),
 X86_INTRINSIC_DATA(avx512_add_ps_512, INTR_TYPE_2OP, ISD::FADD, X86ISD::FADD_RND),
 X86_INTRINSIC_DATA(avx512_cmp_pd_128, CMP_MASK_CC, X86ISD::CMPM, 0),
 X86_INTRINSIC_DATA(avx512_cmp_pd_256, CMP_MASK_CC, X86ISD::CMPM, 0),
 X86_INTRINSIC_DATA(avx512_cmp_pd_512, CMP_MASK_CC, X86ISD::CMPM, X86ISD::CMPM_RND),
 X86_INTRINSIC_DATA(avx512_cmp_ps_128, CMP_MASK_CC, X86ISD::CMPM, 0),
 X86_INTRINSIC_DATA(avx512_cmp_ps_256, CMP_MASK_CC, X86ISD::CMPM, 0),
 X86_INTRINSIC_DATA(avx512_cmp_ps_512, CMP_MASK_CC, X86ISD::CMPM, X86ISD::CMPM_RND),
@@ ... @@ static const IntrinsicData IntrinsicsWithoutChain[] = {
 X86_INTRINSIC_DATA(avx512_psad_bw_512, INTR_TYPE_2OP, X86ISD::PSADBW, 0),
 X86_INTRINSIC_DATA(avx512_pshuf_b_512, INTR_TYPE_2OP, X86ISD::PSHUFB, 0),
 X86_INTRINSIC_DATA(avx512_psll_d_512, INTR_TYPE_2OP, X86ISD::VSHL, 0),
 X86_INTRINSIC_DATA(avx512_psll_q_512, INTR_TYPE_2OP, X86ISD::VSHL, 0),
 X86_INTRINSIC_DATA(avx512_psll_w_512, INTR_TYPE_2OP, X86ISD::VSHL, 0),
 X86_INTRINSIC_DATA(avx512_pslli_d_512, VSHIFT, X86ISD::VSHLI, 0),
 X86_INTRINSIC_DATA(avx512_pslli_q_512, VSHIFT, X86ISD::VSHLI, 0),
 X86_INTRINSIC_DATA(avx512_pslli_w_512, VSHIFT, X86ISD::VSHLI, 0),
-X86_INTRINSIC_DATA(avx512_psllv_d_512, INTR_TYPE_2OP, ISD::SHL, 0),
-X86_INTRINSIC_DATA(avx512_psllv_q_512, INTR_TYPE_2OP, ISD::SHL, 0),
-X86_INTRINSIC_DATA(avx512_psllv_w_128, INTR_TYPE_2OP, ISD::SHL, 0),
-X86_INTRINSIC_DATA(avx512_psllv_w_256, INTR_TYPE_2OP, ISD::SHL, 0),
-X86_INTRINSIC_DATA(avx512_psllv_w_512, INTR_TYPE_2OP, ISD::SHL, 0),
+X86_INTRINSIC_DATA(avx512_psllv_d_512, INTR_TYPE_2OP, X86ISD::VSHLV, 0),
+X86_INTRINSIC_DATA(avx512_psllv_q_512, INTR_TYPE_2OP, X86ISD::VSHLV, 0),
+X86_INTRINSIC_DATA(avx512_psllv_w_128, INTR_TYPE_2OP, X86ISD::VSHLV, 0),
+X86_INTRINSIC_DATA(avx512_psllv_w_256, INTR_TYPE_2OP, X86ISD::VSHLV, 0),
+X86_INTRINSIC_DATA(avx512_psllv_w_512, INTR_TYPE_2OP, X86ISD::VSHLV, 0),
 X86_INTRINSIC_DATA(avx512_psra_d_512, INTR_TYPE_2OP, X86ISD::VSRA, 0),
 X86_INTRINSIC_DATA(avx512_psra_q_128, INTR_TYPE_2OP, X86ISD::VSRA, 0),
 X86_INTRINSIC_DATA(avx512_psra_q_256, INTR_TYPE_2OP, X86ISD::VSRA, 0),
 X86_INTRINSIC_DATA(avx512_psra_q_512, INTR_TYPE_2OP, X86ISD::VSRA, 0),
 X86_INTRINSIC_DATA(avx512_psra_w_512, INTR_TYPE_2OP, X86ISD::VSRA, 0),
 X86_INTRINSIC_DATA(avx512_psrai_d_512, VSHIFT, X86ISD::VSRAI, 0),
 X86_INTRINSIC_DATA(avx512_psrai_q_128, VSHIFT, X86ISD::VSRAI, 0),
 X86_INTRINSIC_DATA(avx512_psrai_q_256, VSHIFT, X86ISD::VSRAI, 0),
 X86_INTRINSIC_DATA(avx512_psrai_q_512, VSHIFT, X86ISD::VSRAI, 0),
 X86_INTRINSIC_DATA(avx512_psrai_w_512, VSHIFT, X86ISD::VSRAI, 0),
 X86_INTRINSIC_DATA(avx512_psrav_d_512, INTR_TYPE_2OP, X86ISD::VSRAV, 0),
 X86_INTRINSIC_DATA(avx512_psrav_q_128, INTR_TYPE_2OP, X86ISD::VSRAV, 0),
X86_INTRINSIC_DATA(avx512_psrav_q_256, INTR_TYPE_2OP, X86ISD::VSRAV, 0),		X86_INTRINSIC_DATA(avx512_psrav_q_256, INTR_TYPE_2OP, X86ISD::VSRAV, 0),
X86_INTRINSIC_DATA(avx512_psrav_q_512, INTR_TYPE_2OP, X86ISD::VSRAV, 0),		X86_INTRINSIC_DATA(avx512_psrav_q_512, INTR_TYPE_2OP, X86ISD::VSRAV, 0),
X86_INTRINSIC_DATA(avx512_psrav_w_128, INTR_TYPE_2OP, X86ISD::VSRAV, 0),		X86_INTRINSIC_DATA(avx512_psrav_w_128, INTR_TYPE_2OP, X86ISD::VSRAV, 0),
X86_INTRINSIC_DATA(avx512_psrav_w_256, INTR_TYPE_2OP, X86ISD::VSRAV, 0),		X86_INTRINSIC_DATA(avx512_psrav_w_256, INTR_TYPE_2OP, X86ISD::VSRAV, 0),
X86_INTRINSIC_DATA(avx512_psrav_w_512, INTR_TYPE_2OP, X86ISD::VSRAV, 0),		X86_INTRINSIC_DATA(avx512_psrav_w_512, INTR_TYPE_2OP, X86ISD::VSRAV, 0),
X86_INTRINSIC_DATA(avx512_psrl_d_512, INTR_TYPE_2OP, X86ISD::VSRL, 0),		X86_INTRINSIC_DATA(avx512_psrl_d_512, INTR_TYPE_2OP, X86ISD::VSRL, 0),
X86_INTRINSIC_DATA(avx512_psrl_q_512, INTR_TYPE_2OP, X86ISD::VSRL, 0),		X86_INTRINSIC_DATA(avx512_psrl_q_512, INTR_TYPE_2OP, X86ISD::VSRL, 0),
X86_INTRINSIC_DATA(avx512_psrl_w_512, INTR_TYPE_2OP, X86ISD::VSRL, 0),		X86_INTRINSIC_DATA(avx512_psrl_w_512, INTR_TYPE_2OP, X86ISD::VSRL, 0),
X86_INTRINSIC_DATA(avx512_psrli_d_512, VSHIFT, X86ISD::VSRLI, 0),		X86_INTRINSIC_DATA(avx512_psrli_d_512, VSHIFT, X86ISD::VSRLI, 0),
X86_INTRINSIC_DATA(avx512_psrli_q_512, VSHIFT, X86ISD::VSRLI, 0),		X86_INTRINSIC_DATA(avx512_psrli_q_512, VSHIFT, X86ISD::VSRLI, 0),
X86_INTRINSIC_DATA(avx512_psrli_w_512, VSHIFT, X86ISD::VSRLI, 0),		X86_INTRINSIC_DATA(avx512_psrli_w_512, VSHIFT, X86ISD::VSRLI, 0),
X86_INTRINSIC_DATA(avx512_psrlv_d_512, INTR_TYPE_2OP, ISD::SRL, 0),		X86_INTRINSIC_DATA(avx512_psrlv_d_512, INTR_TYPE_2OP, X86ISD::VSRLV, 0),
X86_INTRINSIC_DATA(avx512_psrlv_q_512, INTR_TYPE_2OP, ISD::SRL, 0),		X86_INTRINSIC_DATA(avx512_psrlv_q_512, INTR_TYPE_2OP, X86ISD::VSRLV, 0),
X86_INTRINSIC_DATA(avx512_psrlv_w_128, INTR_TYPE_2OP, ISD::SRL, 0),		X86_INTRINSIC_DATA(avx512_psrlv_w_128, INTR_TYPE_2OP, X86ISD::VSRLV, 0),
X86_INTRINSIC_DATA(avx512_psrlv_w_256, INTR_TYPE_2OP, ISD::SRL, 0),		X86_INTRINSIC_DATA(avx512_psrlv_w_256, INTR_TYPE_2OP, X86ISD::VSRLV, 0),
X86_INTRINSIC_DATA(avx512_psrlv_w_512, INTR_TYPE_2OP, ISD::SRL, 0),		X86_INTRINSIC_DATA(avx512_psrlv_w_512, INTR_TYPE_2OP, X86ISD::VSRLV, 0),
X86_INTRINSIC_DATA(avx512_pternlog_d_128, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0),		X86_INTRINSIC_DATA(avx512_pternlog_d_128, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0),
X86_INTRINSIC_DATA(avx512_pternlog_d_256, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0),		X86_INTRINSIC_DATA(avx512_pternlog_d_256, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0),
X86_INTRINSIC_DATA(avx512_pternlog_d_512, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0),		X86_INTRINSIC_DATA(avx512_pternlog_d_512, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0),
X86_INTRINSIC_DATA(avx512_pternlog_q_128, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0),		X86_INTRINSIC_DATA(avx512_pternlog_q_128, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0),
X86_INTRINSIC_DATA(avx512_pternlog_q_256, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0),		X86_INTRINSIC_DATA(avx512_pternlog_q_256, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0),
X86_INTRINSIC_DATA(avx512_pternlog_q_512, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0),		X86_INTRINSIC_DATA(avx512_pternlog_q_512, INTR_TYPE_4OP, X86ISD::VPTERNLOG, 0),
X86_INTRINSIC_DATA(avx512_rcp14_pd_128, INTR_TYPE_1OP_MASK, X86ISD::RCP14, 0),		X86_INTRINSIC_DATA(avx512_rcp14_pd_128, INTR_TYPE_1OP_MASK, X86ISD::RCP14, 0),
X86_INTRINSIC_DATA(avx512_rcp14_pd_256, INTR_TYPE_1OP_MASK, X86ISD::RCP14, 0),		X86_INTRINSIC_DATA(avx512_rcp14_pd_256, INTR_TYPE_1OP_MASK, X86ISD::RCP14, 0),
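
For reference, the behaviour that the VSHLV/VSRLV entries above are meant to preserve can be modelled lane by lane. The following is only a minimal sketch based on the summary's statement that the intrinsics return 0 for out-of-range shift amounts. The names `lane_shl`, `lane_srl`, `model_psllv_d` and `model_psrlv_d` are hypothetical helpers for illustration, not LLVM APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a single 32-bit lane of psllvd / psrlvd.
// A count of 32 or more produces 0 instead of an undefined value.
constexpr uint32_t lane_shl(uint32_t value, uint32_t count) {
  return count < 32 ? value << count : 0u;
}
constexpr uint32_t lane_srl(uint32_t value, uint32_t count) {
  return count < 32 ? value >> count : 0u;
}

using V4 = std::array<uint32_t, 4>;

// Apply the per-lane left-shift model across a 4 x i32 vector.
inline V4 model_psllv_d(const V4 &a, const V4 &count) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = lane_shl(a[i], count[i]);
  return r;
}

// Apply the per-lane logical right-shift model across a 4 x i32 vector.
inline V4 model_psrlv_d(const V4 &a, const V4 &count) {
  V4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = lane_srl(a[i], count[i]);
  return r;
}
```

Under this model, the constant operands used in the new `_const` tests (`<2, 9, 0, -1>` shifted by `<1, 0, 33, -1>`) give zero in the last two lanes, because counts 33 and 0xFFFFFFFF are both out of range.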

test/CodeGen/X86/avx2-intrinsics-x86.ll

	;			;
	; AVX512VL-LABEL: test_x86_avx2_psllv_d:			; AVX512VL-LABEL: test_x86_avx2_psllv_d:
	; AVX512VL: ## %bb.0:			; AVX512VL: ## %bb.0:
	; AVX512VL-NEXT: vpsllvd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x47,0xc1]			; AVX512VL-NEXT: vpsllvd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x47,0xc1]
	; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.avx2.psllv.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.avx2.psllv.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}

				define <4 x i32> @test_x86_avx2_psllv_d_const(<4 x i32> %a0, <4 x i32> %a1) {
				; X86-AVX2-LABEL: test_x86_avx2_psllv_d_const:
				; X86-AVX2: ## %bb.0:
				; X86-AVX2-NEXT: vmovdqa {{LCPI.*}}, %xmm2 ## xmm2 = [2,9,0,4294967295]
				; X86-AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x15,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vpsllvd {{LCPI.*}}, %xmm2, %xmm2 ## encoding: [0xc4,0xe2,0x69,0x47,0x15,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vmovdqa {{LCPI.*}}, %xmm3 ## xmm3 = [1,1,1,4294967295]
				; X86-AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x1d,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vpsllvd %xmm3, %xmm3, %xmm3 ## encoding: [0xc4,0xe2,0x61,0x47,0xdb]
				; X86-AVX2: retl ## encoding: [0xc3]
				;
				; X86-AVX512VL-LABEL: test_x86_avx2_psllv_d_const:
				; X86-AVX512VL: ## %bb.0:
				; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %xmm2 ## EVEX TO VEX Compression xmm2 = [2,9,0,4294967295]
				; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x15,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vpsllvd {{LCPI.*}}, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x69,0x47,0x15,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %xmm3 ## EVEX TO VEX Compression xmm3 = [1,1,1,4294967295]
				; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x1d,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vpsllvd %xmm3, %xmm3, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x61,0x47,0xdb]
				; X86-AVX512VL: retl ## encoding: [0xc3]
				;
				; X64-AVX2-LABEL: test_x86_avx2_psllv_d_const:
				; X64-AVX2: ## %bb.0:
				; X64-AVX2-NEXT: vmovdqa {{LCPI.*}}(%rip), %xmm2 ## xmm2 = [2,9,0,4294967295]
				; X64-AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x15,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vpsllvd {{LCPI.*}}(%rip), %xmm2, %xmm2 ## encoding: [0xc4,0xe2,0x69,0x47,0x15,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vmovdqa {{LCPI.*}}(%rip), %xmm3 ## xmm3 = [1,1,1,4294967295]
				; X64-AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x1d,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vpsllvd %xmm3, %xmm3, %xmm3 ## encoding: [0xc4,0xe2,0x61,0x47,0xdb]
				; X64-AVX2: retq ## encoding: [0xc3]
				;
				; X64-AVX512VL-LABEL: test_x86_avx2_psllv_d_const:
				; X64-AVX512VL: ## %bb.0:
				; X64-AVX512VL-NEXT: vmovdqa {{LCPI.*}}(%rip), %xmm2 ## EVEX TO VEX Compression xmm2 = [2,9,0,4294967295]
				; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x15,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vpsllvd {{LCPI.*}}(%rip), %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x69,0x47,0x15,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vmovdqa {{LCPI.*}}(%rip), %xmm3 ## EVEX TO VEX Compression xmm3 = [1,1,1,4294967295]
				; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x1d,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vpsllvd %xmm3, %xmm3, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x61,0x47,0xdb]
				; X64-AVX512VL: retq ## encoding: [0xc3]
				%res0 = call <4 x i32> @llvm.x86.avx2.psllv.d(<4 x i32> <i32 2, i32 9, i32 0, i32 -1>, <4 x i32> <i32 1, i32 0, i32 33, i32 -1>) ; <<4 x i32>> [#uses=1]
				%res2 = add <4 x i32> %a0, %res0
				%res1 = call <4 x i32> @llvm.x86.avx2.psllv.d(<4 x i32> <i32 1, i32 1, i32 1, i32 -1>, <4 x i32> <i32 1, i32 1, i32 1, i32 -1>)
				%res3 = add <4 x i32> %a1, %res1
				%res4 = add <4 x i32> %res2, %res3
				ret <4 x i32> %res4
				}
	declare <4 x i32> @llvm.x86.avx2.psllv.d(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.avx2.psllv.d(<4 x i32>, <4 x i32>) nounwind readnone
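
The second call in the `_const` test above (`<1, 1, 1, -1>` shifted by `<1, 1, 1, -1>`) exercises the splat hazard described in the summary. The sketch below is only an illustration of that reasoning, assuming a fold that marks out-of-range lanes as undef and a splat check that skips undef lanes. `Lane`, `fold_shl_generic` and `is_splat_ignoring_undef` are hypothetical names, not SelectionDAG functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// A constant-folded lane: either a known value or undef.
struct Lane {
  bool is_undef;
  uint32_t value;
};

using V4 = std::array<uint32_t, 4>;
using Folded4 = std::array<Lane, 4>;

// Hypothetical fold under generic shift rules: an out-of-range count
// makes the lane undef rather than 0.
inline Folded4 fold_shl_generic(const V4 &a, const V4 &count) {
  Folded4 r;
  for (std::size_t i = 0; i < r.size(); ++i) {
    if (count[i] < 32)
      r[i] = Lane{false, a[i] << count[i]};
    else
      r[i] = Lane{true, 0u};
  }
  return r;
}

// Splat detection that skips undef lanes. Returns true and sets
// splat_value when every defined lane holds the same value.
inline bool is_splat_ignoring_undef(const Folded4 &v, uint32_t &splat_value) {
  bool seen = false;
  for (const Lane &l : v) {
    if (l.is_undef)
      continue;
    if (seen && l.value != splat_value)
      return false;
    splat_value = l.value;
    seen = true;
  }
  return seen;
}
```

With these inputs the three in-range lanes fold to 2 and the last lane is undef, so the vector is treated as a splat of 2. A broadcast of that splat would put 2 in the last lane, whereas the intrinsic is defined to produce 0 there.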


	define <8 x i32> @test_x86_avx2_psllv_d_256(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_psllv_d_256(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psllv_d_256:			; AVX2-LABEL: test_x86_avx2_psllv_d_256:
	; AVX2: ## %bb.0:			; AVX2: ## %bb.0:
	; AVX2-NEXT: vpsllvd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x47,0xc1]			; AVX2-NEXT: vpsllvd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x47,0xc1]
	; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psllv_d_256:			; AVX512VL-LABEL: test_x86_avx2_psllv_d_256:
	; AVX512VL: ## %bb.0:			; AVX512VL: ## %bb.0:
	; AVX512VL-NEXT: vpsllvd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x47,0xc1]			; AVX512VL-NEXT: vpsllvd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x47,0xc1]
	; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psllv.d.256(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.psllv.d.256(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}

				define <8 x i32> @test_x86_avx2_psllv_d_256_const(<8 x i32> %a0, <8 x i32> %a1) {
				; X86-AVX2-LABEL: test_x86_avx2_psllv_d_256_const:
				; X86-AVX2: ## %bb.0:
				; X86-AVX2-NEXT: vmovdqa {{LCPI.*}}, %ymm2 ## ymm2 = [2,9,0,4294967295,3,7,4294967295,0]
				; X86-AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x15,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vpsllvd {{LCPI.*}}, %ymm2, %ymm2 ## encoding: [0xc4,0xe2,0x6d,0x47,0x15,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vmovdqa {{LCPI.*}}, %ymm3 ## ymm3 = [4,4,4,4,4,4,4,4294967295]
				; X86-AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x1d,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vpsllvd {{LCPI.*}}, %ymm3, %ymm3 ## encoding: [0xc4,0xe2,0x65,0x47,0x1d,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2: retl ## encoding: [0xc3]

				; X86-AVX512VL-LABEL: test_x86_avx2_psllv_d_256_const:
				; X86-AVX512VL: ## %bb.0:
				; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %ymm2 ## EVEX TO VEX Compression ymm2 = [2,9,0,4294967295,3,7,4294967295,0]
				; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x15,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vpsllvd {{LCPI.*}}, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x6d,0x47,0x15,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %ymm3 ## EVEX TO VEX Compression ymm3 = [4,4,4,4,4,4,4,4294967295]
				; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x1d,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vpsllvd {{LCPI.*}}, %ymm3, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x65,0x47,0x1d,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL: retl ## encoding: [0xc3]

				; X64-AVX2-LABEL: test_x86_avx2_psllv_d_256_const:
				; X64-AVX2: ## %bb.0:
				; X64-AVX2-NEXT: vmovdqa {{LCPI.*}}(%rip), %ymm2 ## ymm2 = [2,9,0,4294967295,3,7,4294967295,0]
				; X64-AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x15,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vpsllvd {{LCPI.*}}(%rip), %ymm2, %ymm2 ## encoding: [0xc4,0xe2,0x6d,0x47,0x15,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vmovdqa {{LCPI.*}}(%rip), %ymm3 ## ymm3 = [4,4,4,4,4,4,4,4294967295]
				; X64-AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x1d,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vpsllvd {{LCPI.*}}(%rip), %ymm3, %ymm3 ## encoding: [0xc4,0xe2,0x65,0x47,0x1d,A,A,A,A]
				; X64-AVX2: retq ## encoding: [0xc3]

				; X64-AVX512VL-LABEL: test_x86_avx2_psllv_d_256_const:
				; X64-AVX512VL: ## %bb.0:
				; X64-AVX512VL-NEXT: vmovdqa {{LCPI.*}}(%rip), %ymm2 ## EVEX TO VEX Compression ymm2 = [2,9,0,4294967295,3,7,4294967295,0]
				; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x15,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vpsllvd {{LCPI.*}}(%rip), %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x6d,0x47,0x15,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vmovdqa {{LCPI.*}}(%rip), %ymm3 ## EVEX TO VEX Compression ymm3 = [4,4,4,4,4,4,4,4294967295]
				; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x1d,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vpsllvd {{LCPI.*}}(%rip), %ymm3, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x65,0x47,0x1d,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL: retq ## encoding: [0xc3]

				%res0 = call <8 x i32> @llvm.x86.avx2.psllv.d.256(<8 x i32> <i32 2, i32 9, i32 0, i32 -1, i32 3, i32 7, i32 -1, i32 0>, <8 x i32> <i32 1, i32 0, i32 33, i32 -1,i32 2, i32 0, i32 34, i32 -2>) ; <<8 x i32>> [#uses=1]
				%res2 = add <8 x i32> %a0, %res0
				%res1 = call <8 x i32> @llvm.x86.avx2.psllv.d.256(<8 x i32> <i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 -1>, <8 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 -1>) ; <<8 x i32>> [#uses=1]
				%res3 = add <8 x i32> %a1, %res1
				%res4 = add <8 x i32> %res2, %res3
				ret <8 x i32> %res4
				}
	declare <8 x i32> @llvm.x86.avx2.psllv.d.256(<8 x i32>, <8 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.psllv.d.256(<8 x i32>, <8 x i32>) nounwind readnone


	define <2 x i64> @test_x86_avx2_psllv_q(<2 x i64> %a0, <2 x i64> %a1) {			define <2 x i64> @test_x86_avx2_psllv_q(<2 x i64> %a0, <2 x i64> %a1) {
	; AVX2-LABEL: test_x86_avx2_psllv_q:			; AVX2-LABEL: test_x86_avx2_psllv_q:
	; AVX2: ## %bb.0:			; AVX2: ## %bb.0:
	; AVX2-NEXT: vpsllvq %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0xf9,0x47,0xc1]			; AVX2-NEXT: vpsllvq %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0xf9,0x47,0xc1]
	; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psllv_q:			; AVX512VL-LABEL: test_x86_avx2_psllv_q:
	; AVX512VL: ## %bb.0:			; AVX512VL: ## %bb.0:
	; AVX512VL-NEXT: vpsllvq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0x47,0xc1]			; AVX512VL-NEXT: vpsllvq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0x47,0xc1]
	; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.avx2.psllv.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.avx2.psllv.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
				define <2 x i64> @test_x86_avx2_psllv_q_const(<2 x i64> %a0, <2 x i64> %a1) {
				; X86-AVX2-LABEL: test_x86_avx2_psllv_q_const:
				; X86-AVX2: ## %bb.0:
				; X86-AVX2-NEXT: vmovdqa {{LCPI.*}}, %xmm0 ## xmm0 = [4,0,4294967295,4294967295]
				; X86-AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vpsllvq {{LCPI.*}}, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0xf9,0x47,0x05,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: retl ## encoding: [0xc3]

				; X86-AVX512VL-LABEL: test_x86_avx2_psllv_q_const:
				; X86-AVX512VL: ## %bb.0:
				; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %xmm0 ## EVEX TO VEX Compression xmm0 = [4,0,4294967295,4294967295]
				; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vpsllvq {{LCPI.*}}, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0x47,0x05,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: retl ## encoding: [0xc3]

				; X64-AVX2-LABEL: test_x86_avx2_psllv_q_const:
				; X64-AVX2: ## %bb.0:
				; X64-AVX2-NEXT: vmovdqa {{LCPI.*}}(%rip), %xmm0 ## xmm0 = [4,18446744073709551615]
				; X64-AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vpsllvq {{LCPI.*}}(%rip), %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0xf9,0x47,0x05,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: retq ## encoding: [0xc3]

				; X64-AVX512VL-LABEL: test_x86_avx2_psllv_q_const:
				; X64-AVX512VL: ## %bb.0:
				; X64-AVX512VL-NEXT: vmovdqa {{LCPI.*}}(%rip), %xmm0 ## EVEX TO VEX Compression xmm0 = [4,18446744073709551615]
				; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vpsllvq {{LCPI.*}}(%rip), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0x47,0x05,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: retq ## encoding: [0xc3]
				%res = call <2 x i64> @llvm.x86.avx2.psllv.q(<2 x i64> <i64 4, i64 -1>, <2 x i64> <i64 1, i64 -1>)
				ret <2 x i64> %res
				}
	declare <2 x i64> @llvm.x86.avx2.psllv.q(<2 x i64>, <2 x i64>) nounwind readnone			declare <2 x i64> @llvm.x86.avx2.psllv.q(<2 x i64>, <2 x i64>) nounwind readnone


	define <4 x i64> @test_x86_avx2_psllv_q_256(<4 x i64> %a0, <4 x i64> %a1) {			define <4 x i64> @test_x86_avx2_psllv_q_256(<4 x i64> %a0, <4 x i64> %a1) {
	; AVX2-LABEL: test_x86_avx2_psllv_q_256:			; AVX2-LABEL: test_x86_avx2_psllv_q_256:
	; AVX2: ## %bb.0:			; AVX2: ## %bb.0:
	; AVX2-NEXT: vpsllvq %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0xfd,0x47,0xc1]			; AVX2-NEXT: vpsllvq %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0xfd,0x47,0xc1]
	; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psllv_q_256:			; AVX512VL-LABEL: test_x86_avx2_psllv_q_256:
	; AVX512VL: ## %bb.0:			; AVX512VL: ## %bb.0:
	; AVX512VL-NEXT: vpsllvq %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0x47,0xc1]			; AVX512VL-NEXT: vpsllvq %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0x47,0xc1]
	; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx2.psllv.q.256(<4 x i64> %a0, <4 x i64> %a1) ; <<4 x i64>> [#uses=1]			%res = call <4 x i64> @llvm.x86.avx2.psllv.q.256(<4 x i64> %a0, <4 x i64> %a1) ; <<4 x i64>> [#uses=1]
	ret <4 x i64> %res			ret <4 x i64> %res
	}			}

				define <4 x i64> @test_x86_avx2_psllv_q_256_const(<4 x i64> %a0, <4 x i64> %a1) {
				; X86-AVX2-LABEL: test_x86_avx2_psllv_q_256_const:
				; X86-AVX2: ## %bb.0:
				; X86-AVX2-NEXT: vmovdqa {{LCPI.*}}, %ymm0 ## ymm0 = [4,0,4,0,4,0,4294967295,4294967295]
				; X86-AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vpsllvq {{LCPI.*}}, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0xfd,0x47,0x05,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: retl ## encoding: [0xc3]

				; X86-AVX512VL-LABEL: test_x86_avx2_psllv_q_256_const:
				; X86-AVX512VL: ## %bb.0:
				; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %ymm0 ## EVEX TO VEX Compression ymm0 = [4,0,4,0,4,0,4294967295,4294967295]
				; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vpsllvq {{LCPI.*}}, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0x47,0x05,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: retl ## encoding: [0xc3]

				; X64-AVX2-LABEL: test_x86_avx2_psllv_q_256_const:
				; X64-AVX2: ## %bb.0:
				; X64-AVX2-NEXT: vmovdqa {{LCPI.*}}(%rip), %ymm0 ## ymm0 = [4,4,4,18446744073709551615]
				; X64-AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vpsllvq {{LCPI.*}}(%rip), %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0xfd,0x47,0x05,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: retq ## encoding: [0xc3]

				; X64-AVX512VL-LABEL: test_x86_avx2_psllv_q_256_const:
				; X64-AVX512VL: ## %bb.0:
				; X64-AVX512VL-NEXT: vmovdqa {{LCPI.*}}(%rip), %ymm0 ## EVEX TO VEX Compression ymm0 = [4,4,4,18446744073709551615]
				; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vpsllvq {{LCPI.*}}(%rip), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0x47,0x05,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: retq ## encoding: [0xc3]
				%res = call <4 x i64> @llvm.x86.avx2.psllv.q.256(<4 x i64> <i64 4, i64 4, i64 4, i64 -1>, <4 x i64> <i64 1, i64 1, i64 1, i64 -1>)
				ret <4 x i64> %res
				}
	declare <4 x i64> @llvm.x86.avx2.psllv.q.256(<4 x i64>, <4 x i64>) nounwind readnone			declare <4 x i64> @llvm.x86.avx2.psllv.q.256(<4 x i64>, <4 x i64>) nounwind readnone


	define <4 x i32> @test_x86_avx2_psrlv_d(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_avx2_psrlv_d(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrlv_d:			; AVX2-LABEL: test_x86_avx2_psrlv_d:
	; AVX2: ## %bb.0:			; AVX2: ## %bb.0:
	; AVX2-NEXT: vpsrlvd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x45,0xc1]			; AVX2-NEXT: vpsrlvd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x45,0xc1]
	; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrlv_d:			; AVX512VL-LABEL: test_x86_avx2_psrlv_d:
	; AVX512VL: ## %bb.0:			; AVX512VL: ## %bb.0:
	; AVX512VL-NEXT: vpsrlvd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x45,0xc1]			; AVX512VL-NEXT: vpsrlvd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x45,0xc1]
	; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.avx2.psrlv.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.avx2.psrlv.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}

				define <4 x i32> @test_x86_avx2_psrlv_d_const(<4 x i32> %a0, <4 x i32> %a1) {
				; X86-AVX2-LABEL: test_x86_avx2_psrlv_d_const:
				; X86-AVX2: ## %bb.0:
				; X86-AVX2-NEXT: vmovdqa {{LCPI.*}}, %xmm2 ## xmm2 = [2,9,0,4294967295]
				; X86-AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x15,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vpsrlvd {{LCPI.*}}, %xmm2, %xmm2 ## encoding: [0xc4,0xe2,0x69,0x45,0x15,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vmovdqa {{LCPI.*}}, %xmm3 ## xmm3 = [4,4,4,4294967295]
				; X86-AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x1d,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vpsrlvd {{LCPI.*}}, %xmm3, %xmm3 ## encoding: [0xc4,0xe2,0x61,0x45,0x1d,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2: retl ## encoding: [0xc3]

				; X86-AVX512VL-LABEL: test_x86_avx2_psrlv_d_const:
				; X86-AVX512VL: ## %bb.0:
				; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %xmm2 ## EVEX TO VEX Compression xmm2 = [2,9,0,4294967295]
				; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x15,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vpsrlvd {{LCPI.*}}, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x69,0x45,0x15,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %xmm3 ## EVEX TO VEX Compression xmm3 = [4,4,4,4294967295]
				; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x1d,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vpsrlvd {{LCPI.*}}, %xmm3, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x61,0x45,0x1d,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL: retl ## encoding: [0xc3]

				; X64-AVX2-LABEL: test_x86_avx2_psrlv_d_const:
				; X64-AVX2: ## %bb.0:
				; X64-AVX2-NEXT: vmovdqa {{LCPI.*}}(%rip), %xmm2 ## xmm2 = [2,9,0,4294967295]
				; X64-AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x15,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vpsrlvd {{LCPI.*}}(%rip), %xmm2, %xmm2 ## encoding: [0xc4,0xe2,0x69,0x45,0x15,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vmovdqa {{LCPI.*}}(%rip), %xmm3 ## xmm3 = [4,4,4,4294967295]
				; X64-AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x1d,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vpsrlvd {{LCPI.*}}(%rip), %xmm3, %xmm3 ## encoding: [0xc4,0xe2,0x61,0x45,0x1d,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2: retq ## encoding: [0xc3]

				; X64-AVX512VL-LABEL: test_x86_avx2_psrlv_d_const:
				; X64-AVX512VL: ## %bb.0:
				; X64-AVX512VL-NEXT: vmovdqa {{LCPI.*}}(%rip), %xmm2 ## EVEX TO VEX Compression xmm2 = [2,9,0,4294967295]
				; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x15,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vpsrlvd {{LCPI.*}}(%rip), %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x69,0x45,0x15,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vmovdqa {{LCPI.*}}(%rip), %xmm3 ## EVEX TO VEX Compression xmm3 = [4,4,4,4294967295]
				; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x1d,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vpsrlvd {{LCPI.*}}(%rip), %xmm3, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x61,0x45,0x1d,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL: retq ## encoding: [0xc3]

				%res0 = call <4 x i32> @llvm.x86.avx2.psrlv.d(<4 x i32> <i32 2, i32 9, i32 0, i32 -1>, <4 x i32> <i32 1, i32 0, i32 33, i32 -1>) ; <<4 x i32>> [#uses=1]
				%res2 = add <4 x i32> %a0, %res0
				%res1 = call <4 x i32> @llvm.x86.avx2.psrlv.d(<4 x i32> <i32 4, i32 4, i32 4, i32 -1>, <4 x i32> <i32 1, i32 1, i32 1, i32 -1>) ; <<4 x i32>> [#uses=1]
				%res3 = add <4 x i32> %a1, %res1
				%res4 = add <4 x i32> %res2, %res3
				ret <4 x i32> %res4
				}
	declare <4 x i32> @llvm.x86.avx2.psrlv.d(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.avx2.psrlv.d(<4 x i32>, <4 x i32>) nounwind readnone


	define <8 x i32> @test_x86_avx2_psrlv_d_256(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_psrlv_d_256(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrlv_d_256:			; AVX2-LABEL: test_x86_avx2_psrlv_d_256:
	; AVX2: ## %bb.0:			; AVX2: ## %bb.0:
	; AVX2-NEXT: vpsrlvd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x45,0xc1]			; AVX2-NEXT: vpsrlvd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x45,0xc1]
	; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrlv_d_256:			; AVX512VL-LABEL: test_x86_avx2_psrlv_d_256:
	; AVX512VL: ## %bb.0:			; AVX512VL: ## %bb.0:
	; AVX512VL-NEXT: vpsrlvd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x45,0xc1]			; AVX512VL-NEXT: vpsrlvd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x45,0xc1]
	; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psrlv.d.256(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.psrlv.d.256(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}

				define <8 x i32> @test_x86_avx2_psrlv_d_256_const(<8 x i32> %a0, <8 x i32> %a1) {
				; X86-AVX2-LABEL: test_x86_avx2_psrlv_d_256_const:
				; X86-AVX2: ## %bb.0:
				; X86-AVX2-NEXT: vmovdqa {{LCPI.*}}, %ymm2 ## ymm2 = [2,9,0,4294967295,3,7,4294967295,0]
				; X86-AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x15,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vpsrlvd {{LCPI.*}}, %ymm2, %ymm2 ## encoding: [0xc4,0xe2,0x6d,0x45,0x15,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vmovdqa {{LCPI.*}}, %ymm3 ## ymm3 = [4,4,4,4,4,4,4,4294967295]
				; X86-AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x1d,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vpsrlvd {{LCPI.*}}, %ymm3, %ymm3 ## encoding: [0xc4,0xe2,0x65,0x45,0x1d,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2: retl ## encoding: [0xc3]

				; X86-AVX512VL-LABEL: test_x86_avx2_psrlv_d_256_const:
				; X86-AVX512VL: ## %bb.0:
				; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %ymm2 ## EVEX TO VEX Compression ymm2 = [2,9,0,4294967295,3,7,4294967295,0]
				; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x15,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vpsrlvd {{LCPI.*}}, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x6d,0x45,0x15,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %ymm3 ## EVEX TO VEX Compression ymm3 = [4,4,4,4,4,4,4,4294967295]
				; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x1d,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vpsrlvd {{LCPI.*}}, %ymm3, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x65,0x45,0x1d,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL: retl ## encoding: [0xc3]

				; X64-AVX2-LABEL: test_x86_avx2_psrlv_d_256_const:
				; X64-AVX2: ## %bb.0:
				; X64-AVX2-NEXT: vmovdqa {{LCPI.*}}(%rip), %ymm2 ## ymm2 = [2,9,0,4294967295,3,7,4294967295,0]
				; X64-AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x15,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vpsrlvd {{LCPI.*}}(%rip), %ymm2, %ymm2 ## encoding: [0xc4,0xe2,0x6d,0x45,0x15,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vmovdqa {{LCPI.*}}(%rip), %ymm3 ## ymm3 = [4,4,4,4,4,4,4,4294967295]
				; X64-AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x1d,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vpsrlvd {{LCPI.*}}(%rip), %ymm3, %ymm3 ## encoding: [0xc4,0xe2,0x65,0x45,0x1d,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2: retq ## encoding: [0xc3]

				; X64-AVX512VL-LABEL: test_x86_avx2_psrlv_d_256_const:
				; X64-AVX512VL: ## %bb.0:
				; X64-AVX512VL-NEXT: vmovdqa {{LCPI.*}}(%rip), %ymm2 ## EVEX TO VEX Compression ymm2 = [2,9,0,4294967295,3,7,4294967295,0]
				; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x15,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vpsrlvd {{LCPI.*}}(%rip), %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x6d,0x45,0x15,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vmovdqa {{LCPI.*}}(%rip), %ymm3 ## EVEX TO VEX Compression ymm3 = [4,4,4,4,4,4,4,4294967295]
				; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x1d,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vpsrlvd {{LCPI.*}}(%rip), %ymm3, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x65,0x45,0x1d,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL: retq ## encoding: [0xc3]

				%res0 = call <8 x i32> @llvm.x86.avx2.psrlv.d.256(<8 x i32> <i32 2, i32 9, i32 0, i32 -1, i32 3, i32 7, i32 -1, i32 0>, <8 x i32> <i32 1, i32 0, i32 33, i32 -1, i32 2, i32 0, i32 34, i32 -2>) ; <<8 x i32>> [#uses=1]
				%res2 = add <8 x i32> %a0, %res0
				%res1 = call <8 x i32> @llvm.x86.avx2.psrlv.d.256(<8 x i32> <i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 -1>, <8 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 -1>) ; <<8 x i32>> [#uses=1]
				%res3 = add <8 x i32> %a1, %res1
				%res4 = add <8 x i32> %res2, %res3
				ret <8 x i32> %res4
				}
	declare <8 x i32> @llvm.x86.avx2.psrlv.d.256(<8 x i32>, <8 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.psrlv.d.256(<8 x i32>, <8 x i32>) nounwind readnone


	define <2 x i64> @test_x86_avx2_psrlv_q(<2 x i64> %a0, <2 x i64> %a1) {			define <2 x i64> @test_x86_avx2_psrlv_q(<2 x i64> %a0, <2 x i64> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrlv_q:			; AVX2-LABEL: test_x86_avx2_psrlv_q:
	; AVX2: ## %bb.0:			; AVX2: ## %bb.0:
	; AVX2-NEXT: vpsrlvq %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0xf9,0x45,0xc1]			; AVX2-NEXT: vpsrlvq %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0xf9,0x45,0xc1]
	; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrlv_q:			; AVX512VL-LABEL: test_x86_avx2_psrlv_q:
	; AVX512VL: ## %bb.0:			; AVX512VL: ## %bb.0:
	; AVX512VL-NEXT: vpsrlvq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0x45,0xc1]			; AVX512VL-NEXT: vpsrlvq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0x45,0xc1]
	; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.avx2.psrlv.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.avx2.psrlv.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}

				define <2 x i64> @test_x86_avx2_psrlv_q_const(<2 x i64> %a0, <2 x i64> %a1) {
				; X86-AVX2-LABEL: test_x86_avx2_psrlv_q_const:
				; X86-AVX2: ## %bb.0:
				; X86-AVX2-NEXT: vmovdqa {{LCPI.*}}, %xmm0 ## xmm0 = [4,0,4,0]
				; X86-AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vpsrlvq {{LCPI.*}}, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0xf9,0x45,0x05,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: retl ## encoding: [0xc3]

				; X86-AVX512VL-LABEL: test_x86_avx2_psrlv_q_const:
				; X86-AVX512VL: ## %bb.0:
				; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %xmm0 ## EVEX TO VEX Compression xmm0 = [4,0,4,0]
				; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vpsrlvq {{LCPI.*}}, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0x45,0x05,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: retl ## encoding: [0xc3]

				; X64-AVX2-LABEL: test_x86_avx2_psrlv_q_const:
				; X64-AVX2: ## %bb.0:
				; X64-AVX2-NEXT: vpbroadcastq {{LCPI.*}}(%rip), %xmm0 ## xmm0 = [4,4]
				; X64-AVX2-NEXT: ## encoding: [0xc4,0xe2,0x79,0x59,0x05,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vpsrlvq {{LCPI.*}}(%rip), %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0xf9,0x45,0x05,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: retq ## encoding: [0xc3]

				; X64-AVX512VL-LABEL: test_x86_avx2_psrlv_q_const:
				; X64-AVX512VL: ## %bb.0:
				; X64-AVX512VL-NEXT: vpbroadcastq {{LCPI.*}}(%rip), %xmm0 ## EVEX TO VEX Compression xmm0 = [4,4]
				; X64-AVX512VL-NEXT: ## encoding: [0xc4,0xe2,0x79,0x59,0x05,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vpsrlvq {{LCPI.*}}(%rip), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0x45,0x05,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: retq ## encoding: [0xc3]
				%res = call <2 x i64> @llvm.x86.avx2.psrlv.q(<2 x i64> <i64 4, i64 4>, <2 x i64> <i64 1, i64 -1>)
				ret <2 x i64> %res
				}
	declare <2 x i64> @llvm.x86.avx2.psrlv.q(<2 x i64>, <2 x i64>) nounwind readnone			declare <2 x i64> @llvm.x86.avx2.psrlv.q(<2 x i64>, <2 x i64>) nounwind readnone


	define <4 x i64> @test_x86_avx2_psrlv_q_256(<4 x i64> %a0, <4 x i64> %a1) {			define <4 x i64> @test_x86_avx2_psrlv_q_256(<4 x i64> %a0, <4 x i64> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrlv_q_256:			; AVX2-LABEL: test_x86_avx2_psrlv_q_256:
	; AVX2: ## %bb.0:			; AVX2: ## %bb.0:
	; AVX2-NEXT: vpsrlvq %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0xfd,0x45,0xc1]			; AVX2-NEXT: vpsrlvq %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0xfd,0x45,0xc1]
	; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrlv_q_256:			; AVX512VL-LABEL: test_x86_avx2_psrlv_q_256:
	; AVX512VL: ## %bb.0:			; AVX512VL: ## %bb.0:
	; AVX512VL-NEXT: vpsrlvq %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0x45,0xc1]			; AVX512VL-NEXT: vpsrlvq %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0x45,0xc1]
	; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx2.psrlv.q.256(<4 x i64> %a0, <4 x i64> %a1) ; <<4 x i64>> [#uses=1]			%res = call <4 x i64> @llvm.x86.avx2.psrlv.q.256(<4 x i64> %a0, <4 x i64> %a1) ; <<4 x i64>> [#uses=1]
	ret <4 x i64> %res			ret <4 x i64> %res
	}			}


				define <4 x i64> @test_x86_avx2_psrlv_q_256_const(<4 x i64> %a0, <4 x i64> %a1) {
				; X86-AVX2-LABEL: test_x86_avx2_psrlv_q_256_const:
				; X86-AVX2: ## %bb.0:
				; X86-AVX2-NEXT: vmovdqa {{LCPI.*}}, %ymm0 ## ymm0 = [4,0,4,0,4,0,4,0]
				; X86-AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: vpsrlvq {{LCPI.*}}, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0xfd,0x45,0x05,A,A,A,A]
				; X86-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX2-NEXT: retl ## encoding: [0xc3]

				; X86-AVX512VL-LABEL: test_x86_avx2_psrlv_q_256_const:
				; X86-AVX512VL: ## %bb.0:
				; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %ymm0 ## EVEX TO VEX Compression ymm0 = [4,0,4,0,4,0,4,0]
				; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: vpsrlvq {{LCPI.*}}, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0x45,0x05,A,A,A,A]
				; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
				; X86-AVX512VL-NEXT: retl ## encoding: [0xc3]

				; X64-AVX2-LABEL: test_x86_avx2_psrlv_q_256_const:
				; X64-AVX2: ## %bb.0:
				; X64-AVX2-NEXT: vpbroadcastq {{LCPI.*}}(%rip), %ymm0 ## ymm0 = [4,4,4,4]
				; X64-AVX2-NEXT: ## encoding: [0xc4,0xe2,0x7d,0x59,0x05,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: vpsrlvq {{LCPI.*}}(%rip), %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0xfd,0x45,0x05,A,A,A,A]
				; X64-AVX2-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX2-NEXT: retq ## encoding: [0xc3]

				; X64-AVX512VL-LABEL: test_x86_avx2_psrlv_q_256_const:
				; X64-AVX512VL: ## %bb.0:
				; X64-AVX512VL-NEXT: vpbroadcastq {{LCPI.*}}(%rip), %ymm0 ## EVEX TO VEX Compression ymm0 = [4,4,4,4]
				; X64-AVX512VL-NEXT: ## encoding: [0xc4,0xe2,0x7d,0x59,0x05,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: vpsrlvq {{LCPI.*}}(%rip), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0x45,0x05,A,A,A,A]
				; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
				; X64-AVX512VL-NEXT: retq ## encoding: [0xc3]
				%res = call <4 x i64> @llvm.x86.avx2.psrlv.q.256(<4 x i64> <i64 4, i64 4, i64 4, i64 4>, <4 x i64> <i64 1, i64 1, i64 1, i64 -1>)
				ret <4 x i64> %res
				}
	declare <4 x i64> @llvm.x86.avx2.psrlv.q.256(<4 x i64>, <4 x i64>) nounwind readnone			declare <4 x i64> @llvm.x86.avx2.psrlv.q.256(<4 x i64>, <4 x i64>) nounwind readnone


	define <4 x i32> @test_x86_avx2_psrav_d(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_avx2_psrav_d(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrav_d:			; AVX2-LABEL: test_x86_avx2_psrav_d:
	; AVX2: ## %bb.0:			; AVX2: ## %bb.0:
	; AVX2-NEXT: vpsravd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x46,0xc1]			; AVX2-NEXT: vpsravd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x46,0xc1]
	; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrav_d:			; AVX512VL-LABEL: test_x86_avx2_psrav_d:
	; AVX512VL: ## %bb.0:			; AVX512VL: ## %bb.0:
	; AVX512VL-NEXT: vpsravd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x46,0xc1]			; AVX512VL-NEXT: vpsravd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x46,0xc1]
	; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}


	define <4 x i32> @test_x86_avx2_psrav_d_const(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_avx2_psrav_d_const(<4 x i32> %a0, <4 x i32> %a1) {
	; X86-AVX-LABEL: test_x86_avx2_psrav_d_const:			; X86-AVX-LABEL: test_x86_avx2_psrav_d_const:
	; X86-AVX: ## %bb.0:			; X86-AVX: ## %bb.0:
	; X86-AVX-NEXT: vmovdqa {{.*#+}} xmm0 = [2,9,4294967284,23]			; X86-AVX-NEXT: vmovdqa {{LCPI.*}}, %xmm0 ## xmm0 = [2,9,4294967284,23]
	; X86-AVX-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]			; X86-AVX-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
	; X86-AVX-NEXT: ## fixup A - offset: 4, value: LCPI79_0, kind: FK_Data_4			; X86-AVX-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
	; X86-AVX-NEXT: vpsravd LCPI79_1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x46,0x05,A,A,A,A]			; X86-AVX-NEXT: vpsravd {{LCPI.*}}, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x46,0x05,A,A,A,A]
	; X86-AVX-NEXT: ## fixup A - offset: 5, value: LCPI79_1, kind: FK_Data_4			; X86-AVX-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
	; X86-AVX-NEXT: retl ## encoding: [0xc3]			; X86-AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; X86-AVX512VL-LABEL: test_x86_avx2_psrav_d_const:			; X86-AVX512VL-LABEL: test_x86_avx2_psrav_d_const:
	; X86-AVX512VL: ## %bb.0:			; X86-AVX512VL: ## %bb.0:
	; X86-AVX512VL-NEXT: vmovdqa LCPI79_0, %xmm0 ## EVEX TO VEX Compression xmm0 = [2,9,4294967284,23]			; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %xmm0 ## EVEX TO VEX Compression xmm0 = [2,9,4294967284,23]
	; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]			; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
	; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: LCPI79_0, kind: FK_Data_4			; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
	; X86-AVX512VL-NEXT: vpsravd LCPI79_1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x46,0x05,A,A,A,A]			; X86-AVX512VL-NEXT: vpsravd {{LCPI.*}}, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x46,0x05,A,A,A,A]
	; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: LCPI79_1, kind: FK_Data_4			; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
	; X86-AVX512VL-NEXT: retl ## encoding: [0xc3]			; X86-AVX512VL-NEXT: retl ## encoding: [0xc3]
	;			;
	; X64-AVX-LABEL: test_x86_avx2_psrav_d_const:			; X64-AVX-LABEL: test_x86_avx2_psrav_d_const:
	; X64-AVX: ## %bb.0:			; X64-AVX: ## %bb.0:
	; X64-AVX-NEXT: vmovdqa {{.*#+}} xmm0 = [2,9,4294967284,23]			; X64-AVX-NEXT: vmovdqa {{LCPI.*}}(%rip), %xmm0 ## xmm0 = [2,9,4294967284,23]
	; X64-AVX-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]			; X64-AVX-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
	; X64-AVX-NEXT: ## fixup A - offset: 4, value: LCPI79_0-4, kind: reloc_riprel_4byte			; X64-AVX-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
	; X64-AVX-NEXT: vpsravd {{.*}}(%rip), %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x46,0x05,A,A,A,A]			; X64-AVX-NEXT: vpsravd {{LCPI.*}}(%rip), %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x46,0x05,A,A,A,A]
	; X64-AVX-NEXT: ## fixup A - offset: 5, value: LCPI79_1-4, kind: reloc_riprel_4byte			; X64-AVX-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
	; X64-AVX-NEXT: retq ## encoding: [0xc3]			; X64-AVX-NEXT: retq ## encoding: [0xc3]
	;			;
	; X64-AVX512VL-LABEL: test_x86_avx2_psrav_d_const:			; X64-AVX512VL-LABEL: test_x86_avx2_psrav_d_const:
	; X64-AVX512VL: ## %bb.0:			; X64-AVX512VL: ## %bb.0:
	; X64-AVX512VL-NEXT: vmovdqa {{.*}}(%rip), %xmm0 ## EVEX TO VEX Compression xmm0 = [2,9,4294967284,23]			; X64-AVX512VL-NEXT: vmovdqa {{LCPI.*}}(%rip), %xmm0 ## EVEX TO VEX Compression xmm0 = [2,9,4294967284,23]
	; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]			; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
	; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: LCPI79_0-4, kind: reloc_riprel_4byte			; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
	; X64-AVX512VL-NEXT: vpsravd {{.*}}(%rip), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x46,0x05,A,A,A,A]			; X64-AVX512VL-NEXT: vpsravd {{LCPI.*}}(%rip), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x46,0x05,A,A,A,A]
	; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: LCPI79_1-4, kind: reloc_riprel_4byte			; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
	; X64-AVX512VL-NEXT: retq ## encoding: [0xc3]			; X64-AVX512VL-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32> <i32 2, i32 9, i32 -12, i32 23>, <4 x i32> <i32 1, i32 18, i32 35, i32 52>)			%res = call <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32> <i32 2, i32 9, i32 -12, i32 23>, <4 x i32> <i32 1, i32 18, i32 35, i32 52>)
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32>, <4 x i32>) nounwind readnone


	define <8 x i32> @test_x86_avx2_psrav_d_256(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_psrav_d_256(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrav_d_256:			; AVX2-LABEL: test_x86_avx2_psrav_d_256:
	; AVX2: ## %bb.0:			; AVX2: ## %bb.0:
	; AVX2-NEXT: vpsravd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x46,0xc1]			; AVX2-NEXT: vpsravd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x46,0xc1]
	; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX2-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrav_d_256:			; AVX512VL-LABEL: test_x86_avx2_psrav_d_256:
	; AVX512VL: ## %bb.0:			; AVX512VL: ## %bb.0:
	; AVX512VL-NEXT: vpsravd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x46,0xc1]			; AVX512VL-NEXT: vpsravd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x46,0xc1]
	; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]			; AVX512VL-NEXT: ret{{[l\|q]}} ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psrav.d.256(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.psrav.d.256(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}


	define <8 x i32> @test_x86_avx2_psrav_d_256_const(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_psrav_d_256_const(<8 x i32> %a0, <8 x i32> %a1) {
	; X86-AVX-LABEL: test_x86_avx2_psrav_d_256_const:			; X86-AVX-LABEL: test_x86_avx2_psrav_d_256_const:
	; X86-AVX: ## %bb.0:			; X86-AVX: ## %bb.0:
	; X86-AVX-NEXT: vmovdqa {{.*#+}} ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]			; X86-AVX-NEXT: vmovdqa {{LCPI.*}}, %ymm0 ## ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]
	; X86-AVX-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]			; X86-AVX-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
	; X86-AVX-NEXT: ## fixup A - offset: 4, value: LCPI81_0, kind: FK_Data_4			; X86-AVX-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
	; X86-AVX-NEXT: vpsravd LCPI81_1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x46,0x05,A,A,A,A]			; X86-AVX-NEXT: vpsravd {{LCPI.*}}, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x46,0x05,A,A,A,A]
	; X86-AVX-NEXT: ## fixup A - offset: 5, value: LCPI81_1, kind: FK_Data_4			; X86-AVX-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
	; X86-AVX-NEXT: retl ## encoding: [0xc3]			; X86-AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; X86-AVX512VL-LABEL: test_x86_avx2_psrav_d_256_const:			; X86-AVX512VL-LABEL: test_x86_avx2_psrav_d_256_const:
	; X86-AVX512VL: ## %bb.0:			; X86-AVX512VL: ## %bb.0:
	; X86-AVX512VL-NEXT: vmovdqa LCPI81_0, %ymm0 ## EVEX TO VEX Compression ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]			; X86-AVX512VL-NEXT: vmovdqa {{LCPI.*}}, %ymm0 ## EVEX TO VEX Compression ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]
	; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]			; X86-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
	; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: LCPI81_0, kind: FK_Data_4			; X86-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: FK_Data_4
	; X86-AVX512VL-NEXT: vpsravd LCPI81_1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x46,0x05,A,A,A,A]			; X86-AVX512VL-NEXT: vpsravd {{LCPI.*}}, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x46,0x05,A,A,A,A]
	; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: LCPI81_1, kind: FK_Data_4			; X86-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: FK_Data_4
	; X86-AVX512VL-NEXT: retl ## encoding: [0xc3]			; X86-AVX512VL-NEXT: retl ## encoding: [0xc3]
	;			;
	; X64-AVX-LABEL: test_x86_avx2_psrav_d_256_const:			; X64-AVX-LABEL: test_x86_avx2_psrav_d_256_const:
	; X64-AVX: ## %bb.0:			; X64-AVX: ## %bb.0:
	; X64-AVX-NEXT: vmovdqa {{.*#+}} ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]			; X64-AVX-NEXT: vmovdqa {{LCPI.*}}(%rip), %ymm0 ## ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]
	; X64-AVX-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]			; X64-AVX-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
	; X64-AVX-NEXT: ## fixup A - offset: 4, value: LCPI81_0-4, kind: reloc_riprel_4byte			; X64-AVX-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
	; X64-AVX-NEXT: vpsravd {{.*}}(%rip), %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x46,0x05,A,A,A,A]			; X64-AVX-NEXT: vpsravd {{LCPI.*}}(%rip), %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x46,0x05,A,A,A,A]
	; X64-AVX-NEXT: ## fixup A - offset: 5, value: LCPI81_1-4, kind: reloc_riprel_4byte			; X64-AVX-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
	; X64-AVX-NEXT: retq ## encoding: [0xc3]			; X64-AVX-NEXT: retq ## encoding: [0xc3]
	;			;
	; X64-AVX512VL-LABEL: test_x86_avx2_psrav_d_256_const:			; X64-AVX512VL-LABEL: test_x86_avx2_psrav_d_256_const:
	; X64-AVX512VL: ## %bb.0:			; X64-AVX512VL: ## %bb.0:
	; X64-AVX512VL-NEXT: vmovdqa {{.*}}(%rip), %ymm0 ## EVEX TO VEX Compression ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]			; X64-AVX512VL-NEXT: vmovdqa {{LCPI.*}}(%rip), %ymm0 ## EVEX TO VEX Compression ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]
	; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]			; X64-AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
	; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: LCPI81_0-4, kind: reloc_riprel_4byte			; X64-AVX512VL-NEXT: ## fixup A - offset: 4, value: {{LCPI.*}}, kind: reloc_riprel_4byte
	; X64-AVX512VL-NEXT: vpsravd {{.*}}(%rip), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x46,0x05,A,A,A,A]			; X64-AVX512VL-NEXT: vpsravd {{LCPI.*}}(%rip), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x46,0x05,A,A,A,A]
	; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: LCPI81_1-4, kind: reloc_riprel_4byte			; X64-AVX512VL-NEXT: ## fixup A - offset: 5, value: {{LCPI.*}}, kind: reloc_riprel_4byte
	; X64-AVX512VL-NEXT: retq ## encoding: [0xc3]			; X64-AVX512VL-NEXT: retq ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psrav.d.256(<8 x i32> <i32 2, i32 9, i32 -12, i32 23, i32 -26, i32 37, i32 -40, i32 51>, <8 x i32> <i32 1, i32 18, i32 35, i32 52, i32 69, i32 15, i32 32, i32 49>)			%res = call <8 x i32> @llvm.x86.avx2.psrav.d.256(<8 x i32> <i32 2, i32 9, i32 -12, i32 23, i32 -26, i32 37, i32 -40, i32 51>, <8 x i32> <i32 1, i32 18, i32 35, i32 52, i32 69, i32 15, i32 32, i32 49>)
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.psrav.d.256(<8 x i32>, <8 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.psrav.d.256(<8 x i32>, <8 x i32>) nounwind readnone


	define <2 x double> @test_x86_avx2_gather_d_pd(<2 x double> %a0, i8* %a1, <4 x i32> %idx, <2 x double> %mask) {			define <2 x double> @test_x86_avx2_gather_d_pd(<2 x double> %a0, i8* %a1, <4 x i32> %idx, <2 x double> %mask) {
	; X86-LABEL: test_x86_avx2_gather_d_pd:			; X86-LABEL: test_x86_avx2_gather_d_pd:
	; X86: ## %bb.0:			; X86: ## %bb.0:
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; X86-NEXT: vgatherdpd %xmm2, (%eax,%xmm1,2), %xmm0 ## encoding: [0xc4,0xe2,0xe9,0x92,0x04,0x48]			; X86-NEXT: vgatherdpd %xmm2, (%eax,%xmm1,2), %xmm0 ## encoding: [0xc4,0xe2,0xe9,0x92,0x04,0x48]
	; X86-NEXT: retl ## encoding: [0xc3]			; X86-NEXT: retl ## encoding: [0xc3]
	;			;
	; X64-LABEL: test_x86_avx2_gather_d_pd:			; X64-LABEL: test_x86_avx2_gather_d_pd:
	[... 328 lines not shown ...]
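
For reference, the lane values that the `psrlv` `_const` tests above are meant to pin down can be modelled outside the compiler. The following is a minimal sketch, not code from this patch: the helper name `psrlv_lanes` is hypothetical, and it only assumes the behaviour stated in the summary, namely that the intrinsic returns 0 for an out-of-range shift amount.

```python
def psrlv_lanes(values, counts, bits=32):
    """Per-lane logical right shift; a count >= bits yields 0 (assumed semantics)."""
    mask = (1 << bits) - 1
    out = []
    for v, c in zip(values, counts):
        v &= mask  # reinterpret the lane as an unsigned value
        c &= mask  # the shift count is treated as unsigned too
        out.append(v >> c if c < bits else 0)
    return out


# Operands of the two calls in test_x86_avx2_psrlv_d_const.
print(psrlv_lanes([2, 9, 0, -1], [1, 0, 33, -1]))  # [1, 9, 0, 0]
print(psrlv_lanes([4, 4, 4, -1], [1, 1, 1, -1]))   # [2, 2, 2, 0]
```

The second call is the interesting one: the three in-range lanes all produce 2, so treating the last lane as undef would allow a fold to a splat of 2, whereas the intrinsic requires that lane to be 0.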

test/CodeGen/X86/avx512-intrinsics.ll

	[... 5,220 lines not shown ...]
	; CHECK-LABEL: test_x86_avx512_psllv_d_512:			; CHECK-LABEL: test_x86_avx512_psllv_d_512:
	; CHECK: ## %bb.0:			; CHECK: ## %bb.0:
	; CHECK-NEXT: vpsllvd %zmm1, %zmm0, %zmm0			; CHECK-NEXT: vpsllvd %zmm1, %zmm0, %zmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <16 x i32> @llvm.x86.avx512.psllv.d.512(<16 x i32> %a0, <16 x i32> %a1)			%res = call <16 x i32> @llvm.x86.avx512.psllv.d.512(<16 x i32> %a0, <16 x i32> %a1)
	ret <16 x i32> %res			ret <16 x i32> %res
	}			}

				define <16 x i32> @test_x86_avx512_psllv_d_512_const(<16 x i32> %a0, <16 x i32> %a1) {
				; CHECK-LABEL: test_x86_avx512_psllv_d_512_const:
				; CHECK: ## %bb.0:
				; CHECK-NEXT: vmovdqa64 {{LCPI.*}}(%rip), %zmm2 ## zmm2 = [2,9,0,4294967295,3,7,4294967295,0,4,5,4294967294,0,5,3,4294967293,0]
				; CHECK-NEXT: vpsllvd {{LCPI.*}}(%rip), %zmm2, %zmm2
				; CHECK-NEXT: vmovdqa64 {{LCPI.*}}(%rip), %zmm3 ## zmm3 = [4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4294967295]
				; CHECK-NEXT: vpsllvd {{LCPI.*}}(%rip), %zmm3, %zmm3
				; CHECK: retq
				%res0 = call <16 x i32> @llvm.x86.avx512.psllv.d.512(<16 x i32> <i32 2, i32 9, i32 0, i32 -1, i32 3, i32 7, i32 -1, i32 0, i32 4, i32 5, i32 -2, i32 0, i32 5, i32 3, i32 -3, i32 0>, <16 x i32> <i32 1, i32 0, i32 33, i32 -1, i32 2, i32 0, i32 34, i32 -2, i32 3, i32 0, i32 35, i32 -1, i32 4, i32 0, i32 36, i32 -3>)
				%res2 = add <16 x i32> %a0, %res0
				%res1 = call <16 x i32> @llvm.x86.avx512.psllv.d.512(<16 x i32> <i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 -1>, <16 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 -1>)
				%res3 = add <16 x i32> %a1, %res1
				%res4 = add <16 x i32> %res2, %res3
				ret <16 x i32> %res4
				}

	define <16 x i32> @test_x86_avx512_mask_psllv_d_512(<16 x i32> %a0, <16 x i32> %a1, <16 x i32> %a2, i16 %mask) {			define <16 x i32> @test_x86_avx512_mask_psllv_d_512(<16 x i32> %a0, <16 x i32> %a1, <16 x i32> %a2, i16 %mask) {
	; CHECK-LABEL: test_x86_avx512_mask_psllv_d_512:			; CHECK-LABEL: test_x86_avx512_mask_psllv_d_512:
	; CHECK: ## %bb.0:			; CHECK: ## %bb.0:
	; CHECK-NEXT: kmovw %edi, %k1			; CHECK-NEXT: kmovw %edi, %k1
	; CHECK-NEXT: vpsllvd %zmm1, %zmm0, %zmm2 {%k1}			; CHECK-NEXT: vpsllvd %zmm1, %zmm0, %zmm2 {%k1}
	; CHECK-NEXT: vmovdqa64 %zmm2, %zmm0			; CHECK-NEXT: vmovdqa64 %zmm2, %zmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <16 x i32> @llvm.x86.avx512.psllv.d.512(<16 x i32> %a0, <16 x i32> %a1)			%res = call <16 x i32> @llvm.x86.avx512.psllv.d.512(<16 x i32> %a0, <16 x i32> %a1)
	[... 20 lines not shown ...]
	; CHECK-LABEL: test_x86_avx512_psllv_q_512:			; CHECK-LABEL: test_x86_avx512_psllv_q_512:
	; CHECK: ## %bb.0:			; CHECK: ## %bb.0:
	; CHECK-NEXT: vpsllvq %zmm1, %zmm0, %zmm0			; CHECK-NEXT: vpsllvq %zmm1, %zmm0, %zmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <8 x i64> @llvm.x86.avx512.psllv.q.512(<8 x i64> %a0, <8 x i64> %a1)			%res = call <8 x i64> @llvm.x86.avx512.psllv.q.512(<8 x i64> %a0, <8 x i64> %a1)
	ret <8 x i64> %res			ret <8 x i64> %res
	}			}

				define <8 x i64> @test_x86_avx512_psllv_q_512_const(<8 x i64> %a0, <8 x i64> %a1) {
				; CHECK-LABEL: test_x86_avx512_psllv_q_512_const:
				; CHECK: ## %bb.0:
				; CHECK-NEXT: vmovdqa64 {{LCPI.*}}(%rip), %zmm2 ## zmm2 = [2,9,0,18446744073709551615,3,7,18446744073709551615,0]
				; CHECK-NEXT: vpsllvq {{LCPI.*}}(%rip), %zmm2, %zmm2
				; CHECK-NEXT: vmovdqa64 {{LCPI.*}}(%rip), %zmm3 ## zmm3 = [4,4,4,4,4,4,4,18446744073709551615]
				; CHECK-NEXT: vpsllvq {{LCPI.*}}(%rip), %zmm3, %zmm3
				; CHECK: retq
				%res0 = call <8 x i64> @llvm.x86.avx512.psllv.q.512(<8 x i64> <i64 2, i64 9, i64 0, i64 -1, i64 3, i64 7, i64 -1, i64 0>, <8 x i64> <i64 1, i64 0, i64 33, i64 -1, i64 2, i64 0, i64 34, i64 -2>)
				%res2 = add <8 x i64> %a0, %res0
				%res1 = call <8 x i64> @llvm.x86.avx512.psllv.q.512(<8 x i64> <i64 4, i64 4, i64 4, i64 4, i64 4, i64 4, i64 4, i64 -1>, <8 x i64> <i64 1, i64 1, i64 1, i64 1, i64 1, i64 1, i64 1, i64 -1>)
				%res3 = add <8 x i64> %a1, %res1
				%res4 = add <8 x i64> %res2, %res3
				ret <8 x i64> %res4
				}

	define <8 x i64> @test_x86_avx512_mask_psllv_q_512(<8 x i64> %a0, <8 x i64> %a1, <8 x i64> %a2, i8 %mask) {			define <8 x i64> @test_x86_avx512_mask_psllv_q_512(<8 x i64> %a0, <8 x i64> %a1, <8 x i64> %a2, i8 %mask) {
	; CHECK-LABEL: test_x86_avx512_mask_psllv_q_512:			; CHECK-LABEL: test_x86_avx512_mask_psllv_q_512:
	; CHECK: ## %bb.0:			; CHECK: ## %bb.0:
	; CHECK-NEXT: kmovw %edi, %k1			; CHECK-NEXT: kmovw %edi, %k1
	; CHECK-NEXT: vpsllvq %zmm1, %zmm0, %zmm2 {%k1}			; CHECK-NEXT: vpsllvq %zmm1, %zmm0, %zmm2 {%k1}
	; CHECK-NEXT: vmovdqa64 %zmm2, %zmm0			; CHECK-NEXT: vmovdqa64 %zmm2, %zmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <8 x i64> @llvm.x86.avx512.psllv.q.512(<8 x i64> %a0, <8 x i64> %a1)			%res = call <8 x i64> @llvm.x86.avx512.psllv.q.512(<8 x i64> %a0, <8 x i64> %a1)
	[... 92 lines not shown ...]
	; CHECK-LABEL: test_x86_avx512_psrlv_d_512:			; CHECK-LABEL: test_x86_avx512_psrlv_d_512:
	; CHECK: ## %bb.0:			; CHECK: ## %bb.0:
	; CHECK-NEXT: vpsrlvd %zmm1, %zmm0, %zmm0			; CHECK-NEXT: vpsrlvd %zmm1, %zmm0, %zmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <16 x i32> @llvm.x86.avx512.psrlv.d.512(<16 x i32> %a0, <16 x i32> %a1)			%res = call <16 x i32> @llvm.x86.avx512.psrlv.d.512(<16 x i32> %a0, <16 x i32> %a1)
	ret <16 x i32> %res			ret <16 x i32> %res
	}			}

				define <16 x i32> @test_x86_avx512_psrlv_d_512_const(<16 x i32> %a0, <16 x i32> %a1) {
				; CHECK-LABEL: test_x86_avx512_psrlv_d_512_const:
				; CHECK: ## %bb.0:
				; CHECK-NEXT: vmovdqa64 {{LCPI.*}}(%rip), %zmm2 ## zmm2 = [2,9,0,4294967295,3,7,4294967295,0,4,5,4294967294,0,5,3,4294967293,0]
				; CHECK-NEXT: vpsrlvd {{LCPI.*}}(%rip), %zmm2, %zmm2
				; CHECK-NEXT: vmovdqa64 {{LCPI.*}}(%rip), %zmm3 ## zmm3 = [4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4294967295]
				; CHECK-NEXT: vpsrlvd {{LCPI.*}}(%rip), %zmm3, %zmm3
				; CHECK: retq
				%res0 = call <16 x i32> @llvm.x86.avx512.psrlv.d.512(<16 x i32> <i32 2, i32 9, i32 0, i32 -1, i32 3, i32 7, i32 -1, i32 0, i32 4, i32 5, i32 -2, i32 0, i32 5, i32 3, i32 -3, i32 0>, <16 x i32> <i32 1, i32 0, i32 33, i32 -1,i32 2, i32 0, i32 34, i32 -2, i32 3, i32 0, i32 35, i32 -1, i32 4, i32 0, i32 36, i32 -3>)
				%res2 = add <16 x i32> %a0, %res0
				%res1 = call <16 x i32> @llvm.x86.avx512.psrlv.d.512(<16 x i32> <i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 4, i32 -1>, <16 x i32> <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 -1 >)
				%res3 = add <16 x i32> %a1, %res1
				%res4 = add <16 x i32> %res2, %res3
				ret <16 x i32> %res4
				}
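
Note on lane width: the first call in the test above reuses counts such as 33 to 36, which are out of range for 32-bit lanes but would be in range for 64-bit lanes. The sketch below is a hypothetical model (`psrlv_ref` is an illustrative name, not an LLVM or intrinsic API) that assumes out-of-range counts zero the lane. It compares the 32-bit result with the same leading eight operands treated as 64-bit lanes, as in the `psrlv_q` variant of this test.

```python
def psrlv_ref(values, counts, bits):
    """Hypothetical model of a per-lane logical right shift on `bits`-wide lanes.

    A count of `bits` or more (read as unsigned) zeroes the lane.
    """
    mask = (1 << bits) - 1
    out = []
    for v, c in zip(values, counts):
        v &= mask  # reinterpret the lane as unsigned
        c &= mask  # reinterpret the count as unsigned
        out.append(v >> c if c < bits else 0)
    return out

# First call of test_x86_avx512_psrlv_d_512_const (32-bit lanes).
vals_d = [2, 9, 0, -1, 3, 7, -1, 0, 4, 5, -2, 0, 5, 3, -3, 0]
cnts_d = [1, 0, 33, -1, 2, 0, 34, -2, 3, 0, 35, -1, 4, 0, 36, -3]
res_d = psrlv_ref(vals_d, cnts_d, 32)

# The same leading eight lanes, treated as 64-bit lanes.
res_q = psrlv_ref(vals_d[:8], cnts_d[:8], 64)

print(res_d)
print(res_q)
```

Lane 6 is the interesting one: with 32-bit lanes the count 34 is out of range and the lane becomes 0, while with 64-bit lanes the same count is legal and the all-ones input shifts down to a nonzero value.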

	define <16 x i32> @test_x86_avx512_mask_psrlv_d_512(<16 x i32> %a0, <16 x i32> %a1, <16 x i32> %a2, i16 %mask) {			define <16 x i32> @test_x86_avx512_mask_psrlv_d_512(<16 x i32> %a0, <16 x i32> %a1, <16 x i32> %a2, i16 %mask) {
	; CHECK-LABEL: test_x86_avx512_mask_psrlv_d_512:			; CHECK-LABEL: test_x86_avx512_mask_psrlv_d_512:
	; CHECK: ## %bb.0:			; CHECK: ## %bb.0:
	; CHECK-NEXT: kmovw %edi, %k1			; CHECK-NEXT: kmovw %edi, %k1
	; CHECK-NEXT: vpsrlvd %zmm1, %zmm0, %zmm2 {%k1}			; CHECK-NEXT: vpsrlvd %zmm1, %zmm0, %zmm2 {%k1}
	; CHECK-NEXT: vmovdqa64 %zmm2, %zmm0			; CHECK-NEXT: vmovdqa64 %zmm2, %zmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <16 x i32> @llvm.x86.avx512.psrlv.d.512(<16 x i32> %a0, <16 x i32> %a1)			%res = call <16 x i32> @llvm.x86.avx512.psrlv.d.512(<16 x i32> %a0, <16 x i32> %a1)
	[... 20 lines not shown ...]
	; CHECK-LABEL: test_x86_avx512_psrlv_q_512:			; CHECK-LABEL: test_x86_avx512_psrlv_q_512:
	; CHECK: ## %bb.0:			; CHECK: ## %bb.0:
	; CHECK-NEXT: vpsrlvq %zmm1, %zmm0, %zmm0			; CHECK-NEXT: vpsrlvq %zmm1, %zmm0, %zmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <8 x i64> @llvm.x86.avx512.psrlv.q.512(<8 x i64> %a0, <8 x i64> %a1)			%res = call <8 x i64> @llvm.x86.avx512.psrlv.q.512(<8 x i64> %a0, <8 x i64> %a1)
	ret <8 x i64> %res			ret <8 x i64> %res
	}			}

				define <8 x i64> @test_x86_avx512_psrlv_q_512_const(<8 x i64> %a0, <8 x i64> %a1) {
				; CHECK-LABEL: test_x86_avx512_psrlv_q_512_const:
				; CHECK: ## %bb.0:
				; CHECK-NEXT: vmovdqa64 {{LCPI.*}}(%rip), %zmm2 ## zmm2 = [2,9,0,18446744073709551615,3,7,18446744073709551615,0]
				; CHECK-NEXT: vpsrlvq {{LCPI.*}}(%rip), %zmm2, %zmm2
				; CHECK-NEXT: vmovdqa64 {{LCPI.*}}(%rip), %zmm3 ## zmm3 = [4,4,4,4,4,4,4,18446744073709551615]
				; CHECK-NEXT: vpsrlvq {{LCPI.*}}(%rip), %zmm3, %zmm3
				; CHECK: retq
				%res0 = call <8 x i64> @llvm.x86.avx512.psrlv.q.512(<8 x i64> <i64 2, i64 9, i64 0, i64 -1, i64 3, i64 7, i64 -1, i64 0>, <8 x i64> <i64 1, i64 0, i64 33, i64 -1,i64 2, i64 0, i64 34, i64 -2>)
				%res2 = add <8 x i64> %a0, %res0
				%res1 = call <8 x i64> @llvm.x86.avx512.psrlv.q.512(<8 x i64> <i64 4, i64 4, i64 4, i64 4, i64 4, i64 4, i64 4, i64 -1>, <8 x i64> <i64 1, i64 1, i64 1, i64 1, i64 1, i64 1, i64 1, i64 -1>)
				%res3 = add <8 x i64> %a1, %res1
				%res4 = add <8 x i64> %res2, %res3
				ret <8 x i64> %res4
				}

	define <8 x i64> @test_x86_avx512_mask_psrlv_q_512(<8 x i64> %a0, <8 x i64> %a1, <8 x i64> %a2, i8 %mask) {			define <8 x i64> @test_x86_avx512_mask_psrlv_q_512(<8 x i64> %a0, <8 x i64> %a1, <8 x i64> %a2, i8 %mask) {
	; CHECK-LABEL: test_x86_avx512_mask_psrlv_q_512:			; CHECK-LABEL: test_x86_avx512_mask_psrlv_q_512:
	; CHECK: ## %bb.0:			; CHECK: ## %bb.0:
	; CHECK-NEXT: kmovw %edi, %k1			; CHECK-NEXT: kmovw %edi, %k1
	; CHECK-NEXT: vpsrlvq %zmm1, %zmm0, %zmm2 {%k1}			; CHECK-NEXT: vpsrlvq %zmm1, %zmm0, %zmm2 {%k1}
	; CHECK-NEXT: vmovdqa64 %zmm2, %zmm0			; CHECK-NEXT: vmovdqa64 %zmm2, %zmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <8 x i64> @llvm.x86.avx512.psrlv.q.512(<8 x i64> %a0, <8 x i64> %a1)			%res = call <8 x i64> @llvm.x86.avx512.psrlv.q.512(<8 x i64> %a0, <8 x i64> %a1)
	[... 66 lines not shown ...]

test/CodeGen/X86/avx512bw-intrinsics.ll

[... 1,147 lines not shown ...]
; CHECK-NEXT: vpaddq %zmm0, %zmm1, %zmm0 # encoding: [0x62,0xf1,0xf5,0x48,0xd4,0xc0]		; CHECK-NEXT: vpaddq %zmm0, %zmm1, %zmm0 # encoding: [0x62,0xf1,0xf5,0x48,0xd4,0xc0]
; CHECK-NEXT: ret{{[l\|q]}} # encoding: [0xc3]		; CHECK-NEXT: ret{{[l\|q]}} # encoding: [0xc3]
%res = call <8 x i64> @llvm.x86.avx512.psad.bw.512(<64 x i8> %x0, <64 x i8> %x1)		%res = call <8 x i64> @llvm.x86.avx512.psad.bw.512(<64 x i8> %x0, <64 x i8> %x1)
%res1 = call <8 x i64> @llvm.x86.avx512.psad.bw.512(<64 x i8> %x0, <64 x i8> %x2)		%res1 = call <8 x i64> @llvm.x86.avx512.psad.bw.512(<64 x i8> %x0, <64 x i8> %x2)
%res2 = add <8 x i64> %res, %res1		%res2 = add <8 x i64> %res, %res1
ret <8 x i64> %res2		ret <8 x i64> %res2
}		}

		declare <32 x i16> @llvm.x86.avx512.psrlv.w.512(<32 x i16>, <32 x i16>) nounwind readnone

		define <32 x i16> @test_x86_avx512_psrlv_w_512_const(<32 x i16> %x0, <32 x i16> %x1) optsize {
		; X86-LABEL: test_x86_avx512_psrlv_w_512_const:
		; X86: # %bb.0:
		; X86-NEXT: vmovdqa64 {{\.LCPI.*}}, %zmm0 # zmm0 = [4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,65535]
		; X86-NEXT: # encoding: [0x62,0xf1,0xfd,0x48,0x6f,0x05,A,A,A,A]
		; X86-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: FK_Data_4
		; X86-NEXT: vpsrlvw {{\.LCPI.*}}, %zmm0, %zmm0 # encoding: [0x62,0xf2,0xfd,0x48,0x10,0x05,A,A,A,A]
		; X86-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: FK_Data_4
		; X86-NEXT: retl # encoding: [0xc3]

		; X64-LABEL: test_x86_avx512_psrlv_w_512_const:
		; X64: # %bb.0:
		; X64-NEXT: vmovdqa64 {{\.LCPI.*}}(%rip), %zmm0 # zmm0 = [4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,65535]
		; X64-NEXT: # encoding: [0x62,0xf1,0xfd,0x48,0x6f,0x05,A,A,A,A]
		; X64-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: reloc_riprel_4byte
		; X64-NEXT: vpsrlvw {{\.LCPI.*}}(%rip), %zmm0, %zmm0 # encoding: [0x62,0xf2,0xfd,0x48,0x10,0x05,A,A,A,A]
		; X64-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: reloc_riprel_4byte
		; X64-NEXT: retq # encoding: [0xc3]

		%res1 = call <32 x i16> @llvm.x86.avx512.psrlv.w.512(
		<32 x i16> <i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 -1>,
		<32 x i16> <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 -1>)
		ret <32 x i16> %res1
		}
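
Note on why this operand pattern was chosen: the summary describes a hazard where a generic shift folds out-of-range lanes to undef, splat detection ignores undef lanes, and the result is materialized as a broadcast. The sketch below is a toy model of that sequence only. `UNDEF`, `fold_generic_srl`, and `splat_of` are hypothetical names and do not correspond to actual SelectionDAG code.

```python
UNDEF = object()  # stand-in for an undef lane; purely illustrative

def fold_generic_srl(values, counts, bits):
    """Fold lane by lane the way a generic shift may: out-of-range counts give undef."""
    mask = (1 << bits) - 1
    return [(v & mask) >> (c & mask) if (c & mask) < bits else UNDEF
            for v, c in zip(values, counts)]

def splat_of(lanes):
    """Return the single value shared by all defined lanes, or None if there is none."""
    defined = {x for x in lanes if x is not UNDEF}
    return next(iter(defined)) if len(defined) == 1 else None

# Operands copied from test_x86_avx512_psrlv_w_512_const (16-bit lanes).
values = [4] * 31 + [-1]
counts = [1] * 31 + [-1]

folded = fold_generic_srl(values, counts, 16)
broadcast = [splat_of(folded)] * 32  # what a splat-based fold would materialize
expected = [2] * 31 + [0]            # what the intrinsic is defined to return

print(broadcast == expected)
```

Under these assumptions the broadcast puts 2 in the last lane where the intrinsic requires 0. The CHECK lines above guard against that by expecting an actual `vpsrlvw` on constant-pool operands instead of a folded constant.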

declare <32 x i16> @llvm.x86.avx512.mask.psrlv32hi(<32 x i16>, <32 x i16>, <32 x i16>, i32)		declare <32 x i16> @llvm.x86.avx512.mask.psrlv32hi(<32 x i16>, <32 x i16>, <32 x i16>, i32)

define <32 x i16>@test_int_x86_avx512_mask_psrlv32hi(<32 x i16> %x0, <32 x i16> %x1, <32 x i16> %x2, i32 %x3) {		define <32 x i16>@test_int_x86_avx512_mask_psrlv32hi(<32 x i16> %x0, <32 x i16> %x1, <32 x i16> %x2, i32 %x3) {
; X86-LABEL: test_int_x86_avx512_mask_psrlv32hi:		; X86-LABEL: test_int_x86_avx512_mask_psrlv32hi:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: vpsrlvw %zmm1, %zmm0, %zmm3 # encoding: [0x62,0xf2,0xfd,0x48,0x10,0xd9]		; X86-NEXT: vpsrlvw %zmm1, %zmm0, %zmm3 # encoding: [0x62,0xf2,0xfd,0x48,0x10,0xd9]
; X86-NEXT: kmovd {{[0-9]+}}(%esp), %k1 # encoding: [0xc4,0xe1,0xf9,0x90,0x4c,0x24,0x04]		; X86-NEXT: kmovd {{[0-9]+}}(%esp), %k1 # encoding: [0xc4,0xe1,0xf9,0x90,0x4c,0x24,0x04]
; X86-NEXT: vpsrlvw %zmm1, %zmm0, %zmm2 {%k1} # encoding: [0x62,0xf2,0xfd,0x49,0x10,0xd1]		; X86-NEXT: vpsrlvw %zmm1, %zmm0, %zmm2 {%k1} # encoding: [0x62,0xf2,0xfd,0x49,0x10,0xd1]
[... 179 lines not shown ...]
; X64-NEXT: retq # encoding: [0xc3]
%res = call <32 x i16> @llvm.x86.avx512.psll.w.512(<32 x i16> %a0, <8 x i16> %a1) ; <<32 x i16>> [#uses=1]		%res = call <32 x i16> @llvm.x86.avx512.psll.w.512(<32 x i16> %a0, <8 x i16> %a1) ; <<32 x i16>> [#uses=1]
%mask.cast = bitcast i32 %mask to <32 x i1>		%mask.cast = bitcast i32 %mask to <32 x i1>
%res2 = select <32 x i1> %mask.cast, <32 x i16> %res, <32 x i16> zeroinitializer		%res2 = select <32 x i1> %mask.cast, <32 x i16> %res, <32 x i16> zeroinitializer
ret <32 x i16> %res2		ret <32 x i16> %res2
}		}
declare <32 x i16> @llvm.x86.avx512.psll.w.512(<32 x i16>, <8 x i16>) nounwind readnone		declare <32 x i16> @llvm.x86.avx512.psll.w.512(<32 x i16>, <8 x i16>) nounwind readnone


		define <32 x i16> @test_x86_avx512_psllv_w_512_const(<32 x i16> %x0, <32 x i16> %x1) optsize {
		; X86-LABEL: test_x86_avx512_psllv_w_512_const:
		; X86: # %bb.0:
		; X86-NEXT: vmovdqa64 {{\.LCPI.*}}, %zmm0 # zmm0 = [4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,65535]
		; X86-NEXT: # encoding: [0x62,0xf1,0xfd,0x48,0x6f,0x05,A,A,A,A]
		; X86-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: FK_Data_4
		; X86-NEXT: vpsllvw {{\.LCPI.*}}, %zmm0, %zmm0 # encoding: [0x62,0xf2,0xfd,0x48,0x12,0x05,A,A,A,A]
		; X86-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: FK_Data_4
		; X86-NEXT: retl # encoding: [0xc3]

		; X64-LABEL: test_x86_avx512_psllv_w_512_const:
		; X64: # %bb.0:
		; X64-NEXT: vmovdqa64 {{\.LCPI.*}}(%rip), %zmm0 # zmm0 = [4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,65535]
		; X64-NEXT: # encoding: [0x62,0xf1,0xfd,0x48,0x6f,0x05,A,A,A,A]
		; X64-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: reloc_riprel_4byte
		; X64-NEXT: vpsllvw {{\.LCPI.*}}(%rip), %zmm0, %zmm0 # encoding: [0x62,0xf2,0xfd,0x48,0x12,0x05,A,A,A,A]
		; X64-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: reloc_riprel_4byte
		; X64-NEXT: retq # encoding: [0xc3]

		%res1 = call <32 x i16> @llvm.x86.avx512.psllv.w.512(
		<32 x i16> <i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 -1>,
		<32 x i16> <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 -1>)
		ret <32 x i16> %res1
		}
		declare <32 x i16> @llvm.x86.avx512.psllv.w.512(<32 x i16>, <32 x i16>) nounwind readnone

define <32 x i16> @test_x86_avx512_pslli_w_512(<32 x i16> %a0) {		define <32 x i16> @test_x86_avx512_pslli_w_512(<32 x i16> %a0) {
; CHECK-LABEL: test_x86_avx512_pslli_w_512:		; CHECK-LABEL: test_x86_avx512_pslli_w_512:
; CHECK: # %bb.0:		; CHECK: # %bb.0:
; CHECK-NEXT: vpsllw $7, %zmm0, %zmm0 # encoding: [0x62,0xf1,0x7d,0x48,0x71,0xf0,0x07]		; CHECK-NEXT: vpsllw $7, %zmm0, %zmm0 # encoding: [0x62,0xf1,0x7d,0x48,0x71,0xf0,0x07]
; CHECK-NEXT: ret{{[l\|q]}} # encoding: [0xc3]		; CHECK-NEXT: ret{{[l\|q]}} # encoding: [0xc3]
%res = call <32 x i16> @llvm.x86.avx512.pslli.w.512(<32 x i16> %a0, i32 7) ; <<32 x i16>> [#uses=1]		%res = call <32 x i16> @llvm.x86.avx512.pslli.w.512(<32 x i16> %a0, i32 7) ; <<32 x i16>> [#uses=1]
ret <32 x i16> %res		ret <32 x i16> %res
}		}
[... 240 lines not shown ...]

test/CodeGen/X86/avx512bwvl-intrinsics.ll

[... 2,008 lines not shown ...]
; X64-NEXT: retq # encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}


		define <8 x i16> @test_int_x86_avx512_psrlv_w_128_const(<8 x i16> %x0, <8 x i16> %x1) optsize {
		; X86-LABEL: test_int_x86_avx512_psrlv_w_128_const:
		; X86: # %bb.0:
		; X86-NEXT: vmovdqa {{\.LCPI.*}}, %xmm0 # EVEX TO VEX Compression xmm0 = [4,4,4,4,4,4,4,65535]
		; X86-NEXT: # encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
		; X86-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}, kind: FK_Data_4
		; X86-NEXT: vpsrlvw {{\.LCPI.*}}, %xmm0, %xmm0 # encoding: [0x62,0xf2,0xfd,0x08,0x10,0x05,A,A,A,A]
		; X86-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: FK_Data_4
		; X86-NEXT: retl # encoding: [0xc3]

		; X64-LABEL: test_int_x86_avx512_psrlv_w_128_const:
		; X64: # %bb.0:
		; X64-NEXT: vmovdqa {{\.LCPI.*}}(%rip), %xmm0 # EVEX TO VEX Compression xmm0 = [4,4,4,4,4,4,4,65535]
		; X64-NEXT: # encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
		; X64-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}, kind: reloc_riprel_4byte
		; X64-NEXT: vpsrlvw {{\.LCPI.*}}(%rip), %xmm0, %xmm0 # encoding: [0x62,0xf2,0xfd,0x08,0x10,0x05,A,A,A,A]
		; X64-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: reloc_riprel_4byte
		; X64-NEXT: retq # encoding: [0xc3]
		%res = call <8 x i16> @llvm.x86.avx512.psrlv.w.128(
		<8 x i16> <i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 -1>,
		<8 x i16> <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 -1>)
		ret <8 x i16> %res
		}

		declare <8 x i16> @llvm.x86.avx512.psrlv.w.128(<8 x i16>, <8 x i16>)

		define <16 x i16> @test_int_x86_avx512_psrlv_w_256_const(<16 x i16> %x0, <16 x i16> %x1) optsize {
		; X86-LABEL: test_int_x86_avx512_psrlv_w_256_const:
		; X86: # %bb.0:
		; X86-NEXT: vmovdqa {{\.LCPI.*}}, %ymm0 # EVEX TO VEX Compression ymm0 = [4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,65535]
		; X86-NEXT: # encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
		; X86-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}, kind: FK_Data_4
		; X86-NEXT: vpsrlvw {{\.LCPI.*}}, %ymm0, %ymm0 # encoding: [0x62,0xf2,0xfd,0x28,0x10,0x05,A,A,A,A]
		; X86-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: FK_Data_4
		; X86-NEXT: retl # encoding: [0xc3]

		; X64-LABEL: test_int_x86_avx512_psrlv_w_256_const:
		; X64: # %bb.0:
		; X64-NEXT: vmovdqa {{\.LCPI.*}}(%rip), %ymm0 # EVEX TO VEX Compression ymm0 = [4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,65535]
		; X64-NEXT: # encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
		; X64-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}, kind: reloc_riprel_4byte
		; X64-NEXT: vpsrlvw {{\.LCPI.*}}(%rip), %ymm0, %ymm0 # encoding: [0x62,0xf2,0xfd,0x28,0x10,0x05,A,A,A,A]
		; X64-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: reloc_riprel_4byte
		; X64-NEXT: retq # encoding: [0xc3]
		%res = call <16 x i16> @llvm.x86.avx512.psrlv.w.256(
		<16 x i16> <i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 -1>,
		<16 x i16> <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 -1>)
		ret <16 x i16> %res
		}

		declare <16 x i16> @llvm.x86.avx512.psrlv.w.256(<16 x i16>, <16 x i16>)

declare <16 x i16> @llvm.x86.avx512.mask.psrav16.hi(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.psrav16.hi(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_psrav16_hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_psrav16_hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; X86-LABEL: test_int_x86_avx512_mask_psrav16_hi:		; X86-LABEL: test_int_x86_avx512_mask_psrav16_hi:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: vpsravw %ymm1, %ymm0, %ymm3 # encoding: [0x62,0xf2,0xfd,0x28,0x11,0xd9]		; X86-NEXT: vpsravw %ymm1, %ymm0, %ymm3 # encoding: [0x62,0xf2,0xfd,0x28,0x11,0xd9]
; X86-NEXT: kmovw {{[0-9]+}}(%esp), %k1 # encoding: [0xc5,0xf8,0x90,0x4c,0x24,0x04]		; X86-NEXT: kmovw {{[0-9]+}}(%esp), %k1 # encoding: [0xc5,0xf8,0x90,0x4c,0x24,0x04]
; X86-NEXT: vpsravw %ymm1, %ymm0, %ymm2 {%k1} # encoding: [0x62,0xf2,0xfd,0x29,0x11,0xd1]		; X86-NEXT: vpsravw %ymm1, %ymm0, %ymm2 {%k1} # encoding: [0x62,0xf2,0xfd,0x29,0x11,0xd1]
[... 106 lines not shown ...]
; X64-NEXT: retq # encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

		define <8 x i16> @test_int_x86_avx512_psllv_w_128_const(<8 x i16> %x0, <8 x i16> %x1) optsize {
		; X86-LABEL: test_int_x86_avx512_psllv_w_128_const:
		; X86: # %bb.0:
		; X86-NEXT: vmovdqa {{\.LCPI.*}}, %xmm0 # EVEX TO VEX Compression xmm0 = [4,4,4,4,4,4,4,65535]
		; X86-NEXT: # encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
		; X86-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}, kind: FK_Data_4
		; X86-NEXT: vpsllvw {{\.LCPI.*}}, %xmm0, %xmm0 # encoding: [0x62,0xf2,0xfd,0x08,0x12,0x05,A,A,A,A]
		; X86-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: FK_Data_4
		; X86-NEXT: retl # encoding: [0xc3]

		; X64-LABEL: test_int_x86_avx512_psllv_w_128_const:
		; X64: # %bb.0:
		; X64-NEXT: vmovdqa {{\.LCPI.*}}(%rip), %xmm0 # EVEX TO VEX Compression xmm0 = [4,4,4,4,4,4,4,65535]
		; X64-NEXT: # encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
		; X64-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}, kind: reloc_riprel_4byte
		; X64-NEXT: vpsllvw {{\.LCPI.*}}(%rip), %xmm0, %xmm0 # encoding: [0x62,0xf2,0xfd,0x08,0x12,0x05,A,A,A,A]
		; X64-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: reloc_riprel_4byte
		; X64-NEXT: retq # encoding: [0xc3]
		%res = call <8 x i16> @llvm.x86.avx512.psllv.w.128(
		<8 x i16> <i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 -1>,
		<8 x i16> <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 -1>)
		ret <8 x i16> %res
		}

		declare <8 x i16> @llvm.x86.avx512.psllv.w.128(<8 x i16>, <8 x i16>)


		define <16 x i16> @test_int_x86_avx512_psllv_w_256_const(<16 x i16> %x0, <16 x i16> %x1) optsize {
		; X86-LABEL: test_int_x86_avx512_psllv_w_256_const:
		; X86: # %bb.0:
		; X86-NEXT: vmovdqa {{\.LCPI.*}}, %ymm0 # EVEX TO VEX Compression ymm0 = [4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,65535]
		; X86-NEXT: # encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
		; X86-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}, kind: FK_Data_4
		; X86-NEXT: vpsllvw {{\.LCPI.*}}, %ymm0, %ymm0 # encoding: [0x62,0xf2,0xfd,0x28,0x12,0x05,A,A,A,A]
		; X86-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: FK_Data_4
		; X86-NEXT: retl # encoding: [0xc3]

		; X64-LABEL: test_int_x86_avx512_psllv_w_256_const:
		; X64: # %bb.0:
		; X64-NEXT: vmovdqa {{\.LCPI.*}}(%rip), %ymm0 # EVEX TO VEX Compression ymm0 = [4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,65535]
		; X64-NEXT: # encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
		; X64-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}, kind: reloc_riprel_4byte
		; X64-NEXT: vpsllvw {{\.LCPI.*}}(%rip), %ymm0, %ymm0 # encoding: [0x62,0xf2,0xfd,0x28,0x12,0x05,A,A,A,A]
		; X64-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: reloc_riprel_4byte
		; X64-NEXT: retq # encoding: [0xc3]
		%res = call <16 x i16> @llvm.x86.avx512.psllv.w.256(
		<16 x i16> <i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 4, i16 -1>,
		<16 x i16> <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 -1>)
		ret <16 x i16> %res
		}

		declare <16 x i16> @llvm.x86.avx512.psllv.w.256(<16 x i16>, <16 x i16>)



declare <8 x i16> @llvm.x86.avx512.permvar.hi.128(<8 x i16>, <8 x i16>)		declare <8 x i16> @llvm.x86.avx512.permvar.hi.128(<8 x i16>, <8 x i16>)

define <8 x i16>@test_int_x86_avx512_mask_permvar_hi_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_permvar_hi_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; X86-LABEL: test_int_x86_avx512_mask_permvar_hi_128:		; X86-LABEL: test_int_x86_avx512_mask_permvar_hi_128:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: vpermw %xmm0, %xmm1, %xmm3 # encoding: [0x62,0xf2,0xf5,0x08,0x8d,0xd8]		; X86-NEXT: vpermw %xmm0, %xmm1, %xmm3 # encoding: [0x62,0xf2,0xf5,0x08,0x8d,0xd8]
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax # encoding: [0x0f,0xb6,0x44,0x24,0x04]		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax # encoding: [0x0f,0xb6,0x44,0x24,0x04]
; X86-NEXT: kmovd %eax, %k1 # encoding: [0xc5,0xfb,0x92,0xc8]		; X86-NEXT: kmovd %eax, %k1 # encoding: [0xc5,0xfb,0x92,0xc8]
[... 61 lines not shown ...]

Revision Contents

Diff 181696

lib/Target/X86/X86ISelLowering.h

lib/Target/X86/X86ISelLowering.cpp

lib/Target/X86/X86InstrAVX512.td

lib/Target/X86/X86InstrFragmentsSIMD.td

lib/Target/X86/X86InstrSSE.td

lib/Target/X86/X86IntrinsicsInfo.h

test/CodeGen/X86/avx2-intrinsics-x86.ll

test/CodeGen/X86/avx512-intrinsics.ll

test/CodeGen/X86/avx512bw-intrinsics.ll

test/CodeGen/X86/avx512bwvl-intrinsics.ll
