This is an archive of the discontinued LLVM Phabricator instance.

[X86][[AVX512] Code size reduction in X86 by replacing EVEX with VEX encoding
ClosedPublic

Authored by gadi.haber on Dec 18 2016, 7:08 AM.

Details

Reviewers

zvi
delena
craig.topper
igorb
hfinkel

Commits

rG19c4fc5e6290: This is a large patch for X86 AVX-512 of an optimization for reducing code size…
rL290663: This is a large patch for X86 AVX-512 of an optimization for reducing code size…

Summary

This is a large patch for X86 AVX-512 of an optimization for reducing code size by encoding EVEX AVX-512 instructions using the shorter VEX encoding when possible.

Some AVX-512 instructions have two possible encodings: those that use only vector registers with indexes 0 - 15 and use neither the zmm registers nor the mask k registers.
The EVEX prefix always takes 4 bytes, whereas the VEX prefix takes 2 or 3 bytes. Consequently, using the VEX encoding for these instructions reduces code size by ~2 bytes per instruction, even when the code is compiled with the AVX-512 features enabled.

For example, “vmovss %xmm0, 32(%rsp,%rax,4)” has the following two possible encodings:

EVEX encoding (8 bytes long):

62 f1 7e 08 11 44 84 08         vmovss  %xmm0, 32(%rsp,%rax,4)

VEX encoding (6 bytes long):

c5 fa 11 44 84 20                     vmovss  %xmm0, 32(%rsp,%rax,4)
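
The byte counts in this example can be checked by hand. The following is a minimal sketch, assuming a simplified instruction layout of prefix, one opcode byte, ModR/M, SIB, and an 8-bit displacement; the names `sizeWithPrefix` and `compressDisp8` are hypothetical and not part of the patch. It also shows why the last byte differs (08 vs. 20): EVEX stores an 8-bit displacement divided by the memory operand size N (4 bytes for vmovss), while VEX stores it unscaled.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model (an assumption for illustration): the instruction consists
// of a prefix, one opcode byte, ModR/M, SIB, and an 8-bit displacement.
struct EncodedSize {
  unsigned PrefixBytes;
  unsigned TotalBytes;
};

EncodedSize sizeWithPrefix(unsigned PrefixBytes) {
  // prefix + opcode + ModR/M + SIB + disp8
  EncodedSize S = {PrefixBytes, PrefixBytes + 1 + 1 + 1 + 1};
  return S;
}

// EVEX scales an 8-bit displacement by the memory operand size N (disp8*N).
// Returns true and writes the stored byte if Disp fits, false otherwise.
bool compressDisp8(int32_t Disp, int32_t N, int8_t &Out) {
  if (N <= 0 || Disp % N != 0)
    return false;
  int32_t Scaled = Disp / N;
  if (Scaled < -128 || Scaled > 127)
    return false;
  Out = static_cast<int8_t>(Scaled);
  return true;
}
```

With a 4-byte EVEX prefix the model gives 8 bytes, and with the 2-byte VEX prefix (c5) it gives 6 bytes, matching the listing above.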

Reported Bugzilla bugs related to this patch:
https://llvm.org/bugs/show_bug.cgi?id=23376
https://llvm.org/bugs/show_bug.cgi?id=29162

This patch adds a new pass at the pre-emit stage, created by createX86EvexToVexInsts(), which uses a table of all EVEX opcodes that can be encoded via VEX.
The table is placed in a separate header file (X86InstrTablesInfo.h) next to the pass source file (X86EvexToVex.cpp) under lib/Target/X86.

The patch requires many modifications to the CodeGen/X86 unit tests because a new string, "EVEX TO VEX Compression ", was added to the encoding string of each optimized instruction. The string is printed whenever the llc --show-mc-encoding flag is applied.
Finally, an additional MIR test file, "evex-to-vex-compress.mir", was added to the CodeGen/X86 unit tests. It contains all the EVEX instructions that are handled by this optimization.

Diff Detail

Repository: rL LLVM

Event Timeline

gadi.haber updated this revision to Diff 81882. Dec 18 2016, 7:08 AM

gadi.haber retitled this revision from to [X86][[AVX512] Code size reduction in X86 by replacing EVEX with VEX encoding.

gadi.haber updated this object.

gadi.haber added reviewers: zvi, delena, craig.topper, igorb.

gadi.haber set the repository for this revision to rL LLVM.

gadi.haber added a subscriber: llvm-commits.

Herald added a subscriber: mgorny. Dec 18 2016, 7:09 AM

An RFC about the patch was posted to the llvm-dev Google group:
https://groups.google.com/forum/#!msg/llvm-dev/rvxTO1Sr3T0/Ct5z_lYXAwAJ;context-place=msg/llvm-dev/bC2DmMgBwJE/irxxyhJ_DgAJ

gadi.haber added a reviewer: hfinkel. Dec 18 2016, 8:00 AM

craig.topper added inline comments. Dec 18 2016, 11:05 AM

lib/Target/X86/InstPrinter/X86InstComments.h
20 ↗	(On Diff #81882)	The other bit (bit 0) used for AsmComments is target independent. How do we keep someone from adding a new target independent bit 1 and conflicting with this usage? We should have some documentation and probably an enum constant in MachineInstr.h that indicates which bits can be used for target dependent things.
lib/Target/X86/X86EvexToVex.cpp
54 ↗	(On Diff #81882)	The second line is indented too far.
125 ↗	(On Diff #81882)	This line should be indented further to line up arguments.
151 ↗	(On Diff #81882)	This should be ((Desc.TSFlags & X86II::EncodingMask) == X86II::EVEX). X86II::EVEX is not a bit mask; it's a value for the encoding field.
159 ↗	(On Diff #81882)	Variable names should start with a capital letter.
162 ↗	(On Diff #81882)	Comments in LLVM are generally written as sentences starting with a capitalized first word and ending with a period.
168 ↗	(On Diff #81882)	What do we get out of having two different DenseMaps? Couldn't we have 128 and 256 in the same map?
206 ↗	(On Diff #81882)	Can we use TargetRegisterInfo::getRegEncoding() to simplify this?
224 ↗	(On Diff #81882)	Remove space before parentheses
lib/Target/X86/X86MCInstLower.cpp
1294 ↗	(On Diff #81882)	EVE should be EVEX
lib/Target/X86/X86TargetMachine.cpp
403 ↗	(On Diff #81882)	Fix the indentation here.
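
The comment on line 151 turns on the difference between a single-bit flag and a multi-bit field. A minimal sketch of that difference follows, assuming a hypothetical 2-bit encoding field; the shift and the numeric values are assumptions for illustration and are not taken from LLVM's X86II definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: a 2-bit encoding field inside a wider flags word.
// The shift and values are assumptions, not LLVM's actual constants.
enum : uint64_t {
  EncodingShift = 4,
  EncodingMask = 0x3ULL << EncodingShift,
  VEX = 1ULL << EncodingShift,
  XOP = 2ULL << EncodingShift,
  EVEX = 3ULL << EncodingShift,
};

// Incorrect: treats the EVEX field value as a bit mask, so any flags word
// that shares a bit with it (here VEX and XOP) also matches.
bool isEvexBuggy(uint64_t TSFlags) { return (TSFlags & EVEX) != 0; }

// Correct: isolate the field first, then compare against the EVEX value.
bool isEvex(uint64_t TSFlags) { return (TSFlags & EncodingMask) == EVEX; }
```

In this sketch `isEvexBuggy(VEX)` returns true while `isEvex(VEX)` returns false, which is the kind of misclassification the review comment warns about.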

zvi added inline comments. Dec 19 2016, 7:50 AM

lib/MC/MCAsmStreamer.cpp
303 ↗	(On Diff #81882)	Have you considered using GetCommentOS() or EmitRawComment() before modifying this method?
308 ↗	(On Diff #81882)	The 'by default' comment should be placed where EOL is initialized by default to true?
lib/Target/X86/X86EvexToVex.cpp
1 ↗	(On Diff #81882)	Can you please run clang-format on this file? it will fix some of Craig's comments about indentation and perhaps a few more glitches.
105 ↗	(On Diff #81882)	What if MF is empty?
108 ↗	(On Diff #81882)	Please remove this DEBUG() macro and the one in line 118 as they don't contribute much.
112 ↗	(On Diff #81882)	Please consider: for (MachineBasicBlock &MBB : F) rc = ....
130 ↗	(On Diff #81882)	*encoding
142 ↗	(On Diff #81882)	There is a convention in other passes to name this variable 'Changed'
159 ↗	(On Diff #81882)	Consider dropping this variable since it's used only in the following statement.
198 ↗	(On Diff #81882)	consider this: for (const MachineOperand &MO : I->operands()) {
222 ↗	(On Diff #81882)	if you change the loop header in line 145 to the following you won't need this extra variable: for (MachineInstr &MI : MBB) {
test/CodeGen/X86/evex-to-vex-compress.mir
17 ↗	(On Diff #81882)	It's hard to follow this enormous test. Can you please put the '# CHECK:' line above and adjacent to its corresponding input instruction?

gadi.haber marked 7 inline comments as done. Dec 19 2016, 7:55 AM

gadi.haber added inline comments.

lib/Target/X86/InstPrinter/X86InstComments.h
20 ↗	(On Diff #81882)	I'm open to suggestions on this. One option is to simply add the following comment in MachineInstr.h: enum CommentFlag { ReloadReuse = 0x1 // higher bits are reserved for target dep comments. };
lib/Target/X86/X86EvexToVex.cpp
151 ↗	(On Diff #81882)	Good catch. Thanx! You probably meant: (Desc.TSFlags & X86II::EncodingMask) != X86II::EVEX
168 ↗	(On Diff #81882)	The smaller the DenseMap table is, the faster the find() method returns a value. It is therefore recommended to split the DenseMap table into smaller tables whenever possible.
206 ↗	(On Diff #81882)	I can simplify the code by defining a lambda function called isAVX512RegOnly(unsigned Reg) and using it to check whether the Reg operand is between X86::ZMM0 and X86::ZMM31, between X86::YMM16 and X86::YMM31, or between X86::XMM16 and X86::XMM31, as follows: auto isAVX512RegOnly = [](unsigned Reg) { if (Reg >= X86::ZMM0 && Reg <= X86::ZMM31) return true; if (Reg >= X86::XMM16 && Reg <= X86::XMM31) return true; if (Reg >= X86::YMM16 && Reg <= X86::YMM31) return true; return false; };

craig.topper added inline comments. Dec 19 2016, 9:37 PM

lib/Target/X86/X86EvexToVex.cpp
206 ↗	(On Diff #81882)	Is it even possible to get a VR512 on any of the instructions in the table? The register class is tightly bound to the instruction. For XMM/YMM, getRegEncoding would return 0-31 and you can just check that.

delena added inline comments. Dec 19 2016, 11:10 PM

lib/Target/X86/X86EvexToVex.cpp
206 ↗	(On Diff #81882)	I can't find any getRegEncoding() method, do you mean to write a new one?

craig.topper added inline comments. Dec 19 2016, 11:24 PM

lib/Target/X86/X86EvexToVex.cpp
206 ↗	(On Diff #81882)	Sorry, it's getEncodingValue().
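
The suggestion in this thread is to test the hardware encoding of a register rather than compare against ranges of register enum values. A minimal sketch of the idea follows. `mockEncodingValue` is a hypothetical stand-in for an encoding query such as getEncodingValue(), and the mock register numbering is an assumption made only so the example is self-contained.

```cpp
#include <cassert>

// Mock register numbering (an assumption for this sketch): 32 XMM registers
// followed by 32 YMM registers. Real LLVM register enums are generated.
enum MockReg : unsigned {
  XMM0 = 0, XMM15 = 15, XMM16 = 16, XMM31 = 31,
  YMM0 = 32, YMM15 = 47, YMM16 = 48, YMM31 = 63
};

// Hypothetical stand-in for a hardware-encoding query: XMMn and YMMn both
// encode as n in this mock numbering.
unsigned mockEncodingValue(unsigned Reg) { return Reg % 32; }

// VEX can only address vector registers 0-15, so encodings 16-31 need EVEX.
bool needsEvexForReg(unsigned Reg) { return mockEncodingValue(Reg) >= 16; }
```

A single comparison on the encoding then covers both the XMM16-XMM31 and the YMM16-YMM31 ranges.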

gadi.haber marked 6 inline comments as done. Dec 19 2016, 11:42 PM

gadi.haber added inline comments.

lib/MC/MCAsmStreamer.cpp
303 ↗	(On Diff #81882)	Yes, but they did not generate a concatenated comment.
lib/Target/X86/X86EvexToVex.cpp
130 ↗	(On Diff #81882)	Did you mean removing the comma that follows encoding?

craig.topper added inline comments. Dec 20 2016, 12:08 AM

lib/Target/X86/X86EvexToVex.cpp
47 ↗	(On Diff #81882)	This also says "encodig"
130 ↗	(On Diff #81882)	The first line of the comment says "encodig" rather than "encoding"

gadi.haber marked 2 inline comments as done. Dec 20 2016, 12:47 AM

gadi.haber marked 2 inline comments as done. Dec 20 2016, 1:03 AM

gadi.haber added inline comments.

lib/Target/X86/X86EvexToVex.cpp
105 ↗	(On Diff #81882)	Added a check: if (!MF) return false;
206 ↗	(On Diff #81882)	True. There shouldn't be any VR512 registers in the EVEX to VEX tables. This is an additional safety check.

Updated diff following review comments by Craig and Zvi.

zvi added inline comments. Dec 21 2016, 12:57 AM

lib/Target/X86/X86EvexToVex.cpp
100 ↗	(On Diff #82128)	Maybe bail-out early if subtarget does not support avx512? const X86Subtarget &ST = MF.getSubtarget<X86Subtarget>(); if (!ST.hasAVX512()) return false;
133 ↗	(On Diff #82128)	Please remove the comment
141 ↗	(On Diff #82128)	I think it would be helpful if the comment said that in these cases the transformation is not possible because EVEX is needed to carry this information.
199 ↗	(On Diff #82128)	An improvement may be to iterate only over MI.explicit_operands()

gadi.haber marked 3 inline comments as done. Dec 21 2016, 3:15 AM

gadi.haber added inline comments.

lib/Target/X86/X86EvexToVex.cpp
141 ↗	(On Diff #82128)	Updated comment: // Check for EVEX instructions with mask or broadcast as in these cases // the EVEX prefix is needed in order to carry this information // thus preventing the transformation to VEX encoding.

craig.topper added inline comments. Dec 21 2016, 9:14 AM

lib/Target/X86/X86EvexToVex.cpp
196 ↗	(On Diff #82128)	I think this comment should say "16" instead of "6".
206 ↗	(On Diff #81882)	If it shouldn't happen, shouldn't it just be an assert?

gadi.haber marked 4 inline comments as done. Dec 21 2016, 12:22 PM

gadi.haber added inline comments.

lib/Target/X86/X86EvexToVex.cpp
196 ↗	(On Diff #82128)	Good catch! Thanx!
206 ↗	(On Diff #81882)	good point. replaced the check with the following assert: assert (!(Reg >= X86::ZMM0 && Reg <= X86::ZMM31));

Updated diff after additional comments by Craig and Zvi

craig.topper added inline comments. Dec 23 2016, 6:23 PM

lib/Target/X86/X86EvexToVex.cpp
111 ↗	(On Diff #82341)	Don't we need to OR into rc? This just captures the last returned value.

gadi.haber marked an inline comment as done. Dec 24 2016, 11:27 PM

gadi.haber added inline comments.

lib/Target/X86/X86EvexToVex.cpp
111 ↗	(On Diff #82341)	Good catch! Here is the modified code:

bool Changed = false;

  /// Go over all basic blocks in function and replace
  /// EVEX encoded instrs by VEX encoding when possible.
  for (MachineBasicBlock &MBB : MF)
    Changed |= CompressEvexToVexImpl(MBB);

  return Changed;
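
The difference between assigning and OR-ing the per-block result can be shown in isolation. The following is a small sketch in which each vector element stands for the return value of a hypothetical per-block transform; the function names are illustrative only.

```cpp
#include <cassert>
#include <vector>

// Each element stands for one basic block; true means the (hypothetical)
// per-block transform changed something.
bool lastResultOnly(const std::vector<bool> &BlockChanged) {
  bool RC = false;
  for (bool B : BlockChanged)
    RC = B; // Bug: an earlier "true" is overwritten by a later "false".
  return RC;
}

bool accumulated(const std::vector<bool> &BlockChanged) {
  bool Changed = false;
  for (bool B : BlockChanged)
    Changed |= B; // Stays true once any block reports a change.
  return Changed;
}
```

If the first block is modified and the last one is not, `lastResultOnly` reports no change while `accumulated` correctly reports one.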

zvi added inline comments. Dec 25 2016, 12:21 AM

lib/Target/X86/X86EvexToVex.cpp
49 ↗	(On Diff #82341)	Since the transform of an individual instruction is independent of other instructions in the BB, I think it would be better to reduce the scope of this function from a MBB to a MachineInstr. (So the loop on the MBB can be moved to the caller of this function)
219 ↗	(On Diff #82341)	Now that this API takes also target-specific flags, I think that instead of casting, setAsmPrinterFlag(CommentFlag) (and friends) should be changed to setAsmPrinterFlag(uint8_t) or something similar.

gadi.haber marked 2 inline comments as done. Dec 25 2016, 5:50 AM

gadi.haber added inline comments.

lib/Target/X86/X86EvexToVex.cpp
219 ↗	(On Diff #82341)	There is only a single definition of setAsmPrinterFlag in MachineInstr.h and it receives the CommentFlag parameter.

More updates to X86EvexToVex.cpp following comments by Zvi and Craig.

zvi added inline comments. Dec 25 2016, 7:09 AM

lib/Target/X86/X86EvexToVex.cpp
118 ↗	(On Diff #82470)	I think these 3 if's belong in CompressEvexToVexImpl()
165 ↗	(On Diff #82470)	assert here that (IsEVEX_V128 XOR IsEVEX_V256) ?
165 ↗	(On Diff #82470)	Please ignore my other comment on this line, Phabricator won't let me remove it.

updated diff after one comment by Zvi

gadi.haber marked an inline comment as done. Dec 27 2016, 7:48 AM

gadi.haber added inline comments.

lib/Target/X86/X86EvexToVex.cpp
219 ↗	(On Diff #82341)	OK. I understand your comment now. I changed the definition of setAsmPrinterFlag(CommentFlag) to setAsmPrinterFlag(uint8_t) in MachineInstr.h

Updated diff file after comment by Zvi on changing the setAsmPrinterFlag to receive uint8_t instead of CommentFlag.

updated diff after Zvi's comment on setAsmPrinterFlag + fixed typo in the enum AC_EVEX_2_VEX.

LGTM. Thanks, Gadi.

This revision is now accepted and ready to land. Dec 27 2016, 8:43 AM

Closed by commit rL290663: This is a large patch for X86 AVX-512 of an optimization for reducing code size… (authored by gadi.haber). Dec 28 2016, 2:23 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/
  include/llvm/
    CodeGen/MachineInstr.h	6 lines
    MC/MCStreamer.h	6 lines
  lib/
    MC/MCAsmStreamer.cpp	10 lines
    Target/X86/
      CMakeLists.txt	1 line
      InstPrinter/
      (six entries, names missing)	5, 7, 213, 1148, 8, and 2 lines
  test/CodeGen/X86/
    avx-intrinsics-x86.ll	238 lines
    avx2-intrinsics-x86.ll	160 lines
    (four entries, names missing)	20, 8, 12, and 12 lines
    avx512-gather-scatter-intrin.ll	10 lines
    avx512-mask-op.ll	4 lines
    avx512-masked_memop-16-8.ll	2 lines
    avx512-mov.ll	28 lines
    avx512-scalar.ll	14 lines
    avx512-vbroadcasti128.ll	14 lines
    avx512-vbroadcasti256.ll	8 lines
    avx512-vec-cmp.ll	4 lines
    avx512bwvl-intrinsics-upgrade.ll	328 lines
    avx512bwvl-intrinsics.ll	862 lines
    avx512bwvl-mov.ll	32 lines
    avx512dqvl-intrinsics-upgrade.ll	92 lines
    avx512dqvl-intrinsics.ll	104 lines
    avx512ifmavl-intrinsics.ll	16 lines
    avx512vbmivl-intrinsics.ll	48 lines
    avx512vl-intrinsics-upgrade.ll	840 lines
    avx512vl-intrinsics.ll	542 lines
    avx512vl-logic.ll	24 lines
    avx512vl-mov.ll	128 lines
    avx512vl-nontemporal.ll	12 lines
    avx512vl-vbroadcast.ll	12 lines
    compress_expand.ll	6 lines
    evex-to-vex-compress.mir	4485 lines
    fast-isel-store.ll	4 lines
    fp-logic-replace.ll	12 lines
    masked_gather_scatter.ll	24 lines
    masked_memop.ll	30 lines
    nontemporal-2.ll	24 lines
    sse-intrinsics-x86.ll	34 lines
    sse2-intrinsics-x86.ll	128 lines
    sse41-intrinsics-x86.ll	20 lines
    sse42-intrinsics-x86.ll	4 lines
    ssse3-intrinsics-x86.ll	12 lines
    subvector-broadcast.ll	158 lines
    vec-copysign-avx512.ll	12 lines
    (four entries, names missing)	4, 4, 28, and 16 lines
    vector-half-conversions.ll	16 lines
    vector-lzcnt-256.ll	4 lines
    vector-shuffle-128-v16.ll	28 lines
    vector-shuffle-128-v2.ll	12 lines
    vector-shuffle-128-v4.ll	10 lines
    vector-shuffle-128-v8.ll	14 lines
    vector-shuffle-256-v16.ll	140 lines
    vector-shuffle-256-v32.ll	32 lines
    vector-shuffle-256-v8.ll	54 lines
    vector-shuffle-combining-avx512bwvl.ll	16 lines
    vector-shuffle-combining-avx512vbmi.ll	24 lines
    vector-shuffle-masked.ll	16 lines
    (four entries, names missing)	12, 82, 80, and 4 lines

Diff 82586

llvm/trunk/include/llvm/CodeGen/MachineInstr.h

public:		public:
typedef MachineMemOperand **mmo_iterator;		typedef MachineMemOperand **mmo_iterator;

/// Flags to specify different kinds of comments to output in		/// Flags to specify different kinds of comments to output in
/// assembly code. These flags carry semantic information not		/// assembly code. These flags carry semantic information not
/// otherwise easily derivable from the IR text.		/// otherwise easily derivable from the IR text.
///		///
enum CommentFlag {		enum CommentFlag {
ReloadReuse = 0x1		ReloadReuse = 0x1 // higher bits are reserved for target dep comments.
};		};

enum MIFlag {		enum MIFlag {
NoFlags = 0,		NoFlags = 0,
FrameSetup = 1 << 0, // Instruction is used as a part of		FrameSetup = 1 << 0, // Instruction is used as a part of
// function frame setup code.		// function frame setup code.
FrameDestroy = 1 << 1, // Instruction is used as a part of		FrameDestroy = 1 << 1, // Instruction is used as a part of
// function frame destruction code.		// function frame destruction code.
...
void clearAsmPrinterFlags() { AsmPrinterFlags = 0; }		void clearAsmPrinterFlags() { AsmPrinterFlags = 0; }

/// Return whether an AsmPrinter flag is set.		/// Return whether an AsmPrinter flag is set.
bool getAsmPrinterFlag(CommentFlag Flag) const {		bool getAsmPrinterFlag(CommentFlag Flag) const {
return AsmPrinterFlags & Flag;		return AsmPrinterFlags & Flag;
}		}

/// Set a flag for the AsmPrinter.		/// Set a flag for the AsmPrinter.
void setAsmPrinterFlag(CommentFlag Flag) {		void setAsmPrinterFlag(uint8_t Flag) {
AsmPrinterFlags |= (uint8_t)Flag;		AsmPrinterFlags |= Flag;
}		}

/// Clear specific AsmPrinter flags.		/// Clear specific AsmPrinter flags.
void clearAsmPrinterFlag(CommentFlag Flag) {		void clearAsmPrinterFlag(CommentFlag Flag) {
AsmPrinterFlags &= ~Flag;		AsmPrinterFlags &= ~Flag;
}		}

/// Return the MI flags bitvector.		/// Return the MI flags bitvector.
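
The change above lets target-independent and target-specific comment flags share one 8-bit field. A minimal sketch of that arrangement follows. The two enum values mirror the ones shown in this diff, while `MockInstr` is a hypothetical stand-in for the flag storage in MachineInstr and its getter signature is simplified for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Values mirror the diff: bit 0 is target independent, bit 1 is X86 specific.
enum CommentFlag : uint8_t { ReloadReuse = 0x1 };
enum AsmComments : uint8_t { AC_EVEX_2_VEX = 0x2 };

// Minimal stand-in for the flag storage in MachineInstr (hypothetical type).
struct MockInstr {
  uint8_t AsmPrinterFlags = 0;

  // Taking uint8_t lets callers pass either enum without a cast.
  void setAsmPrinterFlag(uint8_t Flag) { AsmPrinterFlags |= Flag; }

  // Simplified getter; the diff keeps a CommentFlag parameter here.
  bool getAsmPrinterFlag(uint8_t Flag) const {
    return (AsmPrinterFlags & Flag) != 0;
  }
};
```

Because the two enums use distinct bits, setting AC_EVEX_2_VEX does not disturb ReloadReuse. Nothing in the type system prevents a future target-independent flag from reusing bit 1, which is the concern raised earlier in the review.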

llvm/trunk/include/llvm/MC/MCStreamer.h

///		///
/// Typically for comments that can be emitted to the generated .s		/// Typically for comments that can be emitted to the generated .s
/// file if applicable as a QoI issue to make the output of the compiler		/// file if applicable as a QoI issue to make the output of the compiler
/// more readable. This only affects the MCAsmStreamer, and only when		/// more readable. This only affects the MCAsmStreamer, and only when
/// verbose assembly output is enabled.		/// verbose assembly output is enabled.
///		///
/// If the comment includes embedded \n's, they will each get the comment		/// If the comment includes embedded \n's, they will each get the comment
/// prefix as appropriate. The added comment should not end with a \n.		/// prefix as appropriate. The added comment should not end with a \n.
virtual void AddComment(const Twine &T) {}		/// By default, each comment is terminated with an end of line, i.e. the
		/// EOL param is set to true by default. If one prefers not to end the
		/// comment with a new line then the EOL param should be passed
		/// with a false value.
		virtual void AddComment(const Twine &T, bool EOL = true) {}

/// \brief Return a raw_ostream that comments can be written to. Unlike		/// \brief Return a raw_ostream that comments can be written to. Unlike
/// AddComment, you are required to terminate comments with \n if you use this		/// AddComment, you are required to terminate comments with \n if you use this
/// method.		/// method.
virtual raw_ostream &GetCommentOS();		virtual raw_ostream &GetCommentOS();

/// \brief Print T and prefix it with the comment string (normally #) and		/// \brief Print T and prefix it with the comment string (normally #) and
/// optionally a tab. This prints the comment immediately, not at the end of		/// optionally a tab. This prints the comment immediately, not at the end of

llvm/trunk/lib/MC/MCAsmStreamer.cpp


/// hasRawTextSupport - We support EmitRawText.		/// hasRawTextSupport - We support EmitRawText.
bool hasRawTextSupport() const override { return true; }		bool hasRawTextSupport() const override { return true; }

/// AddComment - Add a comment that can be emitted to the generated .s		/// AddComment - Add a comment that can be emitted to the generated .s
/// file if applicable as a QoI issue to make the output of the compiler		/// file if applicable as a QoI issue to make the output of the compiler
/// more readable. This only affects the MCAsmStreamer, and only when		/// more readable. This only affects the MCAsmStreamer, and only when
/// verbose assembly output is enabled.		/// verbose assembly output is enabled.
void AddComment(const Twine &T) override;		void AddComment(const Twine &T, bool EOL = true) override;

/// AddEncodingComment - Add a comment showing the encoding of an instruction.		/// AddEncodingComment - Add a comment showing the encoding of an instruction.
void AddEncodingComment(const MCInst &Inst, const MCSubtargetInfo &);		void AddEncodingComment(const MCInst &Inst, const MCSubtargetInfo &);

/// GetCommentOS - Return a raw_ostream that comments can be written to.		/// GetCommentOS - Return a raw_ostream that comments can be written to.
/// Unlike AddComment, you are required to terminate comments with \n if you		/// Unlike AddComment, you are required to terminate comments with \n if you
/// use this method.		/// use this method.
raw_ostream &GetCommentOS() override {		raw_ostream &GetCommentOS() override {
...
};		};

} // end anonymous namespace.		} // end anonymous namespace.

/// AddComment - Add a comment that can be emitted to the generated .s		/// AddComment - Add a comment that can be emitted to the generated .s
/// file if applicable as a QoI issue to make the output of the compiler		/// file if applicable as a QoI issue to make the output of the compiler
/// more readable. This only affects the MCAsmStreamer, and only when		/// more readable. This only affects the MCAsmStreamer, and only when
/// verbose assembly output is enabled.		/// verbose assembly output is enabled.
void MCAsmStreamer::AddComment(const Twine &T) {		/// By default EOL is set to true so that each comment goes on its own line.
		void MCAsmStreamer::AddComment(const Twine &T, bool EOL) {
if (!IsVerboseAsm) return;		if (!IsVerboseAsm) return;

T.toVector(CommentToEmit);		T.toVector(CommentToEmit);
// Each comment goes on its own line.
CommentToEmit.push_back('\n');		if (EOL)
		CommentToEmit.push_back('\n'); // Place comment in a new line.
}		}

void MCAsmStreamer::EmitCommentsAndEOL() {		void MCAsmStreamer::EmitCommentsAndEOL() {
if (CommentToEmit.empty() && CommentStream.GetNumBytesInBuffer() == 0) {		if (CommentToEmit.empty() && CommentStream.GetNumBytesInBuffer() == 0) {
OS << '\n';		OS << '\n';
return;		return;
}		}
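
The effect of the new EOL parameter is easiest to see on a toy buffer. The sketch below is an assumption-laden stand-in, not the MCAsmStreamer implementation: `MockCommentBuffer` is a hypothetical type that only models appending to the pending comment text, and the encoding text in the usage is illustrative.

```cpp
#include <cassert>
#include <string>

// Minimal stand-in for a streamer's pending-comment buffer (hypothetical type).
struct MockCommentBuffer {
  std::string CommentToEmit;

  // EOL defaults to true so each comment normally ends its own line.
  // Passing false leaves the line open so the next comment is concatenated.
  void AddComment(const std::string &T, bool EOL = true) {
    CommentToEmit += T;
    if (EOL)
      CommentToEmit.push_back('\n');
  }
};
```

Adding "EVEX TO VEX Compression " with EOL set to false and then an encoding comment with the default yields a single concatenated comment line, which is the behavior the author said GetCommentOS() and EmitRawComment() did not provide.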


llvm/trunk/lib/Target/X86/CMakeLists.txt

set(sources
X86FixupSetCC.cpp		X86FixupSetCC.cpp
X86FloatingPoint.cpp		X86FloatingPoint.cpp
X86FrameLowering.cpp		X86FrameLowering.cpp
X86ISelDAGToDAG.cpp		X86ISelDAGToDAG.cpp
X86ISelLowering.cpp		X86ISelLowering.cpp
X86InterleavedAccess.cpp		X86InterleavedAccess.cpp
X86InstrFMA3Info.cpp		X86InstrFMA3Info.cpp
X86InstrInfo.cpp		X86InstrInfo.cpp
		X86EvexToVex.cpp
X86MCInstLower.cpp		X86MCInstLower.cpp
X86MachineFunctionInfo.cpp		X86MachineFunctionInfo.cpp
X86OptimizeLEAs.cpp		X86OptimizeLEAs.cpp
X86PadShortFunction.cpp		X86PadShortFunction.cpp
X86RegisterInfo.cpp		X86RegisterInfo.cpp
X86SelectionDAGInfo.cpp		X86SelectionDAGInfo.cpp
X86ShuffleDecodeConstantPool.cpp		X86ShuffleDecodeConstantPool.cpp
X86Subtarget.cpp		X86Subtarget.cpp

llvm/trunk/lib/Target/X86/InstPrinter/X86InstComments.h

	// an output stream for -fverbose-asm.			// an output stream for -fverbose-asm.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef LLVM_LIB_TARGET_X86_INSTPRINTER_X86INSTCOMMENTS_H			#ifndef LLVM_LIB_TARGET_X86_INSTPRINTER_X86INSTCOMMENTS_H
	#define LLVM_LIB_TARGET_X86_INSTPRINTER_X86INSTCOMMENTS_H			#define LLVM_LIB_TARGET_X86_INSTPRINTER_X86INSTCOMMENTS_H

	namespace llvm {			namespace llvm {

				enum AsmComments {
				AC_EVEX_2_VEX = 0x2 // For instr that was compressed from EVEX to VEX.
				};

	class MCInst;			class MCInst;
	class raw_ostream;			class raw_ostream;
	bool EmitAnyX86InstComments(const MCInst *MI, raw_ostream &OS,			bool EmitAnyX86InstComments(const MCInst *MI, raw_ostream &OS,
	const char *(*getRegName)(unsigned));			const char *(*getRegName)(unsigned));
	}			}

	#endif			#endif

llvm/trunk/lib/Target/X86/X86.h


	/// Return a Machine IR pass that selectively replaces			/// Return a Machine IR pass that selectively replaces
	/// certain byte and word instructions by equivalent 32 bit instructions,			/// certain byte and word instructions by equivalent 32 bit instructions,
	/// in order to eliminate partial register usage, false dependences on			/// in order to eliminate partial register usage, false dependences on
	/// the upper portions of registers, and to save code size.			/// the upper portions of registers, and to save code size.
	FunctionPass *createX86FixupBWInsts();			FunctionPass *createX86FixupBWInsts();

	void initializeFixupBWInstPassPass(PassRegistry &);			void initializeFixupBWInstPassPass(PassRegistry &);

				/// This pass replaces EVEX-encoded AVX-512 instructions by their VEX
				/// encoding when possible in order to reduce code size.
				FunctionPass *createX86EvexToVexInsts();

				void initializeEvexToVexInstPassPass(PassRegistry &);

	} // End llvm namespace			} // End llvm namespace

	#endif			#endif

llvm/trunk/lib/Target/X86/X86EvexToVex.cpp

Property	Old Value	New Value
svn:eol-style	null	native
svn:executable	null	*
svn:keywords	null	Author Date Id Rev URL
svn:mime-type	null	text/plain

				//===----------------------- X86EvexToVex.cpp ----------------------------===//
				// Compress EVEX instructions to VEX encoding when possible to reduce code size
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===---------------------------------------------------------------------===//
				/// \file
				/// This file defines the pass that goes over all AVX-512 instructions which
				/// are encoded using the EVEX prefix and if possible replaces them by their
				/// corresponding VEX encoding which is usually shorter by 2 bytes.
				/// EVEX instructions may be encoded via the VEX prefix when the AVX-512
				/// instruction has a corresponding AVX/AVX2 opcode and when it does not
				/// use the zmm or the mask registers or xmm/ymm registers with indexes
				/// higher than 15.
				/// The pass applies code reduction on the generated code for AVX-512 instrs.
				///
				//===---------------------------------------------------------------------===//

				#include "InstPrinter/X86InstComments.h"
				#include "X86.h"
				#include "X86InstrBuilder.h"
				#include "X86InstrInfo.h"
				#include "X86InstrTablesInfo.h"
				#include "X86MachineFunctionInfo.h"
				#include "X86Subtarget.h"
				#include "X86TargetMachine.h"

				using namespace llvm;

				#define EVEX2VEX_DESC "Compressing EVEX instrs to VEX encoding when possible"
				#define EVEX2VEX_NAME "x86-evex-to-vex-compress"

				#define DEBUG_TYPE EVEX2VEX_NAME

				namespace {

				class EvexToVexInstPass : public MachineFunctionPass {

				/// X86EvexToVexCompressTable - Evex to Vex encoding opcode map.
				typedef DenseMap<unsigned, uint16_t> EvexToVexTableType;
				EvexToVexTableType EvexToVex128Table;
				EvexToVexTableType EvexToVex256Table;

				/// For EVEX instructions that can be encoded using VEX encoding, replace
				/// them by the VEX encoding in order to reduce size.
				bool CompressEvexToVexImpl(MachineInstr &MI) const;

				/// Adds an entry to one of the hash map tables that map AVX-512 EVEX
				/// opcodes to their corresponding AVX/AVX2 opcodes.
				void AddTableEntry(EvexToVexTableType &EvexToVexTable, uint16_t EvexOp,
				uint16_t VexOp);

				public:
				static char ID;

				StringRef getPassName() const override { return EVEX2VEX_DESC; }

				EvexToVexInstPass() : MachineFunctionPass(ID) {
				initializeEvexToVexInstPassPass(*PassRegistry::getPassRegistry());

				// Initialize the EVEX to VEX 128 table map.
				for (X86EvexToVexCompressTableEntry Entry : X86EvexToVex128CompressTable) {
				AddTableEntry(EvexToVex128Table, Entry.EvexOpcode, Entry.VexOpcode);
				}

				// Initialize the EVEX to VEX 256 table map.
				for (X86EvexToVexCompressTableEntry Entry : X86EvexToVex256CompressTable) {
				AddTableEntry(EvexToVex256Table, Entry.EvexOpcode, Entry.VexOpcode);
				}
				}

				/// Loop over all of the basic blocks, replacing EVEX instructions
				/// by equivalent VEX instructions when possible for reducing code size.
				bool runOnMachineFunction(MachineFunction &MF) override;

				// This pass runs after regalloc and doesn't support VReg operands.
				MachineFunctionProperties getRequiredProperties() const override {
				return MachineFunctionProperties().set(
				MachineFunctionProperties::Property::NoVRegs);
				}

				private:
				/// Machine instruction info used throughout the class.
				const X86InstrInfo *TII;
				};

				char EvexToVexInstPass::ID = 0;
				}

				INITIALIZE_PASS(EvexToVexInstPass, EVEX2VEX_NAME, EVEX2VEX_DESC, false, false)

				FunctionPass *llvm::createX86EvexToVexInsts() {
				return new EvexToVexInstPass();
				}

				bool EvexToVexInstPass::runOnMachineFunction(MachineFunction &MF) {
				TII = MF.getSubtarget<X86Subtarget>().getInstrInfo();

				const X86Subtarget &ST = MF.getSubtarget<X86Subtarget>();
				if (!ST.hasAVX512())
				return false;

				bool Changed = false;

				/// Go over all basic blocks in function and replace
				/// EVEX encoded instrs by VEX encoding when possible.
				for (MachineBasicBlock &MBB : MF) {

				// Traverse the basic block.
				for (MachineInstr &MI : MBB)
				Changed |= CompressEvexToVexImpl(MI);
				}

				return Changed;
				}

				void EvexToVexInstPass::AddTableEntry(EvexToVexTableType &EvexToVexTable,
				uint16_t EvexOp, uint16_t VexOp) {
				EvexToVexTable[EvexOp] = VexOp;
				}

				// For EVEX instructions that can be encoded using VEX encoding
				// replace them by the VEX encoding in order to reduce size.
				bool EvexToVexInstPass::CompressEvexToVexImpl(MachineInstr &MI) const {

				// VEX format.
				// # of bytes: 0,2,3  1      1      0,1   0,1,2,4  0,1
				//  [Prefixes] [VEX]  OPCODE ModR/M [SIB] [DISP]   [IMM]
				//
				// EVEX format.
				// # of bytes: 4    1      1      1     4        / 1         1
				//  [Prefixes] EVEX Opcode ModR/M [SIB] [Disp32] / [Disp8*N] [Immediate]

				const MCInstrDesc &Desc = MI.getDesc();

				// Check for EVEX instructions only.
				if ((Desc.TSFlags & X86II::EncodingMask) != X86II::EVEX)
				return false;

				// Check for EVEX instructions with mask or broadcast as in these cases
				// the EVEX prefix is needed in order to carry this information
				// thus preventing the transformation to VEX encoding.
				if (Desc.TSFlags & (X86II::EVEX_K | X86II::EVEX_B))
				return false;

				// Check for non EVEX_V512 instrs only.
				// EVEX_V512 instr: bit EVEX_L2 = 1; bit VEX_L = 0.
				if ((Desc.TSFlags & X86II::EVEX_L2) && !(Desc.TSFlags & X86II::VEX_L))
				return false;

				// EVEX_V128 instr: bit EVEX_L2 = 0, bit VEX_L = 0.
				bool IsEVEX_V128 =
				(!(Desc.TSFlags & X86II::EVEX_L2) && !(Desc.TSFlags & X86II::VEX_L));

				// EVEX_V256 instr: bit EVEX_L2 = 0, bit VEX_L = 1.
				bool IsEVEX_V256 =
				(!(Desc.TSFlags & X86II::EVEX_L2) && (Desc.TSFlags & X86II::VEX_L));

				unsigned NewOpc = 0;

				// Check for EVEX_V256 instructions.
				if (IsEVEX_V256) {
				// Search for opcode in the EvexToVex256 table.
				auto It = EvexToVex256Table.find(MI.getOpcode());
				if (It != EvexToVex256Table.end())
				NewOpc = It->second;
				}

				// Check for EVEX_V128 or Scalar instructions.
				else if (IsEVEX_V128) {
				// Search for opcode in the EvexToVex128 table.
				auto It = EvexToVex128Table.find(MI.getOpcode());
				if (It != EvexToVex128Table.end())
				NewOpc = It->second;
				}

				if (!NewOpc)
				return false;

				auto isHiRegIdx = [](unsigned Reg) {
				// Check for XMM register with indexes between 16 - 31.
				if (Reg >= X86::XMM16 && Reg <= X86::XMM31)
				return true;

				// Check for YMM register with indexes between 16 - 31.
				if (Reg >= X86::YMM16 && Reg <= X86::YMM31)
				return true;

				return false;
				};

				// Check that operands are not ZMM regs or
				// XMM/YMM regs with hi indexes between 16 - 31.
				for (const MachineOperand &MO : MI.explicit_operands()) {
				if (!MO.isReg())
				continue;

				unsigned Reg = MO.getReg();

				assert(!(Reg >= X86::ZMM0 && Reg <= X86::ZMM31) &&
				"Unexpected ZMM register operand in a non-EVEX_V512 instruction");

				if (isHiRegIdx(Reg))
				return false;
				}

				const MCInstrDesc &MCID = TII->get(NewOpc);
				MI.setDesc(MCID);
				MI.setAsmPrinterFlag(AC_EVEX_2_VEX);
				return true;
				}
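
The order of checks in CompressEvexToVexImpl can be tried out in isolation with the standalone sketch below. It is only an illustration under stated assumptions: ToyInstr, pickVexOpcode, the Flag* constants and the MaxRegIdx field are hypothetical stand-ins (they are not LLVM APIs, and the real X86II bit values differ), and std::unordered_map stands in for the pass's opcode tables.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins for the X86II TSFlags bits (real values differ).
enum : uint64_t {
  FlagEVEX    = 1u << 0, // instruction uses the EVEX encoding
  FlagEVEX_K  = 1u << 1, // instruction uses a mask (k) register
  FlagEVEX_B  = 1u << 2, // instruction uses broadcast
  FlagEVEX_L2 = 1u << 3, // set for 512-bit vector length
  FlagVEX_L   = 1u << 4, // set for 256-bit vector length
};

// Toy instruction: an opcode, its flags, and the highest XMM/YMM register
// index among its operands (an assumption replacing the operand loop).
struct ToyInstr {
  unsigned Opcode;
  uint64_t TSFlags;
  unsigned MaxRegIdx;
};

using OpcodeMap = std::unordered_map<unsigned, unsigned>;

// Returns the VEX opcode to switch to, or 0 if the instruction must stay EVEX.
unsigned pickVexOpcode(const ToyInstr &I, const OpcodeMap &Table128,
                       const OpcodeMap &Table256) {
  // Only EVEX instructions are candidates.
  if (!(I.TSFlags & FlagEVEX))
    return 0;
  // Masking and broadcast can only be expressed in the EVEX prefix.
  if (I.TSFlags & (FlagEVEX_K | FlagEVEX_B))
    return 0;
  // Simplification: any instruction with L2 set is rejected here. The pass
  // rejects L2 && !L explicitly and finds no table for L2 && L.
  if (I.TSFlags & FlagEVEX_L2)
    return 0;
  // L = 1 selects the 256-bit table; L = 0 selects the 128-bit/scalar table.
  const bool Is256 = (I.TSFlags & FlagVEX_L) != 0;
  const OpcodeMap &Table = Is256 ? Table256 : Table128;
  auto It = Table.find(I.Opcode);
  if (It == Table.end())
    return 0;
  // VEX can only address registers 0 - 15.
  if (I.MaxRegIdx >= 16)
    return 0;
  return It->second;
}
```

With a 128-bit table containing {100 -> 10}, a toy instruction with opcode 100, only FlagEVEX set and MaxRegIdx 3 maps to 10, while the same instruction with MaxRegIdx 16 or with FlagEVEX_K set yields 0.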

llvm/trunk/lib/Target/X86/X86InstrTablesInfo.h

Property	Old Value	New Value
svn:eol-style	null	native
svn:executable	null	*
svn:keywords	null	Author Date Id Rev URL
svn:mime-type	null	text/plain

				//===-- X86InstrTablesInfo.h - X86 Instruction Tables -----------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file contains X86 instruction information tables, including the
				// EVEX to VEX compression tables.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_LIB_TARGET_X86_X86INSTRTABLESINFO_H
				#define LLVM_LIB_TARGET_X86_X86INSTRTABLESINFO_H

				using namespace llvm;

				struct X86EvexToVexCompressTableEntry {
				uint16_t EvexOpcode;
				uint16_t VexOpcode;
				};



				// X86 EVEX encoded instructions that have a VEX 128 encoding
				// (table format: <EVEX opcode, VEX-128 opcode>).
				static const X86EvexToVexCompressTableEntry
				X86EvexToVex128CompressTable[] = {
				// EVEX scalar with corresponding VEX.
				{ X86::Int_VCOMISDZrm , X86::Int_VCOMISDrm },
				{ X86::Int_VCOMISDZrr , X86::Int_VCOMISDrr },
				{ X86::Int_VCOMISSZrm , X86::Int_VCOMISSrm },
				{ X86::Int_VCOMISSZrr , X86::Int_VCOMISSrr },
				{ X86::Int_VUCOMISDZrm , X86::Int_VUCOMISDrm },
				{ X86::Int_VUCOMISDZrr , X86::Int_VUCOMISDrr },
				{ X86::Int_VUCOMISSZrm , X86::Int_VUCOMISSrm },
				{ X86::Int_VUCOMISSZrr , X86::Int_VUCOMISSrr },
				{ X86::VADDSDZrm , X86::VADDSDrm },
				{ X86::VADDSDZrm_Int , X86::VADDSDrm_Int },
				{ X86::VADDSDZrr , X86::VADDSDrr },
				{ X86::VADDSDZrr_Int , X86::VADDSDrr_Int },
				{ X86::VADDSSZrm , X86::VADDSSrm },
				{ X86::VADDSSZrm_Int , X86::VADDSSrm_Int },
				{ X86::VADDSSZrr , X86::VADDSSrr },
				{ X86::VADDSSZrr_Int , X86::VADDSSrr_Int },
				{ X86::VCOMISDZrm , X86::VCOMISDrm },
				{ X86::VCOMISDZrr , X86::VCOMISDrr },
				{ X86::VCOMISSZrm , X86::VCOMISSrm },
				{ X86::VCOMISSZrr , X86::VCOMISSrr },
				{ X86::VCVTSD2SI64Zrm , X86::VCVTSD2SI64rm },
				{ X86::VCVTSD2SI64Zrr , X86::VCVTSD2SI64rr },
				{ X86::VCVTSD2SIZrm , X86::VCVTSD2SIrm },
				{ X86::VCVTSD2SIZrr , X86::VCVTSD2SIrr },
				{ X86::VCVTSD2SSZrm , X86::VCVTSD2SSrm },
				{ X86::VCVTSD2SSZrr , X86::VCVTSD2SSrr },
				{ X86::VCVTSI2SDZrm , X86::VCVTSI2SDrm },
				{ X86::VCVTSI2SDZrm_Int , X86::Int_VCVTSI2SDrm },
				{ X86::VCVTSI2SDZrr , X86::VCVTSI2SDrr },
				{ X86::VCVTSI2SDZrr_Int , X86::Int_VCVTSI2SDrr },
				{ X86::VCVTSI2SSZrm , X86::VCVTSI2SSrm },
				{ X86::VCVTSI2SSZrm_Int , X86::Int_VCVTSI2SSrm },
				{ X86::VCVTSI2SSZrr , X86::VCVTSI2SSrr },
				{ X86::VCVTSI2SSZrr_Int , X86::Int_VCVTSI2SSrr },
				{ X86::VCVTSS2SDZrm , X86::VCVTSS2SDrm },
				{ X86::VCVTSS2SDZrr , X86::VCVTSS2SDrr },
				{ X86::VCVTSS2SI64Zrm , X86::VCVTSS2SI64rm },
				{ X86::VCVTSS2SI64Zrr , X86::VCVTSS2SI64rr },
				{ X86::VCVTSS2SIZrm , X86::VCVTSS2SIrm },
				{ X86::VCVTSS2SIZrr , X86::VCVTSS2SIrr },
				{ X86::VCVTTSD2SI64Zrm , X86::VCVTTSD2SI64rm },
				{ X86::VCVTTSD2SI64Zrm_Int , X86::Int_VCVTTSD2SI64rm },
				{ X86::VCVTTSD2SI64Zrr , X86::VCVTTSD2SI64rr },
				{ X86::VCVTTSD2SI64Zrr_Int , X86::Int_VCVTTSD2SI64rr },
				{ X86::VCVTTSD2SIZrm , X86::VCVTTSD2SIrm },
				{ X86::VCVTTSD2SIZrm_Int , X86::Int_VCVTTSD2SIrm },
				{ X86::VCVTTSD2SIZrr , X86::VCVTTSD2SIrr },
				{ X86::VCVTTSD2SIZrr_Int , X86::Int_VCVTTSD2SIrr },
				{ X86::VCVTTSS2SI64Zrm , X86::VCVTTSS2SI64rm },
				{ X86::VCVTTSS2SI64Zrm_Int , X86::Int_VCVTTSS2SI64rm },
				{ X86::VCVTTSS2SI64Zrr , X86::VCVTTSS2SI64rr },
				{ X86::VCVTTSS2SI64Zrr_Int , X86::Int_VCVTTSS2SI64rr },
				{ X86::VCVTTSS2SIZrm , X86::VCVTTSS2SIrm },
				{ X86::VCVTTSS2SIZrm_Int , X86::Int_VCVTTSS2SIrm },
				{ X86::VCVTTSS2SIZrr , X86::VCVTTSS2SIrr },
				{ X86::VCVTTSS2SIZrr_Int , X86::Int_VCVTTSS2SIrr },
				{ X86::VDIVSDZrm , X86::VDIVSDrm },
				{ X86::VDIVSDZrm_Int , X86::VDIVSDrm_Int },
				{ X86::VDIVSDZrr , X86::VDIVSDrr },
				{ X86::VDIVSDZrr_Int , X86::VDIVSDrr_Int },
				{ X86::VDIVSSZrm , X86::VDIVSSrm },
				{ X86::VDIVSSZrm_Int , X86::VDIVSSrm_Int },
				{ X86::VDIVSSZrr , X86::VDIVSSrr },
				{ X86::VDIVSSZrr_Int , X86::VDIVSSrr_Int },
				{ X86::VFMADD132SDZm , X86::VFMADD132SDm },
				{ X86::VFMADD132SDZm_Int , X86::VFMADD132SDm_Int },
				{ X86::VFMADD132SDZr , X86::VFMADD132SDr },
				{ X86::VFMADD132SDZr_Int , X86::VFMADD132SDr_Int },
				{ X86::VFMADD132SSZm , X86::VFMADD132SSm },
				{ X86::VFMADD132SSZm_Int , X86::VFMADD132SSm_Int },
				{ X86::VFMADD132SSZr , X86::VFMADD132SSr },
				{ X86::VFMADD132SSZr_Int , X86::VFMADD132SSr_Int },
				{ X86::VFMADD213SDZm , X86::VFMADD213SDm },
				{ X86::VFMADD213SDZm_Int , X86::VFMADD213SDm_Int },
				{ X86::VFMADD213SDZr , X86::VFMADD213SDr },
				{ X86::VFMADD213SDZr_Int , X86::VFMADD213SDr_Int },
				{ X86::VFMADD213SSZm , X86::VFMADD213SSm },
				{ X86::VFMADD213SSZm_Int , X86::VFMADD213SSm_Int },
				{ X86::VFMADD213SSZr , X86::VFMADD213SSr },
				{ X86::VFMADD213SSZr_Int , X86::VFMADD213SSr_Int },
				{ X86::VFMADD231SDZm , X86::VFMADD231SDm },
				{ X86::VFMADD231SDZm_Int , X86::VFMADD231SDm_Int },
				{ X86::VFMADD231SDZr , X86::VFMADD231SDr },
				{ X86::VFMADD231SDZr_Int , X86::VFMADD231SDr_Int },
				{ X86::VFMADD231SSZm , X86::VFMADD231SSm },
				{ X86::VFMADD231SSZm_Int , X86::VFMADD231SSm_Int },
				{ X86::VFMADD231SSZr , X86::VFMADD231SSr },
				{ X86::VFMADD231SSZr_Int , X86::VFMADD231SSr_Int },
				{ X86::VFMSUB132SDZm , X86::VFMSUB132SDm },
				{ X86::VFMSUB132SDZm_Int , X86::VFMSUB132SDm_Int },
				{ X86::VFMSUB132SDZr , X86::VFMSUB132SDr },
				{ X86::VFMSUB132SDZr_Int , X86::VFMSUB132SDr_Int },
				{ X86::VFMSUB132SSZm , X86::VFMSUB132SSm },
				{ X86::VFMSUB132SSZm_Int , X86::VFMSUB132SSm_Int },
				{ X86::VFMSUB132SSZr , X86::VFMSUB132SSr },
				{ X86::VFMSUB132SSZr_Int , X86::VFMSUB132SSr_Int },
				{ X86::VFMSUB213SDZm , X86::VFMSUB213SDm },
				{ X86::VFMSUB213SDZm_Int , X86::VFMSUB213SDm_Int },
				{ X86::VFMSUB213SDZr , X86::VFMSUB213SDr },
				{ X86::VFMSUB213SDZr_Int , X86::VFMSUB213SDr_Int },
				{ X86::VFMSUB213SSZm , X86::VFMSUB213SSm },
				{ X86::VFMSUB213SSZm_Int , X86::VFMSUB213SSm_Int },
				{ X86::VFMSUB213SSZr , X86::VFMSUB213SSr },
				{ X86::VFMSUB213SSZr_Int , X86::VFMSUB213SSr_Int },
				{ X86::VFMSUB231SDZm , X86::VFMSUB231SDm },
				{ X86::VFMSUB231SDZm_Int , X86::VFMSUB231SDm_Int },
				{ X86::VFMSUB231SDZr , X86::VFMSUB231SDr },
				{ X86::VFMSUB231SDZr_Int , X86::VFMSUB231SDr_Int },
				{ X86::VFMSUB231SSZm , X86::VFMSUB231SSm },
				{ X86::VFMSUB231SSZm_Int , X86::VFMSUB231SSm_Int },
				{ X86::VFMSUB231SSZr , X86::VFMSUB231SSr },
				{ X86::VFMSUB231SSZr_Int , X86::VFMSUB231SSr_Int },
				{ X86::VFNMADD132SDZm , X86::VFNMADD132SDm },
				{ X86::VFNMADD132SDZm_Int , X86::VFNMADD132SDm_Int },
				{ X86::VFNMADD132SDZr , X86::VFNMADD132SDr },
				{ X86::VFNMADD132SDZr_Int , X86::VFNMADD132SDr_Int },
				{ X86::VFNMADD132SSZm , X86::VFNMADD132SSm },
				{ X86::VFNMADD132SSZm_Int , X86::VFNMADD132SSm_Int },
				{ X86::VFNMADD132SSZr , X86::VFNMADD132SSr },
				{ X86::VFNMADD132SSZr_Int , X86::VFNMADD132SSr_Int },
				{ X86::VFNMADD213SDZm , X86::VFNMADD213SDm },
				{ X86::VFNMADD213SDZm_Int , X86::VFNMADD213SDm_Int },
				{ X86::VFNMADD213SDZr , X86::VFNMADD213SDr },
				{ X86::VFNMADD213SDZr_Int , X86::VFNMADD213SDr_Int },
				{ X86::VFNMADD213SSZm , X86::VFNMADD213SSm },
				{ X86::VFNMADD213SSZm_Int , X86::VFNMADD213SSm_Int },
				{ X86::VFNMADD213SSZr , X86::VFNMADD213SSr },
				{ X86::VFNMADD213SSZr_Int , X86::VFNMADD213SSr_Int },
				{ X86::VFNMADD231SDZm , X86::VFNMADD231SDm },
				{ X86::VFNMADD231SDZm_Int , X86::VFNMADD231SDm_Int },
				{ X86::VFNMADD231SDZr , X86::VFNMADD231SDr },
				{ X86::VFNMADD231SDZr_Int , X86::VFNMADD231SDr_Int },
				{ X86::VFNMADD231SSZm , X86::VFNMADD231SSm },
				{ X86::VFNMADD231SSZm_Int , X86::VFNMADD231SSm_Int },
				{ X86::VFNMADD231SSZr , X86::VFNMADD231SSr },
				{ X86::VFNMADD231SSZr_Int , X86::VFNMADD231SSr_Int },
				{ X86::VFNMSUB132SDZm , X86::VFNMSUB132SDm },
				{ X86::VFNMSUB132SDZm_Int , X86::VFNMSUB132SDm_Int },
				{ X86::VFNMSUB132SDZr , X86::VFNMSUB132SDr },
				{ X86::VFNMSUB132SDZr_Int , X86::VFNMSUB132SDr_Int },
				{ X86::VFNMSUB132SSZm , X86::VFNMSUB132SSm },
				{ X86::VFNMSUB132SSZm_Int , X86::VFNMSUB132SSm_Int },
				{ X86::VFNMSUB132SSZr , X86::VFNMSUB132SSr },
				{ X86::VFNMSUB132SSZr_Int , X86::VFNMSUB132SSr_Int },
				{ X86::VFNMSUB213SDZm , X86::VFNMSUB213SDm },
				{ X86::VFNMSUB213SDZm_Int , X86::VFNMSUB213SDm_Int },
				{ X86::VFNMSUB213SDZr , X86::VFNMSUB213SDr },
				{ X86::VFNMSUB213SDZr_Int , X86::VFNMSUB213SDr_Int },
				{ X86::VFNMSUB213SSZm , X86::VFNMSUB213SSm },
				{ X86::VFNMSUB213SSZm_Int , X86::VFNMSUB213SSm_Int },
				{ X86::VFNMSUB213SSZr , X86::VFNMSUB213SSr },
				{ X86::VFNMSUB213SSZr_Int , X86::VFNMSUB213SSr_Int },
				{ X86::VFNMSUB231SDZm , X86::VFNMSUB231SDm },
				{ X86::VFNMSUB231SDZm_Int , X86::VFNMSUB231SDm_Int },
				{ X86::VFNMSUB231SDZr , X86::VFNMSUB231SDr },
				{ X86::VFNMSUB231SDZr_Int , X86::VFNMSUB231SDr_Int },
				{ X86::VFNMSUB231SSZm , X86::VFNMSUB231SSm },
				{ X86::VFNMSUB231SSZm_Int , X86::VFNMSUB231SSm_Int },
				{ X86::VFNMSUB231SSZr , X86::VFNMSUB231SSr },
				{ X86::VFNMSUB231SSZr_Int , X86::VFNMSUB231SSr_Int },
				{ X86::VMAXCSDZrm , X86::VMAXCSDrm },
				{ X86::VMAXCSDZrr , X86::VMAXCSDrr },
				{ X86::VMAXCSSZrm , X86::VMAXCSSrm },
				{ X86::VMAXCSSZrr , X86::VMAXCSSrr },
				{ X86::VMAXSDZrm , X86::VMAXSDrm },
				{ X86::VMAXSDZrm_Int , X86::VMAXSDrm_Int },
				{ X86::VMAXSDZrr , X86::VMAXSDrr },
				{ X86::VMAXSDZrr_Int , X86::VMAXSDrr_Int },
				{ X86::VMAXSSZrm , X86::VMAXSSrm },
				{ X86::VMAXSSZrm_Int , X86::VMAXSSrm_Int },
				{ X86::VMAXSSZrr , X86::VMAXSSrr },
				{ X86::VMAXSSZrr_Int , X86::VMAXSSrr_Int },
				{ X86::VMINCSDZrm , X86::VMINCSDrm },
				{ X86::VMINCSDZrr , X86::VMINCSDrr },
				{ X86::VMINCSSZrm , X86::VMINCSSrm },
				{ X86::VMINCSSZrr , X86::VMINCSSrr },
				{ X86::VMINSDZrm , X86::VMINSDrm },
				{ X86::VMINSDZrm_Int , X86::VMINSDrm_Int },
				{ X86::VMINSDZrr , X86::VMINSDrr },
				{ X86::VMINSDZrr_Int , X86::VMINSDrr_Int },
				{ X86::VMINSSZrm , X86::VMINSSrm },
				{ X86::VMINSSZrm_Int , X86::VMINSSrm_Int },
				{ X86::VMINSSZrr , X86::VMINSSrr },
				{ X86::VMINSSZrr_Int , X86::VMINSSrr_Int },
				{ X86::VMOV64toSDZrr , X86::VMOV64toSDrr },
				{ X86::VMOVDI2SSZrm , X86::VMOVDI2SSrm },
				{ X86::VMOVDI2SSZrr , X86::VMOVDI2SSrr },
				{ X86::VMOVSDZmr , X86::VMOVSDmr },
				{ X86::VMOVSDZrm , X86::VMOVSDrm },
				{ X86::VMOVSDZrr , X86::VMOVSDrr },
				{ X86::VMOVSSZmr , X86::VMOVSSmr },
				{ X86::VMOVSSZrm , X86::VMOVSSrm },
				{ X86::VMOVSSZrr , X86::VMOVSSrr },
				{ X86::VMOVSSZrr_REV , X86::VMOVSSrr_REV },
				{ X86::VMULSDZrm , X86::VMULSDrm },
				{ X86::VMULSDZrm_Int , X86::VMULSDrm_Int },
				{ X86::VMULSDZrr , X86::VMULSDrr },
				{ X86::VMULSDZrr_Int , X86::VMULSDrr_Int },
				{ X86::VMULSSZrm , X86::VMULSSrm },
				{ X86::VMULSSZrm_Int , X86::VMULSSrm_Int },
				{ X86::VMULSSZrr , X86::VMULSSrr },
				{ X86::VMULSSZrr_Int , X86::VMULSSrr_Int },
				{ X86::VSQRTSDZm , X86::VSQRTSDm },
				{ X86::VSQRTSDZm_Int , X86::VSQRTSDm_Int },
				{ X86::VSQRTSDZr , X86::VSQRTSDr },
				{ X86::VSQRTSDZr_Int , X86::VSQRTSDr_Int },
				{ X86::VSQRTSSZm , X86::VSQRTSSm },
				{ X86::VSQRTSSZm_Int , X86::VSQRTSSm_Int },
				{ X86::VSQRTSSZr , X86::VSQRTSSr },
				{ X86::VSQRTSSZr_Int , X86::VSQRTSSr_Int },
				{ X86::VSUBSDZrm , X86::VSUBSDrm },
				{ X86::VSUBSDZrm_Int , X86::VSUBSDrm_Int },
				{ X86::VSUBSDZrr , X86::VSUBSDrr },
				{ X86::VSUBSDZrr_Int , X86::VSUBSDrr_Int },
				{ X86::VSUBSSZrm , X86::VSUBSSrm },
				{ X86::VSUBSSZrm_Int , X86::VSUBSSrm_Int },
				{ X86::VSUBSSZrr , X86::VSUBSSrr },
				{ X86::VSUBSSZrr_Int , X86::VSUBSSrr_Int },
				{ X86::VUCOMISDZrm , X86::VUCOMISDrm },
				{ X86::VUCOMISDZrr , X86::VUCOMISDrr },
				{ X86::VUCOMISSZrm , X86::VUCOMISSrm },
				{ X86::VUCOMISSZrr , X86::VUCOMISSrr },

				{ X86::VMOV64toPQIZrr , X86::VMOV64toPQIrr },
				{ X86::VMOV64toSDZrr , X86::VMOV64toSDrr },
				{ X86::VMOVDI2PDIZrm , X86::VMOVDI2PDIrm },
				{ X86::VMOVDI2PDIZrr , X86::VMOVDI2PDIrr },
				{ X86::VMOVLHPSZrr , X86::VMOVLHPSrr },
				{ X86::VMOVHLPSZrr , X86::VMOVHLPSrr },
				{ X86::VMOVPDI2DIZmr , X86::VMOVPDI2DImr },
				{ X86::VMOVPDI2DIZrr , X86::VMOVPDI2DIrr },
				{ X86::VMOVPQI2QIZmr , X86::VMOVPQI2QImr },
				{ X86::VMOVPQIto64Zrr , X86::VMOVPQIto64rr },
				{ X86::VMOVQI2PQIZrm , X86::VMOVQI2PQIrm },
				{ X86::VMOVZPQILo2PQIZrr , X86::VMOVZPQILo2PQIrr },

				{ X86::VPEXTRBZmr , X86::VPEXTRBmr },
				{ X86::VPEXTRBZrr , X86::VPEXTRBrr },
				{ X86::VPEXTRDZmr , X86::VPEXTRDmr },
				{ X86::VPEXTRDZrr , X86::VPEXTRDrr },
				{ X86::VPEXTRQZmr , X86::VPEXTRQmr },
				{ X86::VPEXTRQZrr , X86::VPEXTRQrr },
				{ X86::VPEXTRWZmr , X86::VPEXTRWmr },
				{ X86::VPEXTRWZrr , X86::VPEXTRWri },

				{ X86::VPINSRBZrm , X86::VPINSRBrm },
				{ X86::VPINSRBZrr , X86::VPINSRBrr },
				{ X86::VPINSRDZrm , X86::VPINSRDrm },
				{ X86::VPINSRDZrr , X86::VPINSRDrr },
				{ X86::VPINSRQZrm , X86::VPINSRQrm },
				{ X86::VPINSRQZrr , X86::VPINSRQrr },
				{ X86::VPINSRWZrm , X86::VPINSRWrmi },
				{ X86::VPINSRWZrr , X86::VPINSRWrri },

				// EVEX 128 with corresponding VEX.
				{ X86::VADDPDZ128rm , X86::VADDPDrm },
				{ X86::VADDPDZ128rr , X86::VADDPDrr },
				{ X86::VADDPSZ128rm , X86::VADDPSrm },
				{ X86::VADDPSZ128rr , X86::VADDPSrr },
				{ X86::VANDNPDZ128rm , X86::VANDNPDrm },
				{ X86::VANDNPDZ128rr , X86::VANDNPDrr },
				{ X86::VANDNPSZ128rm , X86::VANDNPSrm },
				{ X86::VANDNPSZ128rr , X86::VANDNPSrr },
				{ X86::VANDPDZ128rm , X86::VANDPDrm },
				{ X86::VANDPDZ128rr , X86::VANDPDrr },
				{ X86::VANDPSZ128rm , X86::VANDPSrm },
				{ X86::VANDPSZ128rr , X86::VANDPSrr },
				{ X86::VBROADCASTSSZ128m , X86::VBROADCASTSSrm },
				{ X86::VBROADCASTSSZ128r , X86::VBROADCASTSSrr },
				{ X86::VBROADCASTSSZ128r_s , X86::VBROADCASTSSrr },
				{ X86::VCVTDQ2PDZ128rm , X86::VCVTDQ2PDrm },
				{ X86::VCVTDQ2PDZ128rr , X86::VCVTDQ2PDrr },
				{ X86::VCVTDQ2PSZ128rm , X86::VCVTDQ2PSrm },
				{ X86::VCVTDQ2PSZ128rr , X86::VCVTDQ2PSrr },
				{ X86::VCVTPD2DQZ128rm , X86::VCVTPD2DQrm },
				{ X86::VCVTPD2DQZ128rr , X86::VCVTPD2DQrr },
				{ X86::VCVTPD2PSZ128rm , X86::VCVTPD2PSrm },
				{ X86::VCVTPD2PSZ128rr , X86::VCVTPD2PSrr },
				{ X86::VCVTPH2PSZ128rm , X86::VCVTPH2PSrm },
				{ X86::VCVTPH2PSZ128rr , X86::VCVTPH2PSrr },
				{ X86::VCVTPS2DQZ128rm , X86::VCVTPS2DQrm },
				{ X86::VCVTPS2DQZ128rr , X86::VCVTPS2DQrr },
				{ X86::VCVTPS2PDZ128rm , X86::VCVTPS2PDrm },
				{ X86::VCVTPS2PDZ128rr , X86::VCVTPS2PDrr },
				{ X86::VCVTPS2PHZ128mr , X86::VCVTPS2PHmr },
				{ X86::VCVTPS2PHZ128rr , X86::VCVTPS2PHrr },
				{ X86::VCVTTPD2DQZ128rm , X86::VCVTTPD2DQrm },
				{ X86::VCVTTPD2DQZ128rr , X86::VCVTTPD2DQrr },
				{ X86::VCVTTPS2DQZ128rm , X86::VCVTTPS2DQrm },
				{ X86::VCVTTPS2DQZ128rr , X86::VCVTTPS2DQrr },
				{ X86::VDIVPDZ128rm , X86::VDIVPDrm },
				{ X86::VDIVPDZ128rr , X86::VDIVPDrr },
				{ X86::VDIVPSZ128rm , X86::VDIVPSrm },
				{ X86::VDIVPSZ128rr , X86::VDIVPSrr },
				{ X86::VFMADD132PDZ128m , X86::VFMADD132PDm },
				{ X86::VFMADD132PDZ128r , X86::VFMADD132PDr },
				{ X86::VFMADD132PSZ128m , X86::VFMADD132PSm },
				{ X86::VFMADD132PSZ128r , X86::VFMADD132PSr },
				{ X86::VFMADD213PDZ128m , X86::VFMADD213PDm },
				{ X86::VFMADD213PDZ128r , X86::VFMADD213PDr },
				{ X86::VFMADD213PSZ128m , X86::VFMADD213PSm },
				{ X86::VFMADD213PSZ128r , X86::VFMADD213PSr },
				{ X86::VFMADD231PDZ128m , X86::VFMADD231PDm },
				{ X86::VFMADD231PDZ128r , X86::VFMADD231PDr },
				{ X86::VFMADD231PSZ128m , X86::VFMADD231PSm },
				{ X86::VFMADD231PSZ128r , X86::VFMADD231PSr },
				{ X86::VFMADDSUB132PDZ128m , X86::VFMADDSUB132PDm },
				{ X86::VFMADDSUB132PDZ128r , X86::VFMADDSUB132PDr },
				{ X86::VFMADDSUB132PSZ128m , X86::VFMADDSUB132PSm },
				{ X86::VFMADDSUB132PSZ128r , X86::VFMADDSUB132PSr },
				{ X86::VFMADDSUB213PDZ128m , X86::VFMADDSUB213PDm },
				{ X86::VFMADDSUB213PDZ128r , X86::VFMADDSUB213PDr },
				{ X86::VFMADDSUB213PSZ128m , X86::VFMADDSUB213PSm },
				{ X86::VFMADDSUB213PSZ128r , X86::VFMADDSUB213PSr },
				{ X86::VFMADDSUB231PDZ128m , X86::VFMADDSUB231PDm },
				{ X86::VFMADDSUB231PDZ128r , X86::VFMADDSUB231PDr },
				{ X86::VFMADDSUB231PSZ128m , X86::VFMADDSUB231PSm },
				{ X86::VFMADDSUB231PSZ128r , X86::VFMADDSUB231PSr },
				{ X86::VFMSUB132PDZ128m , X86::VFMSUB132PDm },
				{ X86::VFMSUB132PDZ128r , X86::VFMSUB132PDr },
				{ X86::VFMSUB132PSZ128m , X86::VFMSUB132PSm },
				{ X86::VFMSUB132PSZ128r , X86::VFMSUB132PSr },
				{ X86::VFMSUB213PDZ128m , X86::VFMSUB213PDm },
				{ X86::VFMSUB213PDZ128r , X86::VFMSUB213PDr },
				{ X86::VFMSUB213PSZ128m , X86::VFMSUB213PSm },
				{ X86::VFMSUB213PSZ128r , X86::VFMSUB213PSr },
				{ X86::VFMSUB231PDZ128m , X86::VFMSUB231PDm },
				{ X86::VFMSUB231PDZ128r , X86::VFMSUB231PDr },
				{ X86::VFMSUB231PSZ128m , X86::VFMSUB231PSm },
				{ X86::VFMSUB231PSZ128r , X86::VFMSUB231PSr },
				{ X86::VFMSUBADD132PDZ128m , X86::VFMSUBADD132PDm },
				{ X86::VFMSUBADD132PDZ128r , X86::VFMSUBADD132PDr },
				{ X86::VFMSUBADD132PSZ128m , X86::VFMSUBADD132PSm },
				{ X86::VFMSUBADD132PSZ128r , X86::VFMSUBADD132PSr },
				{ X86::VFMSUBADD213PDZ128m , X86::VFMSUBADD213PDm },
				{ X86::VFMSUBADD213PDZ128r , X86::VFMSUBADD213PDr },
				{ X86::VFMSUBADD213PSZ128m , X86::VFMSUBADD213PSm },
				{ X86::VFMSUBADD213PSZ128r , X86::VFMSUBADD213PSr },
				{ X86::VFMSUBADD231PDZ128m , X86::VFMSUBADD231PDm },
				{ X86::VFMSUBADD231PDZ128r , X86::VFMSUBADD231PDr },
				{ X86::VFMSUBADD231PSZ128m , X86::VFMSUBADD231PSm },
				{ X86::VFMSUBADD231PSZ128r , X86::VFMSUBADD231PSr },
				{ X86::VFNMADD132PDZ128m , X86::VFNMADD132PDm },
				{ X86::VFNMADD132PDZ128r , X86::VFNMADD132PDr },
				{ X86::VFNMADD132PSZ128m , X86::VFNMADD132PSm },
				{ X86::VFNMADD132PSZ128r , X86::VFNMADD132PSr },
				{ X86::VFNMADD213PDZ128m , X86::VFNMADD213PDm },
				{ X86::VFNMADD213PDZ128r , X86::VFNMADD213PDr },
				{ X86::VFNMADD213PSZ128m , X86::VFNMADD213PSm },
				{ X86::VFNMADD213PSZ128r , X86::VFNMADD213PSr },
				{ X86::VFNMADD231PDZ128m , X86::VFNMADD231PDm },
				{ X86::VFNMADD231PDZ128r , X86::VFNMADD231PDr },
				{ X86::VFNMADD231PSZ128m , X86::VFNMADD231PSm },
				{ X86::VFNMADD231PSZ128r , X86::VFNMADD231PSr },
				{ X86::VFNMSUB132PDZ128m , X86::VFNMSUB132PDm },
				{ X86::VFNMSUB132PDZ128r , X86::VFNMSUB132PDr },
				{ X86::VFNMSUB132PSZ128m , X86::VFNMSUB132PSm },
				{ X86::VFNMSUB132PSZ128r , X86::VFNMSUB132PSr },
				{ X86::VFNMSUB213PDZ128m , X86::VFNMSUB213PDm },
				{ X86::VFNMSUB213PDZ128r , X86::VFNMSUB213PDr },
				{ X86::VFNMSUB213PSZ128m , X86::VFNMSUB213PSm },
				{ X86::VFNMSUB213PSZ128r , X86::VFNMSUB213PSr },
				{ X86::VFNMSUB231PDZ128m , X86::VFNMSUB231PDm },
				{ X86::VFNMSUB231PDZ128r , X86::VFNMSUB231PDr },
				{ X86::VFNMSUB231PSZ128m , X86::VFNMSUB231PSm },
				{ X86::VFNMSUB231PSZ128r , X86::VFNMSUB231PSr },
				{ X86::VMAXCPDZ128rm , X86::VMAXCPDrm },
				{ X86::VMAXCPDZ128rr , X86::VMAXCPDrr },
				{ X86::VMAXCPSZ128rm , X86::VMAXCPSrm },
				{ X86::VMAXCPSZ128rr , X86::VMAXCPSrr },
				{ X86::VMAXPDZ128rm , X86::VMAXPDrm },
				{ X86::VMAXPDZ128rr , X86::VMAXPDrr },
				{ X86::VMAXPSZ128rm , X86::VMAXPSrm },
				{ X86::VMAXPSZ128rr , X86::VMAXPSrr },
				{ X86::VMINCPDZ128rm , X86::VMINCPDrm },
				{ X86::VMINCPDZ128rr , X86::VMINCPDrr },
				{ X86::VMINCPSZ128rm , X86::VMINCPSrm },
				{ X86::VMINCPSZ128rr , X86::VMINCPSrr },
				{ X86::VMINPDZ128rm , X86::VMINPDrm },
				{ X86::VMINPDZ128rr , X86::VMINPDrr },
				{ X86::VMINPSZ128rm , X86::VMINPSrm },
				{ X86::VMINPSZ128rr , X86::VMINPSrr },
				{ X86::VMOVAPDZ128mr , X86::VMOVAPDmr },
				{ X86::VMOVAPDZ128rm , X86::VMOVAPDrm },
				{ X86::VMOVAPDZ128rr , X86::VMOVAPDrr },
				{ X86::VMOVAPDZ128rr_REV , X86::VMOVAPDrr_REV },
				{ X86::VMOVAPSZ128mr , X86::VMOVAPSmr },
				{ X86::VMOVAPSZ128rm , X86::VMOVAPSrm },
				{ X86::VMOVAPSZ128rr , X86::VMOVAPSrr },
				{ X86::VMOVAPSZ128rr_REV , X86::VMOVAPSrr_REV },
				{ X86::VMOVDDUPZ128rm , X86::VMOVDDUPrm },
				{ X86::VMOVDDUPZ128rr , X86::VMOVDDUPrr },
				{ X86::VMOVDQA32Z128mr , X86::VMOVDQAmr },
				{ X86::VMOVDQA32Z128rm , X86::VMOVDQArm },
				{ X86::VMOVDQA32Z128rr , X86::VMOVDQArr },
				{ X86::VMOVDQA32Z128rr_REV , X86::VMOVDQArr_REV },
				{ X86::VMOVDQA64Z128mr , X86::VMOVDQAmr },
				{ X86::VMOVDQA64Z128rm , X86::VMOVDQArm },
				{ X86::VMOVDQA64Z128rr , X86::VMOVDQArr },
				{ X86::VMOVDQA64Z128rr_REV , X86::VMOVDQArr_REV },
				{ X86::VMOVDQU16Z128mr , X86::VMOVDQUmr },
				{ X86::VMOVDQU16Z128rm , X86::VMOVDQUrm },
				{ X86::VMOVDQU16Z128rr , X86::VMOVDQUrr },
				{ X86::VMOVDQU16Z128rr_REV , X86::VMOVDQUrr_REV },
				{ X86::VMOVDQU32Z128mr , X86::VMOVDQUmr },
				{ X86::VMOVDQU32Z128rm , X86::VMOVDQUrm },
				{ X86::VMOVDQU32Z128rr , X86::VMOVDQUrr },
				{ X86::VMOVDQU32Z128rr_REV , X86::VMOVDQUrr_REV },
				{ X86::VMOVDQU64Z128mr , X86::VMOVDQUmr },
				{ X86::VMOVDQU64Z128rm , X86::VMOVDQUrm },
				{ X86::VMOVDQU64Z128rr , X86::VMOVDQUrr },
				{ X86::VMOVDQU64Z128rr_REV , X86::VMOVDQUrr_REV },
				{ X86::VMOVDQU8Z128mr , X86::VMOVDQUmr },
				{ X86::VMOVDQU8Z128rm , X86::VMOVDQUrm },
				{ X86::VMOVDQU8Z128rr , X86::VMOVDQUrr },
				{ X86::VMOVDQU8Z128rr_REV , X86::VMOVDQUrr_REV },
				{ X86::VMOVHPDZ128mr , X86::VMOVHPDmr },
				{ X86::VMOVHPDZ128rm , X86::VMOVHPDrm },
				{ X86::VMOVHPSZ128mr , X86::VMOVHPSmr },
				{ X86::VMOVHPSZ128rm , X86::VMOVHPSrm },
				{ X86::VMOVLPDZ128mr , X86::VMOVLPDmr },
				{ X86::VMOVLPDZ128rm , X86::VMOVLPDrm },
				{ X86::VMOVLPSZ128mr , X86::VMOVLPSmr },
				{ X86::VMOVLPSZ128rm , X86::VMOVLPSrm },
				{ X86::VMOVNTDQAZ128rm , X86::VMOVNTDQArm },
				{ X86::VMOVNTDQZ128mr , X86::VMOVNTDQmr },
				{ X86::VMOVNTPDZ128mr , X86::VMOVNTPDmr },
				{ X86::VMOVNTPSZ128mr , X86::VMOVNTPSmr },
				{ X86::VMOVSHDUPZ128rm , X86::VMOVSHDUPrm },
				{ X86::VMOVSHDUPZ128rr , X86::VMOVSHDUPrr },
				{ X86::VMOVSLDUPZ128rm , X86::VMOVSLDUPrm },
				{ X86::VMOVSLDUPZ128rr , X86::VMOVSLDUPrr },
				{ X86::VMOVUPDZ128mr , X86::VMOVUPDmr },
				{ X86::VMOVUPDZ128rm , X86::VMOVUPDrm },
				{ X86::VMOVUPDZ128rr , X86::VMOVUPDrr },
				{ X86::VMOVUPDZ128rr_REV , X86::VMOVUPDrr_REV },
				{ X86::VMOVUPSZ128mr , X86::VMOVUPSmr },
				{ X86::VMOVUPSZ128rm , X86::VMOVUPSrm },
				{ X86::VMOVUPSZ128rr , X86::VMOVUPSrr },
				{ X86::VMOVUPSZ128rr_REV , X86::VMOVUPSrr_REV },
				{ X86::VMULPDZ128rm , X86::VMULPDrm },
				{ X86::VMULPDZ128rr , X86::VMULPDrr },
				{ X86::VMULPSZ128rm , X86::VMULPSrm },
				{ X86::VMULPSZ128rr , X86::VMULPSrr },
				{ X86::VORPDZ128rm , X86::VORPDrm },
				{ X86::VORPDZ128rr , X86::VORPDrr },
				{ X86::VORPSZ128rm , X86::VORPSrm },
				{ X86::VORPSZ128rr , X86::VORPSrr },
				{ X86::VPABSBZ128rm , X86::VPABSBrm },
				{ X86::VPABSBZ128rr , X86::VPABSBrr },
				{ X86::VPABSDZ128rm , X86::VPABSDrm },
				{ X86::VPABSDZ128rr , X86::VPABSDrr },
				{ X86::VPABSWZ128rm , X86::VPABSWrm },
				{ X86::VPABSWZ128rr , X86::VPABSWrr },
				{ X86::VPACKSSDWZ128rm , X86::VPACKSSDWrm },
				{ X86::VPACKSSDWZ128rr , X86::VPACKSSDWrr },
				{ X86::VPACKSSWBZ128rm , X86::VPACKSSWBrm },
				{ X86::VPACKSSWBZ128rr , X86::VPACKSSWBrr },
				{ X86::VPACKUSDWZ128rm , X86::VPACKUSDWrm },
				{ X86::VPACKUSDWZ128rr , X86::VPACKUSDWrr },
				{ X86::VPACKUSWBZ128rm , X86::VPACKUSWBrm },
				{ X86::VPACKUSWBZ128rr , X86::VPACKUSWBrr },
				{ X86::VPADDBZ128rm , X86::VPADDBrm },
				{ X86::VPADDBZ128rr , X86::VPADDBrr },
				{ X86::VPADDDZ128rm , X86::VPADDDrm },
				{ X86::VPADDDZ128rr , X86::VPADDDrr },
				{ X86::VPADDQZ128rm , X86::VPADDQrm },
				{ X86::VPADDQZ128rr , X86::VPADDQrr },
				{ X86::VPADDSBZ128rm , X86::VPADDSBrm },
				{ X86::VPADDSBZ128rr , X86::VPADDSBrr },
				{ X86::VPADDSWZ128rm , X86::VPADDSWrm },
				{ X86::VPADDSWZ128rr , X86::VPADDSWrr },
				{ X86::VPADDUSBZ128rm , X86::VPADDUSBrm },
				{ X86::VPADDUSBZ128rr , X86::VPADDUSBrr },
				{ X86::VPADDUSWZ128rm , X86::VPADDUSWrm },
				{ X86::VPADDUSWZ128rr , X86::VPADDUSWrr },
				{ X86::VPADDWZ128rm , X86::VPADDWrm },
				{ X86::VPADDWZ128rr , X86::VPADDWrr },
				{ X86::VPALIGNRZ128rmi , X86::VPALIGNRrmi },
				{ X86::VPALIGNRZ128rri , X86::VPALIGNRrri },
				{ X86::VPANDDZ128rm , X86::VPANDrm },
				{ X86::VPANDDZ128rr , X86::VPANDrr },
				{ X86::VPANDQZ128rm , X86::VPANDrm },
				{ X86::VPANDQZ128rr , X86::VPANDrr },
				{ X86::VPAVGBZ128rm , X86::VPAVGBrm },
				{ X86::VPAVGBZ128rr , X86::VPAVGBrr },
				{ X86::VPAVGWZ128rm , X86::VPAVGWrm },
				{ X86::VPAVGWZ128rr , X86::VPAVGWrr },
				{ X86::VPBROADCASTBZ128m , X86::VPBROADCASTBrm },
				{ X86::VPBROADCASTBZ128r , X86::VPBROADCASTBrr },
				{ X86::VPBROADCASTDZ128m , X86::VPBROADCASTDrm },
				{ X86::VPBROADCASTDZ128r , X86::VPBROADCASTDrr },
				{ X86::VPBROADCASTQZ128m , X86::VPBROADCASTQrm },
				{ X86::VPBROADCASTQZ128r , X86::VPBROADCASTQrr },
				{ X86::VPBROADCASTWZ128m , X86::VPBROADCASTWrm },
				{ X86::VPBROADCASTWZ128r , X86::VPBROADCASTWrr },
				{ X86::VPERMILPDZ128mi , X86::VPERMILPDmi },
				{ X86::VPERMILPDZ128ri , X86::VPERMILPDri },
				{ X86::VPERMILPDZ128rm , X86::VPERMILPDrm },
				{ X86::VPERMILPDZ128rr , X86::VPERMILPDrr },
				{ X86::VPERMILPSZ128mi , X86::VPERMILPSmi },
				{ X86::VPERMILPSZ128ri , X86::VPERMILPSri },
				{ X86::VPERMILPSZ128rm , X86::VPERMILPSrm },
				{ X86::VPERMILPSZ128rr , X86::VPERMILPSrr },
				{ X86::VPMADDUBSWZ128rm , X86::VPMADDUBSWrm },
				{ X86::VPMADDUBSWZ128rr , X86::VPMADDUBSWrr },
				{ X86::VPMADDWDZ128rm , X86::VPMADDWDrm },
				{ X86::VPMADDWDZ128rr , X86::VPMADDWDrr },
				{ X86::VPMAXSBZ128rm , X86::VPMAXSBrm },
				{ X86::VPMAXSBZ128rr , X86::VPMAXSBrr },
				{ X86::VPMAXSDZ128rm , X86::VPMAXSDrm },
				{ X86::VPMAXSDZ128rr , X86::VPMAXSDrr },
				{ X86::VPMAXSWZ128rm , X86::VPMAXSWrm },
				{ X86::VPMAXSWZ128rr , X86::VPMAXSWrr },
				{ X86::VPMAXUBZ128rm , X86::VPMAXUBrm },
				{ X86::VPMAXUBZ128rr , X86::VPMAXUBrr },
				{ X86::VPMAXUDZ128rm , X86::VPMAXUDrm },
				{ X86::VPMAXUDZ128rr , X86::VPMAXUDrr },
				{ X86::VPMAXUWZ128rm , X86::VPMAXUWrm },
				{ X86::VPMAXUWZ128rr , X86::VPMAXUWrr },
				{ X86::VPMINSBZ128rm , X86::VPMINSBrm },
				{ X86::VPMINSBZ128rr , X86::VPMINSBrr },
				{ X86::VPMINSDZ128rm , X86::VPMINSDrm },
				{ X86::VPMINSDZ128rr , X86::VPMINSDrr },
				{ X86::VPMINSWZ128rm , X86::VPMINSWrm },
				{ X86::VPMINSWZ128rr , X86::VPMINSWrr },
				{ X86::VPMINUBZ128rm , X86::VPMINUBrm },
				{ X86::VPMINUBZ128rr , X86::VPMINUBrr },
				{ X86::VPMINUDZ128rm , X86::VPMINUDrm },
				{ X86::VPMINUDZ128rr , X86::VPMINUDrr },
				{ X86::VPMINUWZ128rm , X86::VPMINUWrm },
				{ X86::VPMINUWZ128rr , X86::VPMINUWrr },
				{ X86::VPMOVSXBDZ128rm , X86::VPMOVSXBDrm },
				{ X86::VPMOVSXBDZ128rr , X86::VPMOVSXBDrr },
				{ X86::VPMOVSXBQZ128rm , X86::VPMOVSXBQrm },
				{ X86::VPMOVSXBQZ128rr , X86::VPMOVSXBQrr },
				{ X86::VPMOVSXBWZ128rm , X86::VPMOVSXBWrm },
				{ X86::VPMOVSXBWZ128rr , X86::VPMOVSXBWrr },
				{ X86::VPMOVSXDQZ128rm , X86::VPMOVSXDQrm },
				{ X86::VPMOVSXDQZ128rr , X86::VPMOVSXDQrr },
				{ X86::VPMOVSXWDZ128rm , X86::VPMOVSXWDrm },
				{ X86::VPMOVSXWDZ128rr , X86::VPMOVSXWDrr },
				{ X86::VPMOVSXWQZ128rm , X86::VPMOVSXWQrm },
				{ X86::VPMOVSXWQZ128rr , X86::VPMOVSXWQrr },
				{ X86::VPMOVZXBDZ128rm , X86::VPMOVZXBDrm },
				{ X86::VPMOVZXBDZ128rr , X86::VPMOVZXBDrr },
				{ X86::VPMOVZXBQZ128rm , X86::VPMOVZXBQrm },
				{ X86::VPMOVZXBQZ128rr , X86::VPMOVZXBQrr },
				{ X86::VPMOVZXBWZ128rm , X86::VPMOVZXBWrm },
				{ X86::VPMOVZXBWZ128rr , X86::VPMOVZXBWrr },
				{ X86::VPMOVZXDQZ128rm , X86::VPMOVZXDQrm },
				{ X86::VPMOVZXDQZ128rr , X86::VPMOVZXDQrr },
				{ X86::VPMOVZXWDZ128rm , X86::VPMOVZXWDrm },
				{ X86::VPMOVZXWDZ128rr , X86::VPMOVZXWDrr },
				{ X86::VPMOVZXWQZ128rm , X86::VPMOVZXWQrm },
				{ X86::VPMOVZXWQZ128rr , X86::VPMOVZXWQrr },
				{ X86::VPMULDQZ128rm , X86::VPMULDQrm },
				{ X86::VPMULDQZ128rr , X86::VPMULDQrr },
				{ X86::VPMULHRSWZ128rm , X86::VPMULHRSWrm },
				{ X86::VPMULHRSWZ128rr , X86::VPMULHRSWrr },
				{ X86::VPMULHUWZ128rm , X86::VPMULHUWrm },
				{ X86::VPMULHUWZ128rr , X86::VPMULHUWrr },
				{ X86::VPMULHWZ128rm , X86::VPMULHWrm },
				{ X86::VPMULHWZ128rr , X86::VPMULHWrr },
				{ X86::VPMULLDZ128rm , X86::VPMULLDrm },
				{ X86::VPMULLDZ128rr , X86::VPMULLDrr },
				{ X86::VPMULLWZ128rm , X86::VPMULLWrm },
				{ X86::VPMULLWZ128rr , X86::VPMULLWrr },
				{ X86::VPMULUDQZ128rm , X86::VPMULUDQrm },
				{ X86::VPMULUDQZ128rr , X86::VPMULUDQrr },
				{ X86::VPORDZ128rm , X86::VPORrm },
				{ X86::VPORDZ128rr , X86::VPORrr },
				{ X86::VPORQZ128rm , X86::VPORrm },
				{ X86::VPORQZ128rr , X86::VPORrr },
				{ X86::VPSADBWZ128rm , X86::VPSADBWrm },
				{ X86::VPSADBWZ128rr , X86::VPSADBWrr },
				{ X86::VPSHUFBZ128rm , X86::VPSHUFBrm },
				{ X86::VPSHUFBZ128rr , X86::VPSHUFBrr },
				{ X86::VPSHUFDZ128mi , X86::VPSHUFDmi },
				{ X86::VPSHUFDZ128ri , X86::VPSHUFDri },
				{ X86::VPSHUFHWZ128mi , X86::VPSHUFHWmi },
				{ X86::VPSHUFHWZ128ri , X86::VPSHUFHWri },
				{ X86::VPSHUFLWZ128mi , X86::VPSHUFLWmi },
				{ X86::VPSHUFLWZ128ri , X86::VPSHUFLWri },
				{ X86::VPSLLDQZ128rr , X86::VPSLLDQri },
				{ X86::VPSLLDZ128ri , X86::VPSLLDri },
				{ X86::VPSLLDZ128rm , X86::VPSLLDrm },
				{ X86::VPSLLDZ128rr , X86::VPSLLDrr },
				{ X86::VPSLLQZ128ri , X86::VPSLLQri },
				{ X86::VPSLLQZ128rm , X86::VPSLLQrm },
				{ X86::VPSLLQZ128rr , X86::VPSLLQrr },
				{ X86::VPSLLVDZ128rm , X86::VPSLLVDrm },
				{ X86::VPSLLVDZ128rr , X86::VPSLLVDrr },
				{ X86::VPSLLVQZ128rm , X86::VPSLLVQrm },
				{ X86::VPSLLVQZ128rr , X86::VPSLLVQrr },
				{ X86::VPSLLWZ128ri , X86::VPSLLWri },
				{ X86::VPSLLWZ128rm , X86::VPSLLWrm },
				{ X86::VPSLLWZ128rr , X86::VPSLLWrr },
				{ X86::VPSRADZ128ri , X86::VPSRADri },
				{ X86::VPSRADZ128rm , X86::VPSRADrm },
				{ X86::VPSRADZ128rr , X86::VPSRADrr },
				{ X86::VPSRAVDZ128rm , X86::VPSRAVDrm },
				{ X86::VPSRAVDZ128rr , X86::VPSRAVDrr },
				{ X86::VPSRAWZ128ri , X86::VPSRAWri },
				{ X86::VPSRAWZ128rm , X86::VPSRAWrm },
				{ X86::VPSRAWZ128rr , X86::VPSRAWrr },
				{ X86::VPSRLDQZ128rr , X86::VPSRLDQri },
				{ X86::VPSRLDZ128ri , X86::VPSRLDri },
				{ X86::VPSRLDZ128rm , X86::VPSRLDrm },
				{ X86::VPSRLDZ128rr , X86::VPSRLDrr },
				{ X86::VPSRLQZ128ri , X86::VPSRLQri },
				{ X86::VPSRLQZ128rm , X86::VPSRLQrm },
				{ X86::VPSRLQZ128rr , X86::VPSRLQrr },
				{ X86::VPSRLVDZ128rm , X86::VPSRLVDrm },
				{ X86::VPSRLVDZ128rr , X86::VPSRLVDrr },
				{ X86::VPSRLVQZ128rm , X86::VPSRLVQrm },
				{ X86::VPSRLVQZ128rr , X86::VPSRLVQrr },
				{ X86::VPSRLWZ128ri , X86::VPSRLWri },
				{ X86::VPSRLWZ128rm , X86::VPSRLWrm },
				{ X86::VPSRLWZ128rr , X86::VPSRLWrr },
				{ X86::VPSUBBZ128rm , X86::VPSUBBrm },
				{ X86::VPSUBBZ128rr , X86::VPSUBBrr },
				{ X86::VPSUBDZ128rm , X86::VPSUBDrm },
				{ X86::VPSUBDZ128rr , X86::VPSUBDrr },
				{ X86::VPSUBQZ128rm , X86::VPSUBQrm },
				{ X86::VPSUBQZ128rr , X86::VPSUBQrr },
				{ X86::VPSUBSBZ128rm , X86::VPSUBSBrm },
				{ X86::VPSUBSBZ128rr , X86::VPSUBSBrr },
				{ X86::VPSUBSWZ128rm , X86::VPSUBSWrm },
				{ X86::VPSUBSWZ128rr , X86::VPSUBSWrr },
				{ X86::VPSUBUSBZ128rm , X86::VPSUBUSBrm },
				{ X86::VPSUBUSBZ128rr , X86::VPSUBUSBrr },
				{ X86::VPSUBUSWZ128rm , X86::VPSUBUSWrm },
				{ X86::VPSUBUSWZ128rr , X86::VPSUBUSWrr },
				{ X86::VPSUBWZ128rm , X86::VPSUBWrm },
				{ X86::VPSUBWZ128rr , X86::VPSUBWrr },
				{ X86::VPUNPCKHBWZ128rm , X86::VPUNPCKHBWrm },
				{ X86::VPUNPCKHBWZ128rr , X86::VPUNPCKHBWrr },
				{ X86::VPUNPCKHDQZ128rm , X86::VPUNPCKHDQrm },
				{ X86::VPUNPCKHDQZ128rr , X86::VPUNPCKHDQrr },
				{ X86::VPUNPCKHQDQZ128rm , X86::VPUNPCKHQDQrm },
				{ X86::VPUNPCKHQDQZ128rr , X86::VPUNPCKHQDQrr },
				{ X86::VPUNPCKHWDZ128rm , X86::VPUNPCKHWDrm },
				{ X86::VPUNPCKHWDZ128rr , X86::VPUNPCKHWDrr },
				{ X86::VPUNPCKLBWZ128rm , X86::VPUNPCKLBWrm },
				{ X86::VPUNPCKLBWZ128rr , X86::VPUNPCKLBWrr },
				{ X86::VPUNPCKLDQZ128rm , X86::VPUNPCKLDQrm },
				{ X86::VPUNPCKLDQZ128rr , X86::VPUNPCKLDQrr },
				{ X86::VPUNPCKLQDQZ128rm , X86::VPUNPCKLQDQrm },
				{ X86::VPUNPCKLQDQZ128rr , X86::VPUNPCKLQDQrr },
				{ X86::VPUNPCKLWDZ128rm , X86::VPUNPCKLWDrm },
				{ X86::VPUNPCKLWDZ128rr , X86::VPUNPCKLWDrr },
				{ X86::VPXORDZ128rm , X86::VPXORrm },
				{ X86::VPXORDZ128rr , X86::VPXORrr },
				{ X86::VPXORQZ128rm , X86::VPXORrm },
				{ X86::VPXORQZ128rr , X86::VPXORrr },
				{ X86::VSHUFPDZ128rmi , X86::VSHUFPDrmi },
				{ X86::VSHUFPDZ128rri , X86::VSHUFPDrri },
				{ X86::VSHUFPSZ128rmi , X86::VSHUFPSrmi },
				{ X86::VSHUFPSZ128rri , X86::VSHUFPSrri },
				{ X86::VSQRTPDZ128m , X86::VSQRTPDm },
				{ X86::VSQRTPDZ128r , X86::VSQRTPDr },
				{ X86::VSQRTPSZ128m , X86::VSQRTPSm },
				{ X86::VSQRTPSZ128r , X86::VSQRTPSr },
				{ X86::VSUBPDZ128rm , X86::VSUBPDrm },
				{ X86::VSUBPDZ128rr , X86::VSUBPDrr },
				{ X86::VSUBPSZ128rm , X86::VSUBPSrm },
				{ X86::VSUBPSZ128rr , X86::VSUBPSrr },
				{ X86::VUNPCKHPDZ128rm , X86::VUNPCKHPDrm },
				{ X86::VUNPCKHPDZ128rr , X86::VUNPCKHPDrr },
				{ X86::VUNPCKHPSZ128rm , X86::VUNPCKHPSrm },
				{ X86::VUNPCKHPSZ128rr , X86::VUNPCKHPSrr },
				{ X86::VUNPCKLPDZ128rm , X86::VUNPCKLPDrm },
				{ X86::VUNPCKLPDZ128rr , X86::VUNPCKLPDrr },
				{ X86::VUNPCKLPSZ128rm , X86::VUNPCKLPSrm },
				{ X86::VUNPCKLPSZ128rr , X86::VUNPCKLPSrr },
				{ X86::VXORPDZ128rm , X86::VXORPDrm },
				{ X86::VXORPDZ128rr , X86::VXORPDrr },
				{ X86::VXORPSZ128rm , X86::VXORPSrm },
				{ X86::VXORPSZ128rr , X86::VXORPSrr },
				};


				// X86 EVEX encoded instructions that have a VEX 256 encoding
				// (table format: <EVEX opcode, VEX-256 opcode>).
				static const X86EvexToVexCompressTableEntry
				X86EvexToVex256CompressTable[] = {
				{ X86::VADDPDZ256rm , X86::VADDPDYrm },
				{ X86::VADDPDZ256rr , X86::VADDPDYrr },
				{ X86::VADDPSZ256rm , X86::VADDPSYrm },
				{ X86::VADDPSZ256rr , X86::VADDPSYrr },
				{ X86::VANDNPDZ256rm , X86::VANDNPDYrm },
				{ X86::VANDNPDZ256rr , X86::VANDNPDYrr },
				{ X86::VANDNPSZ256rm , X86::VANDNPSYrm },
				{ X86::VANDNPSZ256rr , X86::VANDNPSYrr },
				{ X86::VANDPDZ256rm , X86::VANDPDYrm },
				{ X86::VANDPDZ256rr , X86::VANDPDYrr },
				{ X86::VANDPSZ256rm , X86::VANDPSYrm },
				{ X86::VANDPSZ256rr , X86::VANDPSYrr },
				{ X86::VBROADCASTSDZ256m , X86::VBROADCASTSDYrm },
				{ X86::VBROADCASTSDZ256r , X86::VBROADCASTSDYrr },
				{ X86::VBROADCASTSDZ256r_s , X86::VBROADCASTSDYrr },
				{ X86::VBROADCASTSSZ256m , X86::VBROADCASTSSYrm },
				{ X86::VBROADCASTSSZ256r , X86::VBROADCASTSSYrr },
				{ X86::VBROADCASTSSZ256r_s , X86::VBROADCASTSSYrr },
				{ X86::VCVTDQ2PDZ256rm , X86::VCVTDQ2PDYrm },
				{ X86::VCVTDQ2PDZ256rr , X86::VCVTDQ2PDYrr },
				{ X86::VCVTDQ2PSZ256rm , X86::VCVTDQ2PSYrm },
				{ X86::VCVTDQ2PSZ256rr , X86::VCVTDQ2PSYrr },
				{ X86::VCVTPD2DQZ256rm , X86::VCVTPD2DQYrm },
				{ X86::VCVTPD2DQZ256rr , X86::VCVTPD2DQYrr },
				{ X86::VCVTPD2PSZ256rm , X86::VCVTPD2PSYrm },
				{ X86::VCVTPD2PSZ256rr , X86::VCVTPD2PSYrr },
				{ X86::VCVTPH2PSZ256rm , X86::VCVTPH2PSYrm },
				{ X86::VCVTPH2PSZ256rr , X86::VCVTPH2PSYrr },
				{ X86::VCVTPS2DQZ256rm , X86::VCVTPS2DQYrm },
				{ X86::VCVTPS2DQZ256rr , X86::VCVTPS2DQYrr },
				{ X86::VCVTPS2PDZ256rm , X86::VCVTPS2PDYrm },
				{ X86::VCVTPS2PDZ256rr , X86::VCVTPS2PDYrr },
				{ X86::VCVTPS2PHZ256mr , X86::VCVTPS2PHYmr },
				{ X86::VCVTPS2PHZ256rr , X86::VCVTPS2PHYrr },
				{ X86::VCVTTPD2DQZ256rm , X86::VCVTTPD2DQYrm },
				{ X86::VCVTTPD2DQZ256rr , X86::VCVTTPD2DQYrr },
				{ X86::VCVTTPS2DQZ256rm , X86::VCVTTPS2DQYrm },
				{ X86::VCVTTPS2DQZ256rr , X86::VCVTTPS2DQYrr },
				{ X86::VDIVPDZ256rm , X86::VDIVPDYrm },
				{ X86::VDIVPDZ256rr , X86::VDIVPDYrr },
				{ X86::VDIVPSZ256rm , X86::VDIVPSYrm },
				{ X86::VDIVPSZ256rr , X86::VDIVPSYrr },
				{ X86::VFMADD132PDZ256m , X86::VFMADD132PDYm },
				{ X86::VFMADD132PDZ256r , X86::VFMADD132PDYr },
				{ X86::VFMADD132PSZ256m , X86::VFMADD132PSYm },
				{ X86::VFMADD132PSZ256r , X86::VFMADD132PSYr },
				{ X86::VFMADD213PDZ256m , X86::VFMADD213PDYm },
				{ X86::VFMADD213PDZ256r , X86::VFMADD213PDYr },
				{ X86::VFMADD213PSZ256m , X86::VFMADD213PSYm },
				{ X86::VFMADD213PSZ256r , X86::VFMADD213PSYr },
				{ X86::VFMADD231PDZ256m , X86::VFMADD231PDYm },
				{ X86::VFMADD231PDZ256r , X86::VFMADD231PDYr },
				{ X86::VFMADD231PSZ256m , X86::VFMADD231PSYm },
				{ X86::VFMADD231PSZ256r , X86::VFMADD231PSYr },
				{ X86::VFMADDSUB132PDZ256m , X86::VFMADDSUB132PDYm },
				{ X86::VFMADDSUB132PDZ256r , X86::VFMADDSUB132PDYr },
				{ X86::VFMADDSUB132PSZ256m , X86::VFMADDSUB132PSYm },
				{ X86::VFMADDSUB132PSZ256r , X86::VFMADDSUB132PSYr },
				{ X86::VFMADDSUB213PDZ256m , X86::VFMADDSUB213PDYm },
				{ X86::VFMADDSUB213PDZ256r , X86::VFMADDSUB213PDYr },
				{ X86::VFMADDSUB213PSZ256m , X86::VFMADDSUB213PSYm },
				{ X86::VFMADDSUB213PSZ256r , X86::VFMADDSUB213PSYr },
				{ X86::VFMADDSUB231PDZ256m , X86::VFMADDSUB231PDYm },
				{ X86::VFMADDSUB231PDZ256r , X86::VFMADDSUB231PDYr },
				{ X86::VFMADDSUB231PSZ256m , X86::VFMADDSUB231PSYm },
				{ X86::VFMADDSUB231PSZ256r , X86::VFMADDSUB231PSYr },
				{ X86::VFMSUB132PDZ256m , X86::VFMSUB132PDYm },
				{ X86::VFMSUB132PDZ256r , X86::VFMSUB132PDYr },
				{ X86::VFMSUB132PSZ256m , X86::VFMSUB132PSYm },
				{ X86::VFMSUB132PSZ256r , X86::VFMSUB132PSYr },
				{ X86::VFMSUB213PDZ256m , X86::VFMSUB213PDYm },
				{ X86::VFMSUB213PDZ256r , X86::VFMSUB213PDYr },
				{ X86::VFMSUB213PSZ256m , X86::VFMSUB213PSYm },
				{ X86::VFMSUB213PSZ256r , X86::VFMSUB213PSYr },
				{ X86::VFMSUB231PDZ256m , X86::VFMSUB231PDYm },
				{ X86::VFMSUB231PDZ256r , X86::VFMSUB231PDYr },
				{ X86::VFMSUB231PSZ256m , X86::VFMSUB231PSYm },
				{ X86::VFMSUB231PSZ256r , X86::VFMSUB231PSYr },
				{ X86::VFMSUBADD132PDZ256m , X86::VFMSUBADD132PDYm },
				{ X86::VFMSUBADD132PDZ256r , X86::VFMSUBADD132PDYr },
				{ X86::VFMSUBADD132PSZ256m , X86::VFMSUBADD132PSYm },
				{ X86::VFMSUBADD132PSZ256r , X86::VFMSUBADD132PSYr },
				{ X86::VFMSUBADD213PDZ256m , X86::VFMSUBADD213PDYm },
				{ X86::VFMSUBADD213PDZ256r , X86::VFMSUBADD213PDYr },
				{ X86::VFMSUBADD213PSZ256m , X86::VFMSUBADD213PSYm },
				{ X86::VFMSUBADD213PSZ256r , X86::VFMSUBADD213PSYr },
				{ X86::VFMSUBADD231PDZ256m , X86::VFMSUBADD231PDYm },
				{ X86::VFMSUBADD231PDZ256r , X86::VFMSUBADD231PDYr },
				{ X86::VFMSUBADD231PSZ256m , X86::VFMSUBADD231PSYm },
				{ X86::VFMSUBADD231PSZ256r , X86::VFMSUBADD231PSYr },
				{ X86::VFNMADD132PDZ256m , X86::VFNMADD132PDYm },
				{ X86::VFNMADD132PDZ256r , X86::VFNMADD132PDYr },
				{ X86::VFNMADD132PSZ256m , X86::VFNMADD132PSYm },
				{ X86::VFNMADD132PSZ256r , X86::VFNMADD132PSYr },
				{ X86::VFNMADD213PDZ256m , X86::VFNMADD213PDYm },
				{ X86::VFNMADD213PDZ256r , X86::VFNMADD213PDYr },
				{ X86::VFNMADD213PSZ256m , X86::VFNMADD213PSYm },
				{ X86::VFNMADD213PSZ256r , X86::VFNMADD213PSYr },
				{ X86::VFNMADD231PDZ256m , X86::VFNMADD231PDYm },
				{ X86::VFNMADD231PDZ256r , X86::VFNMADD231PDYr },
				{ X86::VFNMADD231PSZ256m , X86::VFNMADD231PSYm },
				{ X86::VFNMADD231PSZ256r , X86::VFNMADD231PSYr },
				{ X86::VFNMSUB132PDZ256m , X86::VFNMSUB132PDYm },
				{ X86::VFNMSUB132PDZ256r , X86::VFNMSUB132PDYr },
				{ X86::VFNMSUB132PSZ256m , X86::VFNMSUB132PSYm },
				{ X86::VFNMSUB132PSZ256r , X86::VFNMSUB132PSYr },
				{ X86::VFNMSUB213PDZ256m , X86::VFNMSUB213PDYm },
				{ X86::VFNMSUB213PDZ256r , X86::VFNMSUB213PDYr },
				{ X86::VFNMSUB213PSZ256m , X86::VFNMSUB213PSYm },
				{ X86::VFNMSUB213PSZ256r , X86::VFNMSUB213PSYr },
				{ X86::VFNMSUB231PDZ256m , X86::VFNMSUB231PDYm },
				{ X86::VFNMSUB231PDZ256r , X86::VFNMSUB231PDYr },
				{ X86::VFNMSUB231PSZ256m , X86::VFNMSUB231PSYm },
				{ X86::VFNMSUB231PSZ256r , X86::VFNMSUB231PSYr },
				{ X86::VMAXCPDZ256rm , X86::VMAXCPDYrm },
				{ X86::VMAXCPDZ256rr , X86::VMAXCPDYrr },
				{ X86::VMAXCPSZ256rm , X86::VMAXCPSYrm },
				{ X86::VMAXCPSZ256rr , X86::VMAXCPSYrr },
				{ X86::VMAXPDZ256rm , X86::VMAXPDYrm },
				{ X86::VMAXPDZ256rr , X86::VMAXPDYrr },
				{ X86::VMAXPSZ256rm , X86::VMAXPSYrm },
				{ X86::VMAXPSZ256rr , X86::VMAXPSYrr },
				{ X86::VMINCPDZ256rm , X86::VMINCPDYrm },
				{ X86::VMINCPDZ256rr , X86::VMINCPDYrr },
				{ X86::VMINCPSZ256rm , X86::VMINCPSYrm },
				{ X86::VMINCPSZ256rr , X86::VMINCPSYrr },
				{ X86::VMINPDZ256rm , X86::VMINPDYrm },
				{ X86::VMINPDZ256rr , X86::VMINPDYrr },
				{ X86::VMINPSZ256rm , X86::VMINPSYrm },
				{ X86::VMINPSZ256rr , X86::VMINPSYrr },
				{ X86::VMOVAPDZ256mr , X86::VMOVAPDYmr },
				{ X86::VMOVAPDZ256rm , X86::VMOVAPDYrm },
				{ X86::VMOVAPDZ256rr , X86::VMOVAPDYrr },
				{ X86::VMOVAPDZ256rr_REV , X86::VMOVAPDYrr_REV },
				{ X86::VMOVAPSZ256mr , X86::VMOVAPSYmr },
				{ X86::VMOVAPSZ256rm , X86::VMOVAPSYrm },
				{ X86::VMOVAPSZ256rr , X86::VMOVAPSYrr },
				{ X86::VMOVAPSZ256rr_REV , X86::VMOVAPSYrr_REV },
				{ X86::VMOVDDUPZ256rm , X86::VMOVDDUPYrm },
				{ X86::VMOVDDUPZ256rr , X86::VMOVDDUPYrr },
				{ X86::VMOVDQA32Z256mr , X86::VMOVDQAYmr },
				{ X86::VMOVDQA32Z256rm , X86::VMOVDQAYrm },
				{ X86::VMOVDQA32Z256rr , X86::VMOVDQAYrr },
				{ X86::VMOVDQA32Z256rr_REV , X86::VMOVDQAYrr_REV },
				{ X86::VMOVDQA64Z256mr , X86::VMOVDQAYmr },
				{ X86::VMOVDQA64Z256rm , X86::VMOVDQAYrm },
				{ X86::VMOVDQA64Z256rr , X86::VMOVDQAYrr },
				{ X86::VMOVDQA64Z256rr_REV , X86::VMOVDQAYrr_REV },
				{ X86::VMOVDQU16Z256mr , X86::VMOVDQUYmr },
				{ X86::VMOVDQU16Z256rm , X86::VMOVDQUYrm },
				{ X86::VMOVDQU16Z256rr , X86::VMOVDQUYrr },
				{ X86::VMOVDQU16Z256rr_REV , X86::VMOVDQUYrr_REV },
				{ X86::VMOVDQU32Z256mr , X86::VMOVDQUYmr },
				{ X86::VMOVDQU32Z256rm , X86::VMOVDQUYrm },
				{ X86::VMOVDQU32Z256rr , X86::VMOVDQUYrr },
				{ X86::VMOVDQU32Z256rr_REV , X86::VMOVDQUYrr_REV },
				{ X86::VMOVDQU64Z256mr , X86::VMOVDQUYmr },
				{ X86::VMOVDQU64Z256rm , X86::VMOVDQUYrm },
				{ X86::VMOVDQU64Z256rr , X86::VMOVDQUYrr },
				{ X86::VMOVDQU64Z256rr_REV , X86::VMOVDQUYrr_REV },
				{ X86::VMOVDQU8Z256mr , X86::VMOVDQUYmr },
				{ X86::VMOVDQU8Z256rm , X86::VMOVDQUYrm },
				{ X86::VMOVDQU8Z256rr , X86::VMOVDQUYrr },
				{ X86::VMOVDQU8Z256rr_REV , X86::VMOVDQUYrr_REV },
				{ X86::VMOVNTDQAZ256rm , X86::VMOVNTDQAYrm },
				{ X86::VMOVNTDQZ256mr , X86::VMOVNTDQYmr },
				{ X86::VMOVNTPDZ256mr , X86::VMOVNTPDYmr },
				{ X86::VMOVNTPSZ256mr , X86::VMOVNTPSYmr },
				{ X86::VMOVSHDUPZ256rm , X86::VMOVSHDUPYrm },
				{ X86::VMOVSHDUPZ256rr , X86::VMOVSHDUPYrr },
				{ X86::VMOVSLDUPZ256rm , X86::VMOVSLDUPYrm },
				{ X86::VMOVSLDUPZ256rr , X86::VMOVSLDUPYrr },
				{ X86::VMOVUPDZ256mr , X86::VMOVUPDYmr },
				{ X86::VMOVUPDZ256rm , X86::VMOVUPDYrm },
				{ X86::VMOVUPDZ256rr , X86::VMOVUPDYrr },
				{ X86::VMOVUPDZ256rr_REV , X86::VMOVUPDYrr_REV },
				{ X86::VMOVUPSZ256mr , X86::VMOVUPSYmr },
				{ X86::VMOVUPSZ256rm , X86::VMOVUPSYrm },
				{ X86::VMOVUPSZ256rr , X86::VMOVUPSYrr },
				{ X86::VMOVUPSZ256rr_REV , X86::VMOVUPSYrr_REV },
				{ X86::VMULPDZ256rm , X86::VMULPDYrm },
				{ X86::VMULPDZ256rr , X86::VMULPDYrr },
				{ X86::VMULPSZ256rm , X86::VMULPSYrm },
				{ X86::VMULPSZ256rr , X86::VMULPSYrr },
				{ X86::VORPDZ256rm , X86::VORPDYrm },
				{ X86::VORPDZ256rr , X86::VORPDYrr },
				{ X86::VORPSZ256rm , X86::VORPSYrm },
				{ X86::VORPSZ256rr , X86::VORPSYrr },
				{ X86::VPABSBZ256rm , X86::VPABSBYrm },
				{ X86::VPABSBZ256rr , X86::VPABSBYrr },
				{ X86::VPABSDZ256rm , X86::VPABSDYrm },
				{ X86::VPABSDZ256rr , X86::VPABSDYrr },
				{ X86::VPABSWZ256rm , X86::VPABSWYrm },
				{ X86::VPABSWZ256rr , X86::VPABSWYrr },
				{ X86::VPACKSSDWZ256rm , X86::VPACKSSDWYrm },
				{ X86::VPACKSSDWZ256rr , X86::VPACKSSDWYrr },
				{ X86::VPACKSSWBZ256rm , X86::VPACKSSWBYrm },
				{ X86::VPACKSSWBZ256rr , X86::VPACKSSWBYrr },
				{ X86::VPACKUSDWZ256rm , X86::VPACKUSDWYrm },
				{ X86::VPACKUSDWZ256rr , X86::VPACKUSDWYrr },
				{ X86::VPACKUSWBZ256rm , X86::VPACKUSWBYrm },
				{ X86::VPACKUSWBZ256rr , X86::VPACKUSWBYrr },
				{ X86::VPADDBZ256rm , X86::VPADDBYrm },
				{ X86::VPADDBZ256rr , X86::VPADDBYrr },
				{ X86::VPADDDZ256rm , X86::VPADDDYrm },
				{ X86::VPADDDZ256rr , X86::VPADDDYrr },
				{ X86::VPADDQZ256rm , X86::VPADDQYrm },
				{ X86::VPADDQZ256rr , X86::VPADDQYrr },
				{ X86::VPADDSBZ256rm , X86::VPADDSBYrm },
				{ X86::VPADDSBZ256rr , X86::VPADDSBYrr },
				{ X86::VPADDSWZ256rm , X86::VPADDSWYrm },
				{ X86::VPADDSWZ256rr , X86::VPADDSWYrr },
				{ X86::VPADDUSBZ256rm , X86::VPADDUSBYrm },
				{ X86::VPADDUSBZ256rr , X86::VPADDUSBYrr },
				{ X86::VPADDUSWZ256rm , X86::VPADDUSWYrm },
				{ X86::VPADDUSWZ256rr , X86::VPADDUSWYrr },
				{ X86::VPADDWZ256rm , X86::VPADDWYrm },
				{ X86::VPADDWZ256rr , X86::VPADDWYrr },
				{ X86::VPALIGNRZ256rmi , X86::VPALIGNRYrmi },
				{ X86::VPALIGNRZ256rri , X86::VPALIGNRYrri },
				{ X86::VPANDDZ256rm , X86::VPANDYrm },
				{ X86::VPANDDZ256rr , X86::VPANDYrr },
				{ X86::VPANDQZ256rm , X86::VPANDYrm },
				{ X86::VPANDQZ256rr , X86::VPANDYrr },
				{ X86::VPAVGBZ256rm , X86::VPAVGBYrm },
				{ X86::VPAVGBZ256rr , X86::VPAVGBYrr },
				{ X86::VPAVGWZ256rm , X86::VPAVGWYrm },
				{ X86::VPAVGWZ256rr , X86::VPAVGWYrr },
				{ X86::VPBROADCASTBZ256m , X86::VPBROADCASTBYrm },
				{ X86::VPBROADCASTBZ256r , X86::VPBROADCASTBYrr },
				{ X86::VPBROADCASTDZ256m , X86::VPBROADCASTDYrm },
				{ X86::VPBROADCASTDZ256r , X86::VPBROADCASTDYrr },
				{ X86::VPBROADCASTQZ256m , X86::VPBROADCASTQYrm },
				{ X86::VPBROADCASTQZ256r , X86::VPBROADCASTQYrr },
				{ X86::VPBROADCASTWZ256m , X86::VPBROADCASTWYrm },
				{ X86::VPBROADCASTWZ256r , X86::VPBROADCASTWYrr },
				{ X86::VPERMDZ256rm , X86::VPERMDYrm },
				{ X86::VPERMDZ256rr , X86::VPERMDYrr },
				{ X86::VPERMILPDZ256mi , X86::VPERMILPDYmi },
				{ X86::VPERMILPDZ256ri , X86::VPERMILPDYri },
				{ X86::VPERMILPDZ256rm , X86::VPERMILPDYrm },
				{ X86::VPERMILPDZ256rr , X86::VPERMILPDYrr },
				{ X86::VPERMILPSZ256mi , X86::VPERMILPSYmi },
				{ X86::VPERMILPSZ256ri , X86::VPERMILPSYri },
				{ X86::VPERMILPSZ256rm , X86::VPERMILPSYrm },
				{ X86::VPERMILPSZ256rr , X86::VPERMILPSYrr },
				{ X86::VPERMPDZ256mi , X86::VPERMPDYmi },
				{ X86::VPERMPDZ256ri , X86::VPERMPDYri },
				{ X86::VPERMPSZ256rm , X86::VPERMPSYrm },
				{ X86::VPERMPSZ256rr , X86::VPERMPSYrr },
				{ X86::VPERMQZ256mi , X86::VPERMQYmi },
				{ X86::VPERMQZ256ri , X86::VPERMQYri },
				{ X86::VPMADDUBSWZ256rm , X86::VPMADDUBSWYrm },
				{ X86::VPMADDUBSWZ256rr , X86::VPMADDUBSWYrr },
				{ X86::VPMADDWDZ256rm , X86::VPMADDWDYrm },
				{ X86::VPMADDWDZ256rr , X86::VPMADDWDYrr },
				{ X86::VPMAXSBZ256rm , X86::VPMAXSBYrm },
				{ X86::VPMAXSBZ256rr , X86::VPMAXSBYrr },
				{ X86::VPMAXSDZ256rm , X86::VPMAXSDYrm },
				{ X86::VPMAXSDZ256rr , X86::VPMAXSDYrr },
				{ X86::VPMAXSWZ256rm , X86::VPMAXSWYrm },
				{ X86::VPMAXSWZ256rr , X86::VPMAXSWYrr },
				{ X86::VPMAXUBZ256rm , X86::VPMAXUBYrm },
				{ X86::VPMAXUBZ256rr , X86::VPMAXUBYrr },
				{ X86::VPMAXUDZ256rm , X86::VPMAXUDYrm },
				{ X86::VPMAXUDZ256rr , X86::VPMAXUDYrr },
				{ X86::VPMAXUWZ256rm , X86::VPMAXUWYrm },
				{ X86::VPMAXUWZ256rr , X86::VPMAXUWYrr },
				{ X86::VPMINSBZ256rm , X86::VPMINSBYrm },
				{ X86::VPMINSBZ256rr , X86::VPMINSBYrr },
				{ X86::VPMINSDZ256rm , X86::VPMINSDYrm },
				{ X86::VPMINSDZ256rr , X86::VPMINSDYrr },
				{ X86::VPMINSWZ256rm , X86::VPMINSWYrm },
				{ X86::VPMINSWZ256rr , X86::VPMINSWYrr },
				{ X86::VPMINUBZ256rm , X86::VPMINUBYrm },
				{ X86::VPMINUBZ256rr , X86::VPMINUBYrr },
				{ X86::VPMINUDZ256rm , X86::VPMINUDYrm },
				{ X86::VPMINUDZ256rr , X86::VPMINUDYrr },
				{ X86::VPMINUWZ256rm , X86::VPMINUWYrm },
				{ X86::VPMINUWZ256rr , X86::VPMINUWYrr },
				{ X86::VPMOVSXBDZ256rm , X86::VPMOVSXBDYrm },
				{ X86::VPMOVSXBDZ256rr , X86::VPMOVSXBDYrr },
				{ X86::VPMOVSXBQZ256rm , X86::VPMOVSXBQYrm },
				{ X86::VPMOVSXBQZ256rr , X86::VPMOVSXBQYrr },
				{ X86::VPMOVSXBWZ256rm , X86::VPMOVSXBWYrm },
				{ X86::VPMOVSXBWZ256rr , X86::VPMOVSXBWYrr },
				{ X86::VPMOVSXDQZ256rm , X86::VPMOVSXDQYrm },
				{ X86::VPMOVSXDQZ256rr , X86::VPMOVSXDQYrr },
				{ X86::VPMOVSXWDZ256rm , X86::VPMOVSXWDYrm },
				{ X86::VPMOVSXWDZ256rr , X86::VPMOVSXWDYrr },
				{ X86::VPMOVSXWQZ256rm , X86::VPMOVSXWQYrm },
				{ X86::VPMOVSXWQZ256rr , X86::VPMOVSXWQYrr },
				{ X86::VPMOVZXBDZ256rm , X86::VPMOVZXBDYrm },
				{ X86::VPMOVZXBDZ256rr , X86::VPMOVZXBDYrr },
				{ X86::VPMOVZXBQZ256rm , X86::VPMOVZXBQYrm },
				{ X86::VPMOVZXBQZ256rr , X86::VPMOVZXBQYrr },
				{ X86::VPMOVZXBWZ256rm , X86::VPMOVZXBWYrm },
				{ X86::VPMOVZXBWZ256rr , X86::VPMOVZXBWYrr },
				{ X86::VPMOVZXDQZ256rm , X86::VPMOVZXDQYrm },
				{ X86::VPMOVZXDQZ256rr , X86::VPMOVZXDQYrr },
				{ X86::VPMOVZXWDZ256rm , X86::VPMOVZXWDYrm },
				{ X86::VPMOVZXWDZ256rr , X86::VPMOVZXWDYrr },
				{ X86::VPMOVZXWQZ256rm , X86::VPMOVZXWQYrm },
				{ X86::VPMOVZXWQZ256rr , X86::VPMOVZXWQYrr },
				{ X86::VPMULDQZ256rm , X86::VPMULDQYrm },
				{ X86::VPMULDQZ256rr , X86::VPMULDQYrr },
				{ X86::VPMULHRSWZ256rm , X86::VPMULHRSWYrm },
				{ X86::VPMULHRSWZ256rr , X86::VPMULHRSWYrr },
				{ X86::VPMULHUWZ256rm , X86::VPMULHUWYrm },
				{ X86::VPMULHUWZ256rr , X86::VPMULHUWYrr },
				{ X86::VPMULHWZ256rm , X86::VPMULHWYrm },
				{ X86::VPMULHWZ256rr , X86::VPMULHWYrr },
				{ X86::VPMULLDZ256rm , X86::VPMULLDYrm },
				{ X86::VPMULLDZ256rr , X86::VPMULLDYrr },
				{ X86::VPMULLWZ256rm , X86::VPMULLWYrm },
				{ X86::VPMULLWZ256rr , X86::VPMULLWYrr },
				{ X86::VPMULUDQZ256rm , X86::VPMULUDQYrm },
				{ X86::VPMULUDQZ256rr , X86::VPMULUDQYrr },
				{ X86::VPORDZ256rm , X86::VPORYrm },
				{ X86::VPORDZ256rr , X86::VPORYrr },
				{ X86::VPORQZ256rm , X86::VPORYrm },
				{ X86::VPORQZ256rr , X86::VPORYrr },
				{ X86::VPSADBWZ256rm , X86::VPSADBWYrm },
				{ X86::VPSADBWZ256rr , X86::VPSADBWYrr },
				{ X86::VPSHUFBZ256rm , X86::VPSHUFBYrm },
				{ X86::VPSHUFBZ256rr , X86::VPSHUFBYrr },
				{ X86::VPSHUFDZ256mi , X86::VPSHUFDYmi },
				{ X86::VPSHUFDZ256ri , X86::VPSHUFDYri },
				{ X86::VPSHUFHWZ256mi , X86::VPSHUFHWYmi },
				{ X86::VPSHUFHWZ256ri , X86::VPSHUFHWYri },
				{ X86::VPSHUFLWZ256mi , X86::VPSHUFLWYmi },
				{ X86::VPSHUFLWZ256ri , X86::VPSHUFLWYri },
				{ X86::VPSLLDQZ256rr , X86::VPSLLDQYri },
				{ X86::VPSLLDZ256ri , X86::VPSLLDYri },
				{ X86::VPSLLDZ256rm , X86::VPSLLDYrm },
				{ X86::VPSLLDZ256rr , X86::VPSLLDYrr },
				{ X86::VPSLLQZ256ri , X86::VPSLLQYri },
				{ X86::VPSLLQZ256rm , X86::VPSLLQYrm },
				{ X86::VPSLLQZ256rr , X86::VPSLLQYrr },
				{ X86::VPSLLVDZ256rm , X86::VPSLLVDYrm },
				{ X86::VPSLLVDZ256rr , X86::VPSLLVDYrr },
				{ X86::VPSLLVQZ256rm , X86::VPSLLVQYrm },
				{ X86::VPSLLVQZ256rr , X86::VPSLLVQYrr },
				{ X86::VPSLLWZ256ri , X86::VPSLLWYri },
				{ X86::VPSLLWZ256rm , X86::VPSLLWYrm },
				{ X86::VPSLLWZ256rr , X86::VPSLLWYrr },
				{ X86::VPSRADZ256ri , X86::VPSRADYri },
				{ X86::VPSRADZ256rm , X86::VPSRADYrm },
				{ X86::VPSRADZ256rr , X86::VPSRADYrr },
				{ X86::VPSRAVDZ256rm , X86::VPSRAVDYrm },
				{ X86::VPSRAVDZ256rr , X86::VPSRAVDYrr },
				{ X86::VPSRAWZ256ri , X86::VPSRAWYri },
				{ X86::VPSRAWZ256rm , X86::VPSRAWYrm },
				{ X86::VPSRAWZ256rr , X86::VPSRAWYrr },
				{ X86::VPSRLDQZ256rr , X86::VPSRLDQYri },
				{ X86::VPSRLDZ256ri , X86::VPSRLDYri },
				{ X86::VPSRLDZ256rm , X86::VPSRLDYrm },
				{ X86::VPSRLDZ256rr , X86::VPSRLDYrr },
				{ X86::VPSRLQZ256ri , X86::VPSRLQYri },
				{ X86::VPSRLQZ256rm , X86::VPSRLQYrm },
				{ X86::VPSRLQZ256rr , X86::VPSRLQYrr },
				{ X86::VPSRLVDZ256rm , X86::VPSRLVDYrm },
				{ X86::VPSRLVDZ256rr , X86::VPSRLVDYrr },
				{ X86::VPSRLVQZ256rm , X86::VPSRLVQYrm },
				{ X86::VPSRLVQZ256rr , X86::VPSRLVQYrr },
				{ X86::VPSRLWZ256ri , X86::VPSRLWYri },
				{ X86::VPSRLWZ256rm , X86::VPSRLWYrm },
				{ X86::VPSRLWZ256rr , X86::VPSRLWYrr },
				{ X86::VPSUBBZ256rm , X86::VPSUBBYrm },
				{ X86::VPSUBBZ256rr , X86::VPSUBBYrr },
				{ X86::VPSUBDZ256rm , X86::VPSUBDYrm },
				{ X86::VPSUBDZ256rr , X86::VPSUBDYrr },
				{ X86::VPSUBQZ256rm , X86::VPSUBQYrm },
				{ X86::VPSUBQZ256rr , X86::VPSUBQYrr },
				{ X86::VPSUBSBZ256rm , X86::VPSUBSBYrm },
				{ X86::VPSUBSBZ256rr , X86::VPSUBSBYrr },
				{ X86::VPSUBSWZ256rm , X86::VPSUBSWYrm },
				{ X86::VPSUBSWZ256rr , X86::VPSUBSWYrr },
				{ X86::VPSUBUSBZ256rm , X86::VPSUBUSBYrm },
				{ X86::VPSUBUSBZ256rr , X86::VPSUBUSBYrr },
				{ X86::VPSUBUSWZ256rm , X86::VPSUBUSWYrm },
				{ X86::VPSUBUSWZ256rr , X86::VPSUBUSWYrr },
				{ X86::VPSUBWZ256rm , X86::VPSUBWYrm },
				{ X86::VPSUBWZ256rr , X86::VPSUBWYrr },
				{ X86::VPUNPCKHBWZ256rm , X86::VPUNPCKHBWYrm },
				{ X86::VPUNPCKHBWZ256rr , X86::VPUNPCKHBWYrr },
				{ X86::VPUNPCKHDQZ256rm , X86::VPUNPCKHDQYrm },
				{ X86::VPUNPCKHDQZ256rr , X86::VPUNPCKHDQYrr },
				{ X86::VPUNPCKHQDQZ256rm , X86::VPUNPCKHQDQYrm },
				{ X86::VPUNPCKHQDQZ256rr , X86::VPUNPCKHQDQYrr },
				{ X86::VPUNPCKHWDZ256rm , X86::VPUNPCKHWDYrm },
				{ X86::VPUNPCKHWDZ256rr , X86::VPUNPCKHWDYrr },
				{ X86::VPUNPCKLBWZ256rm , X86::VPUNPCKLBWYrm },
				{ X86::VPUNPCKLBWZ256rr , X86::VPUNPCKLBWYrr },
				{ X86::VPUNPCKLDQZ256rm , X86::VPUNPCKLDQYrm },
				{ X86::VPUNPCKLDQZ256rr , X86::VPUNPCKLDQYrr },
				{ X86::VPUNPCKLQDQZ256rm , X86::VPUNPCKLQDQYrm },
				{ X86::VPUNPCKLQDQZ256rr , X86::VPUNPCKLQDQYrr },
				{ X86::VPUNPCKLWDZ256rm , X86::VPUNPCKLWDYrm },
				{ X86::VPUNPCKLWDZ256rr , X86::VPUNPCKLWDYrr },
				{ X86::VPXORDZ256rm , X86::VPXORYrm },
				{ X86::VPXORDZ256rr , X86::VPXORYrr },
				{ X86::VPXORQZ256rm , X86::VPXORYrm },
				{ X86::VPXORQZ256rr , X86::VPXORYrr },
				{ X86::VSHUFPDZ256rmi , X86::VSHUFPDYrmi },
				{ X86::VSHUFPDZ256rri , X86::VSHUFPDYrri },
				{ X86::VSHUFPSZ256rmi , X86::VSHUFPSYrmi },
				{ X86::VSHUFPSZ256rri , X86::VSHUFPSYrri },
				{ X86::VSQRTPDZ256m , X86::VSQRTPDYm },
				{ X86::VSQRTPDZ256r , X86::VSQRTPDYr },
				{ X86::VSQRTPSZ256m , X86::VSQRTPSYm },
				{ X86::VSQRTPSZ256r , X86::VSQRTPSYr },
				{ X86::VSUBPDZ256rm , X86::VSUBPDYrm },
				{ X86::VSUBPDZ256rr , X86::VSUBPDYrr },
				{ X86::VSUBPSZ256rm , X86::VSUBPSYrm },
				{ X86::VSUBPSZ256rr , X86::VSUBPSYrr },
				{ X86::VUNPCKHPDZ256rm , X86::VUNPCKHPDYrm },
				{ X86::VUNPCKHPDZ256rr , X86::VUNPCKHPDYrr },
				{ X86::VUNPCKHPSZ256rm , X86::VUNPCKHPSYrm },
				{ X86::VUNPCKHPSZ256rr , X86::VUNPCKHPSYrr },
				{ X86::VUNPCKLPDZ256rm , X86::VUNPCKLPDYrm },
				{ X86::VUNPCKLPDZ256rr , X86::VUNPCKLPDYrr },
				{ X86::VUNPCKLPSZ256rm , X86::VUNPCKLPSYrm },
				{ X86::VUNPCKLPSZ256rr , X86::VUNPCKLPSYrr },
				{ X86::VXORPDZ256rm , X86::VXORPDYrm },
				{ X86::VXORPDZ256rr , X86::VXORPDYrr },
				{ X86::VXORPSZ256rm , X86::VXORPSYrm },
				{ X86::VXORPSZ256rr , X86::VXORPSYrr },
				};

				#endif
				No newline at end of file
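
The two tables above are plain `<EVEX opcode, VEX opcode>` pairs, so the pass needs some way to go from an EVEX opcode to its VEX counterpart. The lookup code is not part of this excerpt; the following is only a minimal sketch of one way such a table could be queried, assuming a hash map built once from the array. The namespace `lookup_sketch`, the numeric opcode values, and the helper names `getVex256Map` and `lookupVexOpcode` are hypothetical and chosen for illustration; only the table row shape is taken from the header.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

namespace lookup_sketch {
// Hypothetical stand-ins for the real X86:: opcode enumerators; the numeric
// values are invented for this example.
enum Opcode : uint16_t {
  VADDPDYrm = 100,
  VADDPDYrr = 101,
  VPXORYrr = 102,
  VADDPDZ256rm = 200,
  VADDPDZ256rr = 201,
  VPXORDZ256rr = 202,
  VPXORQZ256rr = 203,
};

// Same shape as the table rows above: <EVEX opcode, VEX opcode>.
struct CompressTableEntry {
  uint16_t EvexOpcode;
  uint16_t VexOpcode;
};

static const CompressTableEntry Vex256Table[] = {
    {VADDPDZ256rm, VADDPDYrm},
    {VADDPDZ256rr, VADDPDYrr},
    {VPXORDZ256rr, VPXORYrr},
    {VPXORQZ256rr, VPXORYrr}, // two EVEX forms may share one VEX form
};

// Build the EVEX -> VEX map once on first use.
inline const std::unordered_map<uint16_t, uint16_t> &getVex256Map() {
  static const std::unordered_map<uint16_t, uint16_t> Map = [] {
    std::unordered_map<uint16_t, uint16_t> M;
    for (const CompressTableEntry &E : Vex256Table) {
      bool Inserted = M.emplace(E.EvexOpcode, E.VexOpcode).second;
      assert(Inserted && "duplicate EVEX opcode in table");
      (void)Inserted;
    }
    return M;
  }();
  return Map;
}

// Stores the VEX opcode in VexOpcode and returns true if EvexOpcode has a
// table entry; returns false if the instruction must keep its EVEX encoding.
inline bool lookupVexOpcode(uint16_t EvexOpcode, uint16_t &VexOpcode) {
  const std::unordered_map<uint16_t, uint16_t> &Map = getVex256Map();
  auto It = Map.find(EvexOpcode);
  if (It == Map.end())
    return false;
  VexOpcode = It->second;
  return true;
}
} // namespace lookup_sketch
```

A caller would try `lookupVexOpcode` for each candidate instruction and leave the instruction untouched when it returns false. Per the summary, a table hit alone is not sufficient: the instruction must also avoid registers with index 16 and above, zmm registers, and mask registers.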

llvm/trunk/lib/Target/X86/X86MCInstLower.cpp

// MCInst records.		// MCInst records.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "X86AsmPrinter.h"		#include "X86AsmPrinter.h"
#include "X86RegisterInfo.h"		#include "X86RegisterInfo.h"
#include "X86ShuffleDecodeConstantPool.h"		#include "X86ShuffleDecodeConstantPool.h"
#include "InstPrinter/X86ATTInstPrinter.h"		#include "InstPrinter/X86ATTInstPrinter.h"
		#include "InstPrinter/X86InstComments.h"
#include "MCTargetDesc/X86BaseInfo.h"		#include "MCTargetDesc/X86BaseInfo.h"
#include "Utils/X86ShuffleDecode.h"		#include "Utils/X86ShuffleDecode.h"
#include "llvm/ADT/Optional.h"		#include "llvm/ADT/Optional.h"
#include "llvm/ADT/SmallString.h"		#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/iterator_range.h"		#include "llvm/ADT/iterator_range.h"
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineConstantPool.h"		#include "llvm/CodeGen/MachineConstantPool.h"
#include "llvm/CodeGen/MachineOperand.h"		#include "llvm/CodeGen/MachineOperand.h"
static std::string getShuffleComment(const MachineInstr *MI,

return Comment;		return Comment;
}		}

void X86AsmPrinter::EmitInstruction(const MachineInstr *MI) {		void X86AsmPrinter::EmitInstruction(const MachineInstr *MI) {
X86MCInstLower MCInstLowering(MF, this);		X86MCInstLower MCInstLowering(MF, this);
const X86RegisterInfo *RI = MF->getSubtarget<X86Subtarget>().getRegisterInfo();		const X86RegisterInfo *RI = MF->getSubtarget<X86Subtarget>().getRegisterInfo();

		// Add a comment about EVEX-2-VEX compression for AVX-512 instrs that
		// are compressed from EVEX encoding to VEX encoding.
		if (TM.Options.MCOptions.ShowMCEncoding) {
		if (MI->getAsmPrinterFlags() & AC_EVEX_2_VEX)
		OutStreamer->AddComment("EVEX TO VEX Compression ", false);
		}

switch (MI->getOpcode()) {		switch (MI->getOpcode()) {
case TargetOpcode::DBG_VALUE:		case TargetOpcode::DBG_VALUE:
llvm_unreachable("Should be handled target independently");		llvm_unreachable("Should be handled target independently");

// Emit nothing here but a comment if we can.		// Emit nothing here but a comment if we can.
case X86::Int_MemBarrier:		case X86::Int_MemBarrier:
OutStreamer->emitRawComment("MEMBARRIER");		OutStreamer->emitRawComment("MEMBARRIER");
return;		return;
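
The hunk above adds the "EVEX TO VEX Compression " comment only when encodings are being shown and the instruction carries the `AC_EVEX_2_VEX` asm-printer flag. As a rough, self-contained illustration of that gating logic, here is a sketch that uses a mock instruction type instead of `MachineInstr`. The flag value `0x2`, the `MockInstr` type, and the `encodingCommentPrefix` helper are assumptions made for this example and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

namespace comment_sketch {
// Assumed flag value for illustration; the patch includes
// InstPrinter/X86InstComments.h, which is presumably where the real
// AC_EVEX_2_VEX constant comes from.
enum AsmComments : uint8_t { AC_EVEX_2_VEX = 0x2 };

// Hypothetical stand-in for the asm-printer flag storage of MachineInstr.
struct MockInstr {
  uint8_t AsmPrinterFlags = 0;

  void setAsmPrinterFlag(uint8_t Flag) {
    assert(Flag != 0 && "expected a non-zero flag");
    AsmPrinterFlags |= Flag;
  }
  uint8_t getAsmPrinterFlags() const { return AsmPrinterFlags; }
};

// Mirrors the check in the hunk: the comment is produced only when
// encodings are shown and the instruction was marked as compressed.
inline std::string encodingCommentPrefix(const MockInstr &MI,
                                         bool ShowMCEncoding) {
  if (ShowMCEncoding && (MI.getAsmPrinterFlags() & AC_EVEX_2_VEX))
    return "EVEX TO VEX Compression ";
  return "";
}
} // namespace comment_sketch
```

In this sketch the compression pass would be the one calling `setAsmPrinterFlag` after swapping the opcode, so the printer needs no knowledge of the opcode tables.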

llvm/trunk/lib/Target/X86/X86TargetMachine.cpp

extern "C" void LLVMInitializeX86Target() {
// Register the target.		// Register the target.
RegisterTargetMachine<X86TargetMachine> X(getTheX86_32Target());		RegisterTargetMachine<X86TargetMachine> X(getTheX86_32Target());
RegisterTargetMachine<X86TargetMachine> Y(getTheX86_64Target());		RegisterTargetMachine<X86TargetMachine> Y(getTheX86_64Target());

PassRegistry &PR = *PassRegistry::getPassRegistry();		PassRegistry &PR = *PassRegistry::getPassRegistry();
initializeGlobalISel(PR);		initializeGlobalISel(PR);
initializeWinEHStatePassPass(PR);		initializeWinEHStatePassPass(PR);
initializeFixupBWInstPassPass(PR);		initializeFixupBWInstPassPass(PR);
		initializeEvexToVexInstPassPass(PR);
}		}

static std::unique_ptr<TargetLoweringObjectFile> createTLOF(const Triple &TT) {		static std::unique_ptr<TargetLoweringObjectFile> createTLOF(const Triple &TT) {
if (TT.isOSBinFormatMachO()) {		if (TT.isOSBinFormatMachO()) {
if (TT.getArch() == Triple::x86_64)		if (TT.getArch() == Triple::x86_64)
return make_unique<X86_64MachoTargetObjectFile>();		return make_unique<X86_64MachoTargetObjectFile>();
return make_unique<TargetLoweringObjectFileMachO>();		return make_unique<TargetLoweringObjectFileMachO>();
}		}
void X86PassConfig::addPreEmitPass() {

if (UseVZeroUpper)		if (UseVZeroUpper)
addPass(createX86IssueVZeroUpperPass());		addPass(createX86IssueVZeroUpperPass());

if (getOptLevel() != CodeGenOpt::None) {		if (getOptLevel() != CodeGenOpt::None) {
addPass(createX86FixupBWInsts());		addPass(createX86FixupBWInsts());
addPass(createX86PadShortFunctions());		addPass(createX86PadShortFunctions());
addPass(createX86FixupLEAs());		addPass(createX86FixupLEAs());
		addPass(createX86EvexToVexInsts());
}		}
}		}

llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll

	; AVX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; AVX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; AVX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; AVX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; AVX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; AVX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_comieq_sd:			; AVX512VL-LABEL: test_x86_sse2_comieq_sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2f,0xc1]			; AVX512VL-NEXT: vcomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2f,0xc1]
	; AVX512VL-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; AVX512VL-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; AVX512VL-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; AVX512VL-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; AVX512VL-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; AVX512VL-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.comieq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.comieq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.comieq.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.comieq.sd(<2 x double>, <2 x double>) nounwind readnone
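
The updated AVX512VL check lines show the effect of the pass on `vcomisd %xmm1, %xmm0`: the EVEX form is six bytes and the VEX form is four. A small sketch that reproduces this arithmetic from the byte sequences in the test follows; the namespace and helper names are hypothetical, and only the byte values are copied from the check lines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace size_sketch {
// Bytes copied from the AVX512VL check lines for `vcomisd %xmm1, %xmm0`.
static const uint8_t EvexBytes[] = {0x62, 0xf1, 0xfd, 0x08, 0x2f, 0xc1};
static const uint8_t VexBytes[] = {0xc5, 0xf9, 0x2f, 0xc1};

// Number of bytes saved by emitting the VEX encoding instead of EVEX.
inline std::size_t bytesSaved() {
  static_assert(sizeof(EvexBytes) >= sizeof(VexBytes),
                "VEX form is expected to be no longer than the EVEX form");
  return sizeof(EvexBytes) - sizeof(VexBytes);
}

// The first byte identifies the prefix: 0x62 introduces EVEX, while 0xc5
// introduces the two-byte VEX prefix used in this example.
inline bool startsWithEvexPrefix(const uint8_t *Bytes) {
  assert(Bytes && "expected a non-null encoding");
  return Bytes[0] == 0x62;
}
} // namespace size_sketch
```

For this instruction the saving is two bytes, which matches the "~2 bytes" figure quoted in the summary.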


	define i32 @test_x86_sse2_comige_sd(<2 x double> %a0, <2 x double> %a1) {			define i32 @test_x86_sse2_comige_sd(<2 x double> %a0, <2 x double> %a1) {
	; AVX-LABEL: test_x86_sse2_comige_sd:			; AVX-LABEL: test_x86_sse2_comige_sd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2f,0xc1]			; AVX-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2f,0xc1]
	; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_comige_sd:			; AVX512VL-LABEL: test_x86_sse2_comige_sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2f,0xc1]			; AVX512VL-NEXT: vcomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2f,0xc1]
	; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.comige.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.comige.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.comige.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.comige.sd(<2 x double>, <2 x double>) nounwind readnone


	define i32 @test_x86_sse2_comigt_sd(<2 x double> %a0, <2 x double> %a1) {			define i32 @test_x86_sse2_comigt_sd(<2 x double> %a0, <2 x double> %a1) {
	; AVX-LABEL: test_x86_sse2_comigt_sd:			; AVX-LABEL: test_x86_sse2_comigt_sd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2f,0xc1]			; AVX-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2f,0xc1]
	; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_comigt_sd:			; AVX512VL-LABEL: test_x86_sse2_comigt_sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2f,0xc1]			; AVX512VL-NEXT: vcomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2f,0xc1]
	; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.comigt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.comigt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.comigt.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.comigt.sd(<2 x double>, <2 x double>) nounwind readnone


	define i32 @test_x86_sse2_comile_sd(<2 x double> %a0, <2 x double> %a1) {			define i32 @test_x86_sse2_comile_sd(<2 x double> %a0, <2 x double> %a1) {
	; AVX-LABEL: test_x86_sse2_comile_sd:			; AVX-LABEL: test_x86_sse2_comile_sd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vcomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2f,0xc8]			; AVX-NEXT: vcomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2f,0xc8]
	; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_comile_sd:			; AVX512VL-LABEL: test_x86_sse2_comile_sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vcomisd %xmm0, %xmm1 ## encoding: [0x62,0xf1,0xfd,0x08,0x2f,0xc8]			; AVX512VL-NEXT: vcomisd %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2f,0xc8]
	; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.comile.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.comile.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.comile.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.comile.sd(<2 x double>, <2 x double>) nounwind readnone


	define i32 @test_x86_sse2_comilt_sd(<2 x double> %a0, <2 x double> %a1) {			define i32 @test_x86_sse2_comilt_sd(<2 x double> %a0, <2 x double> %a1) {
	; AVX-LABEL: test_x86_sse2_comilt_sd:			; AVX-LABEL: test_x86_sse2_comilt_sd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vcomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2f,0xc8]			; AVX-NEXT: vcomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2f,0xc8]
	; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_comilt_sd:			; AVX512VL-LABEL: test_x86_sse2_comilt_sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vcomisd %xmm0, %xmm1 ## encoding: [0x62,0xf1,0xfd,0x08,0x2f,0xc8]			; AVX512VL-NEXT: vcomisd %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2f,0xc8]
	; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.comilt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.comilt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.comilt.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.comilt.sd(<2 x double>, <2 x double>) nounwind readnone


	define i32 @test_x86_sse2_comineq_sd(<2 x double> %a0, <2 x double> %a1) {			define i32 @test_x86_sse2_comineq_sd(<2 x double> %a0, <2 x double> %a1) {
	; AVX-LABEL: test_x86_sse2_comineq_sd:			; AVX-LABEL: test_x86_sse2_comineq_sd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2f,0xc1]			; AVX-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2f,0xc1]
	; AVX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; AVX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; AVX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; AVX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; AVX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; AVX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_comineq_sd:			; AVX512VL-LABEL: test_x86_sse2_comineq_sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2f,0xc1]			; AVX512VL-NEXT: vcomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2f,0xc1]
	; AVX512VL-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; AVX512VL-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; AVX512VL-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; AVX512VL-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; AVX512VL-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; AVX512VL-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.comineq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.comineq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.comineq.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.comineq.sd(<2 x double>, <2 x double>) nounwind readnone


	define <4 x float> @test_x86_sse2_cvtdq2ps(<4 x i32> %a0) {			define <4 x float> @test_x86_sse2_cvtdq2ps(<4 x i32> %a0) {
	; AVX-LABEL: test_x86_sse2_cvtdq2ps:			; AVX-LABEL: test_x86_sse2_cvtdq2ps:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvtdq2ps %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x5b,0xc0]			; AVX-NEXT: vcvtdq2ps %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x5b,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_cvtdq2ps:			; AVX512VL-LABEL: test_x86_sse2_cvtdq2ps:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvtdq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x5b,0xc0]			; AVX512VL-NEXT: vcvtdq2ps %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5b,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.sse2.cvtdq2ps(<4 x i32> %a0) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.sse2.cvtdq2ps(<4 x i32> %a0) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	declare <4 x float> @llvm.x86.sse2.cvtdq2ps(<4 x i32>) nounwind readnone			declare <4 x float> @llvm.x86.sse2.cvtdq2ps(<4 x i32>) nounwind readnone


	define <4 x i32> @test_x86_sse2_cvtpd2dq(<2 x double> %a0) {			define <4 x i32> @test_x86_sse2_cvtpd2dq(<2 x double> %a0) {
	; AVX-LABEL: test_x86_sse2_cvtpd2dq:			; AVX-LABEL: test_x86_sse2_cvtpd2dq:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvtpd2dq %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0xe6,0xc0]			; AVX-NEXT: vcvtpd2dq %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0xe6,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_cvtpd2dq:			; AVX512VL-LABEL: test_x86_sse2_cvtpd2dq:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvtpd2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x08,0xe6,0xc0]			; AVX512VL-NEXT: vcvtpd2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0xe6,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double> %a0) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double> %a0) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double>) nounwind readnone


	define <4 x float> @test_x86_sse2_cvtpd2ps(<2 x double> %a0) {			define <4 x float> @test_x86_sse2_cvtpd2ps(<2 x double> %a0) {
	; AVX-LABEL: test_x86_sse2_cvtpd2ps:			; AVX-LABEL: test_x86_sse2_cvtpd2ps:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvtpd2ps %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5a,0xc0]			; AVX-NEXT: vcvtpd2ps %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5a,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_cvtpd2ps:			; AVX512VL-LABEL: test_x86_sse2_cvtpd2ps:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvtpd2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x5a,0xc0]			; AVX512VL-NEXT: vcvtpd2ps %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x5a,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.sse2.cvtpd2ps(<2 x double> %a0) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.sse2.cvtpd2ps(<2 x double> %a0) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	declare <4 x float> @llvm.x86.sse2.cvtpd2ps(<2 x double>) nounwind readnone			declare <4 x float> @llvm.x86.sse2.cvtpd2ps(<2 x double>) nounwind readnone


	define <4 x i32> @test_x86_sse2_cvtps2dq(<4 x float> %a0) {			define <4 x i32> @test_x86_sse2_cvtps2dq(<4 x float> %a0) {
	; ... 10 unchanged lines collapsed in the diff view ...
	define i32 @test_x86_sse2_cvtsd2si(<2 x double> %a0) {			define i32 @test_x86_sse2_cvtsd2si(<2 x double> %a0) {
	; AVX-LABEL: test_x86_sse2_cvtsd2si:			; AVX-LABEL: test_x86_sse2_cvtsd2si:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvtsd2si %xmm0, %eax ## encoding: [0xc5,0xfb,0x2d,0xc0]			; AVX-NEXT: vcvtsd2si %xmm0, %eax ## encoding: [0xc5,0xfb,0x2d,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_cvtsd2si:			; AVX512VL-LABEL: test_x86_sse2_cvtsd2si:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvtsd2si %xmm0, %eax ## encoding: [0x62,0xf1,0x7f,0x08,0x2d,0xc0]			; AVX512VL-NEXT: vcvtsd2si %xmm0, %eax ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x2d,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.cvtsd2si(<2 x double> %a0) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.cvtsd2si(<2 x double> %a0) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.cvtsd2si(<2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.cvtsd2si(<2 x double>) nounwind readnone


	define <4 x float> @test_x86_sse2_cvtsd2ss(<4 x float> %a0, <2 x double> %a1) {			define <4 x float> @test_x86_sse2_cvtsd2ss(<4 x float> %a0, <2 x double> %a1) {
	; ... 10 unchanged lines collapsed in the diff view ...
	define <2 x double> @test_x86_sse2_cvtsi2sd(<2 x double> %a0, i32 %a1) {			define <2 x double> @test_x86_sse2_cvtsi2sd(<2 x double> %a0, i32 %a1) {
	; AVX-LABEL: test_x86_sse2_cvtsi2sd:			; AVX-LABEL: test_x86_sse2_cvtsi2sd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvtsi2sdl {{[0-9]+}}(%esp), %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0x2a,0x44,0x24,0x04]			; AVX-NEXT: vcvtsi2sdl {{[0-9]+}}(%esp), %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0x2a,0x44,0x24,0x04]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_cvtsi2sd:			; AVX512VL-LABEL: test_x86_sse2_cvtsi2sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvtsi2sdl {{[0-9]+}}(%esp), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7f,0x08,0x2a,0x44,0x24,0x01]			; AVX512VL-NEXT: vcvtsi2sdl {{[0-9]+}}(%esp), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x2a,0x44,0x24,0x04]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x double> @llvm.x86.sse2.cvtsi2sd(<2 x double> %a0, i32 %a1) ; <<2 x double>> [#uses=1]			%res = call <2 x double> @llvm.x86.sse2.cvtsi2sd(<2 x double> %a0, i32 %a1) ; <<2 x double>> [#uses=1]
	ret <2 x double> %res			ret <2 x double> %res
	}			}
	declare <2 x double> @llvm.x86.sse2.cvtsi2sd(<2 x double>, i32) nounwind readnone			declare <2 x double> @llvm.x86.sse2.cvtsi2sd(<2 x double>, i32) nounwind readnone


	define <2 x double> @test_x86_sse2_cvtss2sd(<2 x double> %a0, <4 x float> %a1) {			define <2 x double> @test_x86_sse2_cvtss2sd(<2 x double> %a0, <4 x float> %a1) {
	; ... 10 unchanged lines collapsed in the diff view ...
	define <4 x i32> @test_x86_sse2_cvttpd2dq(<2 x double> %a0) {			define <4 x i32> @test_x86_sse2_cvttpd2dq(<2 x double> %a0) {
	; AVX-LABEL: test_x86_sse2_cvttpd2dq:			; AVX-LABEL: test_x86_sse2_cvttpd2dq:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvttpd2dq %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe6,0xc0]			; AVX-NEXT: vcvttpd2dq %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe6,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_cvttpd2dq:			; AVX512VL-LABEL: test_x86_sse2_cvttpd2dq:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvttpd2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xe6,0xc0]			; AVX512VL-NEXT: vcvttpd2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe6,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.cvttpd2dq(<2 x double> %a0) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.cvttpd2dq(<2 x double> %a0) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.cvttpd2dq(<2 x double>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.cvttpd2dq(<2 x double>) nounwind readnone


	define <4 x i32> @test_x86_sse2_cvttps2dq(<4 x float> %a0) {			define <4 x i32> @test_x86_sse2_cvttps2dq(<4 x float> %a0) {
	; AVX-LABEL: test_x86_sse2_cvttps2dq:			; AVX-LABEL: test_x86_sse2_cvttps2dq:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvttps2dq %xmm0, %xmm0 ## encoding: [0xc5,0xfa,0x5b,0xc0]			; AVX-NEXT: vcvttps2dq %xmm0, %xmm0 ## encoding: [0xc5,0xfa,0x5b,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_cvttps2dq:			; AVX512VL-LABEL: test_x86_sse2_cvttps2dq:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvttps2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7e,0x08,0x5b,0xc0]			; AVX512VL-NEXT: vcvttps2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x5b,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %a0) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %a0) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float>) nounwind readnone


	define i32 @test_x86_sse2_cvttsd2si(<2 x double> %a0) {			define i32 @test_x86_sse2_cvttsd2si(<2 x double> %a0) {
	; AVX-LABEL: test_x86_sse2_cvttsd2si:			; AVX-LABEL: test_x86_sse2_cvttsd2si:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvttsd2si %xmm0, %eax ## encoding: [0xc5,0xfb,0x2c,0xc0]			; AVX-NEXT: vcvttsd2si %xmm0, %eax ## encoding: [0xc5,0xfb,0x2c,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_cvttsd2si:			; AVX512VL-LABEL: test_x86_sse2_cvttsd2si:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvttsd2si %xmm0, %eax ## encoding: [0x62,0xf1,0x7f,0x08,0x2c,0xc0]			; AVX512VL-NEXT: vcvttsd2si %xmm0, %eax ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x2c,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.cvttsd2si(<2 x double> %a0) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.cvttsd2si(<2 x double> %a0) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.cvttsd2si(<2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.cvttsd2si(<2 x double>) nounwind readnone



	define <2 x double> @test_x86_sse2_max_pd(<2 x double> %a0, <2 x double> %a1) {			define <2 x double> @test_x86_sse2_max_pd(<2 x double> %a0, <2 x double> %a1) {
	; AVX-LABEL: test_x86_sse2_max_pd:			; AVX-LABEL: test_x86_sse2_max_pd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vmaxpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5f,0xc1]			; AVX-NEXT: vmaxpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5f,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_max_pd:			; AVX512VL-LABEL: test_x86_sse2_max_pd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vmaxpd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x5f,0xc1]			; AVX512VL-NEXT: vmaxpd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x5f,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x double> @llvm.x86.sse2.max.pd(<2 x double> %a0, <2 x double> %a1) ; <<2 x double>> [#uses=1]			%res = call <2 x double> @llvm.x86.sse2.max.pd(<2 x double> %a0, <2 x double> %a1) ; <<2 x double>> [#uses=1]
	ret <2 x double> %res			ret <2 x double> %res
	}			}
	declare <2 x double> @llvm.x86.sse2.max.pd(<2 x double>, <2 x double>) nounwind readnone			declare <2 x double> @llvm.x86.sse2.max.pd(<2 x double>, <2 x double>) nounwind readnone


	define <2 x double> @test_x86_sse2_max_sd(<2 x double> %a0, <2 x double> %a1) {			define <2 x double> @test_x86_sse2_max_sd(<2 x double> %a0, <2 x double> %a1) {
	; ... 10 unchanged lines collapsed in the diff view ...
	define <2 x double> @test_x86_sse2_min_pd(<2 x double> %a0, <2 x double> %a1) {			define <2 x double> @test_x86_sse2_min_pd(<2 x double> %a0, <2 x double> %a1) {
	; AVX-LABEL: test_x86_sse2_min_pd:			; AVX-LABEL: test_x86_sse2_min_pd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vminpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5d,0xc1]			; AVX-NEXT: vminpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5d,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_min_pd:			; AVX512VL-LABEL: test_x86_sse2_min_pd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vminpd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x5d,0xc1]			; AVX512VL-NEXT: vminpd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x5d,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x double> @llvm.x86.sse2.min.pd(<2 x double> %a0, <2 x double> %a1) ; <<2 x double>> [#uses=1]			%res = call <2 x double> @llvm.x86.sse2.min.pd(<2 x double> %a0, <2 x double> %a1) ; <<2 x double>> [#uses=1]
	ret <2 x double> %res			ret <2 x double> %res
	}			}
	declare <2 x double> @llvm.x86.sse2.min.pd(<2 x double>, <2 x double>) nounwind readnone			declare <2 x double> @llvm.x86.sse2.min.pd(<2 x double>, <2 x double>) nounwind readnone


	define <2 x double> @test_x86_sse2_min_sd(<2 x double> %a0, <2 x double> %a1) {			define <2 x double> @test_x86_sse2_min_sd(<2 x double> %a0, <2 x double> %a1) {
	; ... 23 unchanged lines collapsed in the diff view ...
	define <8 x i16> @test_x86_sse2_packssdw_128(<4 x i32> %a0, <4 x i32> %a1) {			define <8 x i16> @test_x86_sse2_packssdw_128(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX-LABEL: test_x86_sse2_packssdw_128:			; AVX-LABEL: test_x86_sse2_packssdw_128:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x6b,0xc1]			; AVX-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x6b,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_packssdw_128:			; AVX512VL-LABEL: test_x86_sse2_packssdw_128:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6b,0xc1]			; AVX512VL-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6b,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.packssdw.128(<4 x i32> %a0, <4 x i32> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.packssdw.128(<4 x i32> %a0, <4 x i32> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.packssdw.128(<4 x i32>, <4 x i32>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.packssdw.128(<4 x i32>, <4 x i32>) nounwind readnone


	define <16 x i8> @test_x86_sse2_packsswb_128(<8 x i16> %a0, <8 x i16> %a1) {			define <16 x i8> @test_x86_sse2_packsswb_128(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_packsswb_128:			; AVX-LABEL: test_x86_sse2_packsswb_128:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x63,0xc1]			; AVX-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x63,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_packsswb_128:			; AVX512VL-LABEL: test_x86_sse2_packsswb_128:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x63,0xc1]			; AVX512VL-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x63,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.packsswb.128(<8 x i16> %a0, <8 x i16> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.packsswb.128(<8 x i16> %a0, <8 x i16> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.packsswb.128(<8 x i16>, <8 x i16>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.packsswb.128(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_packuswb_128(<8 x i16> %a0, <8 x i16> %a1) {			define <16 x i8> @test_x86_sse2_packuswb_128(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_packuswb_128:			; AVX-LABEL: test_x86_sse2_packuswb_128:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x67,0xc1]			; AVX-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x67,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_packuswb_128:			; AVX512VL-LABEL: test_x86_sse2_packuswb_128:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x67,0xc1]			; AVX512VL-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x67,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.packuswb.128(<8 x i16> %a0, <8 x i16> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.packuswb.128(<8 x i16> %a0, <8 x i16> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.packuswb.128(<8 x i16>, <8 x i16>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.packuswb.128(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_padds_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_padds_b(<16 x i8> %a0, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_sse2_padds_b:			; AVX-LABEL: test_x86_sse2_padds_b:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpaddsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xec,0xc1]			; AVX-NEXT: vpaddsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xec,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_padds_b:			; AVX512VL-LABEL: test_x86_sse2_padds_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpaddsb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xec,0xc1]			; AVX512VL-NEXT: vpaddsb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xec,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.padds.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.padds.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.padds.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.padds.b(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_sse2_padds_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_padds_w(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_padds_w:			; AVX-LABEL: test_x86_sse2_padds_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpaddsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xed,0xc1]			; AVX-NEXT: vpaddsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xed,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_padds_w:			; AVX512VL-LABEL: test_x86_sse2_padds_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpaddsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xed,0xc1]			; AVX512VL-NEXT: vpaddsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xed,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.padds.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.padds.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.padds.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.padds.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_paddus_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_paddus_b(<16 x i8> %a0, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_sse2_paddus_b:			; AVX-LABEL: test_x86_sse2_paddus_b:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpaddusb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xdc,0xc1]			; AVX-NEXT: vpaddusb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xdc,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_paddus_b:			; AVX512VL-LABEL: test_x86_sse2_paddus_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpaddusb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xdc,0xc1]			; AVX512VL-NEXT: vpaddusb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xdc,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.paddus.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.paddus.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.paddus.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.paddus.b(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_sse2_paddus_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_paddus_w(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_paddus_w:			; AVX-LABEL: test_x86_sse2_paddus_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpaddusw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xdd,0xc1]			; AVX-NEXT: vpaddusw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xdd,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_paddus_w:			; AVX512VL-LABEL: test_x86_sse2_paddus_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpaddusw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xdd,0xc1]			; AVX512VL-NEXT: vpaddusw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xdd,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.paddus.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.paddus.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.paddus.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.paddus.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_pavg_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_pavg_b(<16 x i8> %a0, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_sse2_pavg_b:			; AVX-LABEL: test_x86_sse2_pavg_b:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpavgb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe0,0xc1]			; AVX-NEXT: vpavgb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe0,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pavg_b:			; AVX512VL-LABEL: test_x86_sse2_pavg_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpavgb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe0,0xc1]			; AVX512VL-NEXT: vpavgb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe0,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.pavg.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.pavg.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.pavg.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.pavg.b(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_sse2_pavg_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_pavg_w(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_pavg_w:			; AVX-LABEL: test_x86_sse2_pavg_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpavgw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe3,0xc1]			; AVX-NEXT: vpavgw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe3,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pavg_w:			; AVX512VL-LABEL: test_x86_sse2_pavg_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpavgw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe3,0xc1]			; AVX512VL-NEXT: vpavgw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe3,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.pavg.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.pavg.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.pavg.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.pavg.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <4 x i32> @test_x86_sse2_pmadd_wd(<8 x i16> %a0, <8 x i16> %a1) {			define <4 x i32> @test_x86_sse2_pmadd_wd(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_pmadd_wd:			; AVX-LABEL: test_x86_sse2_pmadd_wd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmaddwd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf5,0xc1]			; AVX-NEXT: vpmaddwd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf5,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pmadd_wd:			; AVX512VL-LABEL: test_x86_sse2_pmadd_wd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaddwd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xf5,0xc1]			; AVX512VL-NEXT: vpmaddwd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf5,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a0, <8 x i16> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a0, <8 x i16> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16>, <8 x i16>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16>, <8 x i16>) nounwind readnone


	define <8 x i16> @test_x86_sse2_pmaxs_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_pmaxs_w(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_pmaxs_w:			; AVX-LABEL: test_x86_sse2_pmaxs_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xee,0xc1]			; AVX-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xee,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pmaxs_w:			; AVX512VL-LABEL: test_x86_sse2_pmaxs_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xee,0xc1]			; AVX512VL-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xee,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.pmaxs.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.pmaxs.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.pmaxs.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.pmaxs.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_pmaxu_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_pmaxu_b(<16 x i8> %a0, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_sse2_pmaxu_b:			; AVX-LABEL: test_x86_sse2_pmaxu_b:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmaxub %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xde,0xc1]			; AVX-NEXT: vpmaxub %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xde,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pmaxu_b:			; AVX512VL-LABEL: test_x86_sse2_pmaxu_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaxub %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xde,0xc1]			; AVX512VL-NEXT: vpmaxub %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xde,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.pmaxu.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.pmaxu.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.pmaxu.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.pmaxu.b(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_sse2_pmins_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_pmins_w(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_pmins_w:			; AVX-LABEL: test_x86_sse2_pmins_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpminsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xea,0xc1]			; AVX-NEXT: vpminsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xea,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pmins_w:			; AVX512VL-LABEL: test_x86_sse2_pmins_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpminsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xea,0xc1]			; AVX512VL-NEXT: vpminsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xea,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.pmins.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.pmins.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.pmins.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.pmins.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_pminu_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_pminu_b(<16 x i8> %a0, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_sse2_pminu_b:			; AVX-LABEL: test_x86_sse2_pminu_b:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpminub %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xda,0xc1]			; AVX-NEXT: vpminub %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xda,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pminu_b:			; AVX512VL-LABEL: test_x86_sse2_pminu_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpminub %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xda,0xc1]			; AVX512VL-NEXT: vpminub %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xda,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.pminu.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.pminu.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.pminu.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.pminu.b(<16 x i8>, <16 x i8>) nounwind readnone


	define i32 @test_x86_sse2_pmovmskb_128(<16 x i8> %a0) {			define i32 @test_x86_sse2_pmovmskb_128(<16 x i8> %a0) {
	; (10 unchanged lines collapsed in the diff view)			; (10 unchanged lines collapsed in the diff view)
	define <8 x i16> @test_x86_sse2_pmulh_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_pmulh_w(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_pmulh_w:			; AVX-LABEL: test_x86_sse2_pmulh_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmulhw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe5,0xc1]			; AVX-NEXT: vpmulhw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe5,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pmulh_w:			; AVX512VL-LABEL: test_x86_sse2_pmulh_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmulhw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe5,0xc1]			; AVX512VL-NEXT: vpmulhw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe5,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.pmulh.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.pmulh.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.pmulh.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.pmulh.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <8 x i16> @test_x86_sse2_pmulhu_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_pmulhu_w(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_pmulhu_w:			; AVX-LABEL: test_x86_sse2_pmulhu_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmulhuw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe4,0xc1]			; AVX-NEXT: vpmulhuw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe4,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pmulhu_w:			; AVX512VL-LABEL: test_x86_sse2_pmulhu_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmulhuw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe4,0xc1]			; AVX512VL-NEXT: vpmulhuw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe4,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.pmulhu.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.pmulhu.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.pmulhu.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.pmulhu.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <2 x i64> @test_x86_sse2_pmulu_dq(<4 x i32> %a0, <4 x i32> %a1) {			define <2 x i64> @test_x86_sse2_pmulu_dq(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX-LABEL: test_x86_sse2_pmulu_dq:			; AVX-LABEL: test_x86_sse2_pmulu_dq:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmuludq %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf4,0xc1]			; AVX-NEXT: vpmuludq %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf4,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pmulu_dq:			; AVX512VL-LABEL: test_x86_sse2_pmulu_dq:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmuludq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xf4,0xc1]			; AVX512VL-NEXT: vpmuludq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf4,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse2.pmulu.dq(<4 x i32> %a0, <4 x i32> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse2.pmulu.dq(<4 x i32> %a0, <4 x i32> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse2.pmulu.dq(<4 x i32>, <4 x i32>) nounwind readnone			declare <2 x i64> @llvm.x86.sse2.pmulu.dq(<4 x i32>, <4 x i32>) nounwind readnone


	define <2 x i64> @test_x86_sse2_psad_bw(<16 x i8> %a0, <16 x i8> %a1) {			define <2 x i64> @test_x86_sse2_psad_bw(<16 x i8> %a0, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_sse2_psad_bw:			; AVX-LABEL: test_x86_sse2_psad_bw:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf6,0xc1]			; AVX-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf6,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psad_bw:			; AVX512VL-LABEL: test_x86_sse2_psad_bw:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xf6,0xc1]			; AVX512VL-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf6,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse2.psad.bw(<16 x i8> %a0, <16 x i8> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse2.psad.bw(<16 x i8> %a0, <16 x i8> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse2.psad.bw(<16 x i8>, <16 x i8>) nounwind readnone			declare <2 x i64> @llvm.x86.sse2.psad.bw(<16 x i8>, <16 x i8>) nounwind readnone


	define <4 x i32> @test_x86_sse2_psll_d(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse2_psll_d(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX-LABEL: test_x86_sse2_psll_d:			; AVX-LABEL: test_x86_sse2_psll_d:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpslld %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf2,0xc1]			; AVX-NEXT: vpslld %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf2,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psll_d:			; AVX512VL-LABEL: test_x86_sse2_psll_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpslld %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xf2,0xc1]			; AVX512VL-NEXT: vpslld %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf2,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.psll.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.psll.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.psll.d(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.psll.d(<4 x i32>, <4 x i32>) nounwind readnone


	define <2 x i64> @test_x86_sse2_psll_q(<2 x i64> %a0, <2 x i64> %a1) {			define <2 x i64> @test_x86_sse2_psll_q(<2 x i64> %a0, <2 x i64> %a1) {
	; AVX-LABEL: test_x86_sse2_psll_q:			; AVX-LABEL: test_x86_sse2_psll_q:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsllq %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf3,0xc1]			; AVX-NEXT: vpsllq %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf3,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psll_q:			; AVX512VL-LABEL: test_x86_sse2_psll_q:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsllq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xf3,0xc1]			; AVX512VL-NEXT: vpsllq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf3,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64>, <2 x i64>) nounwind readnone			declare <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64>, <2 x i64>) nounwind readnone


	define <8 x i16> @test_x86_sse2_psll_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_psll_w(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_psll_w:			; AVX-LABEL: test_x86_sse2_psll_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsllw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf1,0xc1]			; AVX-NEXT: vpsllw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf1,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psll_w:			; AVX512VL-LABEL: test_x86_sse2_psll_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsllw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xf1,0xc1]			; AVX512VL-NEXT: vpsllw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf1,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psll.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psll.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psll.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psll.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <4 x i32> @test_x86_sse2_pslli_d(<4 x i32> %a0) {			define <4 x i32> @test_x86_sse2_pslli_d(<4 x i32> %a0) {
	; AVX-LABEL: test_x86_sse2_pslli_d:			; AVX-LABEL: test_x86_sse2_pslli_d:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpslld $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x72,0xf0,0x07]			; AVX-NEXT: vpslld $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x72,0xf0,0x07]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pslli_d:			; AVX512VL-LABEL: test_x86_sse2_pslli_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpslld $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x72,0xf0,0x07]			; AVX512VL-NEXT: vpslld $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x72,0xf0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.pslli.d(<4 x i32> %a0, i32 7) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.pslli.d(<4 x i32> %a0, i32 7) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.pslli.d(<4 x i32>, i32) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.pslli.d(<4 x i32>, i32) nounwind readnone


	define <2 x i64> @test_x86_sse2_pslli_q(<2 x i64> %a0) {			define <2 x i64> @test_x86_sse2_pslli_q(<2 x i64> %a0) {
	; AVX-LABEL: test_x86_sse2_pslli_q:			; AVX-LABEL: test_x86_sse2_pslli_q:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsllq $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x73,0xf0,0x07]			; AVX-NEXT: vpsllq $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x73,0xf0,0x07]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pslli_q:			; AVX512VL-LABEL: test_x86_sse2_pslli_q:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsllq $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x73,0xf0,0x07]			; AVX512VL-NEXT: vpsllq $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x73,0xf0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse2.pslli.q(<2 x i64> %a0, i32 7) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse2.pslli.q(<2 x i64> %a0, i32 7) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse2.pslli.q(<2 x i64>, i32) nounwind readnone			declare <2 x i64> @llvm.x86.sse2.pslli.q(<2 x i64>, i32) nounwind readnone


	define <8 x i16> @test_x86_sse2_pslli_w(<8 x i16> %a0) {			define <8 x i16> @test_x86_sse2_pslli_w(<8 x i16> %a0) {
	; AVX-LABEL: test_x86_sse2_pslli_w:			; AVX-LABEL: test_x86_sse2_pslli_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsllw $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x71,0xf0,0x07]			; AVX-NEXT: vpsllw $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x71,0xf0,0x07]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_pslli_w:			; AVX512VL-LABEL: test_x86_sse2_pslli_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsllw $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x71,0xf0,0x07]			; AVX512VL-NEXT: vpsllw $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x71,0xf0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.pslli.w(<8 x i16> %a0, i32 7) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.pslli.w(<8 x i16> %a0, i32 7) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.pslli.w(<8 x i16>, i32) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.pslli.w(<8 x i16>, i32) nounwind readnone


	define <4 x i32> @test_x86_sse2_psra_d(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse2_psra_d(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX-LABEL: test_x86_sse2_psra_d:			; AVX-LABEL: test_x86_sse2_psra_d:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsrad %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe2,0xc1]			; AVX-NEXT: vpsrad %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe2,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psra_d:			; AVX512VL-LABEL: test_x86_sse2_psra_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrad %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe2,0xc1]			; AVX512VL-NEXT: vpsrad %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe2,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.psra.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.psra.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.psra.d(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.psra.d(<4 x i32>, <4 x i32>) nounwind readnone


	define <8 x i16> @test_x86_sse2_psra_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_psra_w(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_psra_w:			; AVX-LABEL: test_x86_sse2_psra_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsraw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe1,0xc1]			; AVX-NEXT: vpsraw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe1,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psra_w:			; AVX512VL-LABEL: test_x86_sse2_psra_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsraw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe1,0xc1]			; AVX512VL-NEXT: vpsraw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe1,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psra.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psra.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psra.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psra.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <4 x i32> @test_x86_sse2_psrai_d(<4 x i32> %a0) {			define <4 x i32> @test_x86_sse2_psrai_d(<4 x i32> %a0) {
	; AVX-LABEL: test_x86_sse2_psrai_d:			; AVX-LABEL: test_x86_sse2_psrai_d:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsrad $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x72,0xe0,0x07]			; AVX-NEXT: vpsrad $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x72,0xe0,0x07]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psrai_d:			; AVX512VL-LABEL: test_x86_sse2_psrai_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrad $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x72,0xe0,0x07]			; AVX512VL-NEXT: vpsrad $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x72,0xe0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.psrai.d(<4 x i32> %a0, i32 7) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.psrai.d(<4 x i32> %a0, i32 7) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.psrai.d(<4 x i32>, i32) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.psrai.d(<4 x i32>, i32) nounwind readnone


	define <8 x i16> @test_x86_sse2_psrai_w(<8 x i16> %a0) {			define <8 x i16> @test_x86_sse2_psrai_w(<8 x i16> %a0) {
	; AVX-LABEL: test_x86_sse2_psrai_w:			; AVX-LABEL: test_x86_sse2_psrai_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsraw $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x71,0xe0,0x07]			; AVX-NEXT: vpsraw $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x71,0xe0,0x07]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psrai_w:			; AVX512VL-LABEL: test_x86_sse2_psrai_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsraw $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x71,0xe0,0x07]			; AVX512VL-NEXT: vpsraw $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x71,0xe0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psrai.w(<8 x i16> %a0, i32 7) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psrai.w(<8 x i16> %a0, i32 7) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psrai.w(<8 x i16>, i32) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psrai.w(<8 x i16>, i32) nounwind readnone


	define <4 x i32> @test_x86_sse2_psrl_d(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse2_psrl_d(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX-LABEL: test_x86_sse2_psrl_d:			; AVX-LABEL: test_x86_sse2_psrl_d:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsrld %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd2,0xc1]			; AVX-NEXT: vpsrld %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd2,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psrl_d:			; AVX512VL-LABEL: test_x86_sse2_psrl_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrld %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd2,0xc1]			; AVX512VL-NEXT: vpsrld %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd2,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32>, <4 x i32>) nounwind readnone


	define <2 x i64> @test_x86_sse2_psrl_q(<2 x i64> %a0, <2 x i64> %a1) {			define <2 x i64> @test_x86_sse2_psrl_q(<2 x i64> %a0, <2 x i64> %a1) {
	; AVX-LABEL: test_x86_sse2_psrl_q:			; AVX-LABEL: test_x86_sse2_psrl_q:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsrlq %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd3,0xc1]			; AVX-NEXT: vpsrlq %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd3,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psrl_q:			; AVX512VL-LABEL: test_x86_sse2_psrl_q:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrlq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd3,0xc1]			; AVX512VL-NEXT: vpsrlq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd3,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse2.psrl.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse2.psrl.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse2.psrl.q(<2 x i64>, <2 x i64>) nounwind readnone			declare <2 x i64> @llvm.x86.sse2.psrl.q(<2 x i64>, <2 x i64>) nounwind readnone


	define <8 x i16> @test_x86_sse2_psrl_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_psrl_w(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_psrl_w:			; AVX-LABEL: test_x86_sse2_psrl_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsrlw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd1,0xc1]			; AVX-NEXT: vpsrlw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd1,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psrl_w:			; AVX512VL-LABEL: test_x86_sse2_psrl_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrlw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd1,0xc1]			; AVX512VL-NEXT: vpsrlw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd1,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psrl.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psrl.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psrl.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psrl.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <4 x i32> @test_x86_sse2_psrli_d(<4 x i32> %a0) {			define <4 x i32> @test_x86_sse2_psrli_d(<4 x i32> %a0) {
	; AVX-LABEL: test_x86_sse2_psrli_d:			; AVX-LABEL: test_x86_sse2_psrli_d:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsrld $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x72,0xd0,0x07]			; AVX-NEXT: vpsrld $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x72,0xd0,0x07]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psrli_d:			; AVX512VL-LABEL: test_x86_sse2_psrli_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrld $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x72,0xd0,0x07]			; AVX512VL-NEXT: vpsrld $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x72,0xd0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.psrli.d(<4 x i32> %a0, i32 7) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.psrli.d(<4 x i32> %a0, i32 7) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.psrli.d(<4 x i32>, i32) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.psrli.d(<4 x i32>, i32) nounwind readnone


	define <2 x i64> @test_x86_sse2_psrli_q(<2 x i64> %a0) {			define <2 x i64> @test_x86_sse2_psrli_q(<2 x i64> %a0) {
	; AVX-LABEL: test_x86_sse2_psrli_q:			; AVX-LABEL: test_x86_sse2_psrli_q:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsrlq $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x73,0xd0,0x07]			; AVX-NEXT: vpsrlq $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x73,0xd0,0x07]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psrli_q:			; AVX512VL-LABEL: test_x86_sse2_psrli_q:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrlq $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x73,0xd0,0x07]			; AVX512VL-NEXT: vpsrlq $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x73,0xd0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse2.psrli.q(<2 x i64> %a0, i32 7) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse2.psrli.q(<2 x i64> %a0, i32 7) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse2.psrli.q(<2 x i64>, i32) nounwind readnone			declare <2 x i64> @llvm.x86.sse2.psrli.q(<2 x i64>, i32) nounwind readnone


	define <8 x i16> @test_x86_sse2_psrli_w(<8 x i16> %a0) {			define <8 x i16> @test_x86_sse2_psrli_w(<8 x i16> %a0) {
	; AVX-LABEL: test_x86_sse2_psrli_w:			; AVX-LABEL: test_x86_sse2_psrli_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsrlw $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x71,0xd0,0x07]			; AVX-NEXT: vpsrlw $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x71,0xd0,0x07]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psrli_w:			; AVX512VL-LABEL: test_x86_sse2_psrli_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrlw $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x71,0xd0,0x07]			; AVX512VL-NEXT: vpsrlw $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x71,0xd0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psrli.w(<8 x i16> %a0, i32 7) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psrli.w(<8 x i16> %a0, i32 7) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psrli.w(<8 x i16>, i32) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psrli.w(<8 x i16>, i32) nounwind readnone


	define <16 x i8> @test_x86_sse2_psubs_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_psubs_b(<16 x i8> %a0, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_sse2_psubs_b:			; AVX-LABEL: test_x86_sse2_psubs_b:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsubsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe8,0xc1]			; AVX-NEXT: vpsubsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe8,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psubs_b:			; AVX512VL-LABEL: test_x86_sse2_psubs_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsubsb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe8,0xc1]			; AVX512VL-NEXT: vpsubsb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe8,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.psubs.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.psubs.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.psubs.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.psubs.b(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_sse2_psubs_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_psubs_w(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_psubs_w:			; AVX-LABEL: test_x86_sse2_psubs_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsubsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe9,0xc1]			; AVX-NEXT: vpsubsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe9,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psubs_w:			; AVX512VL-LABEL: test_x86_sse2_psubs_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsubsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe9,0xc1]			; AVX512VL-NEXT: vpsubsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe9,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psubs.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psubs.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psubs.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psubs.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_psubus_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_psubus_b(<16 x i8> %a0, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_sse2_psubus_b:			; AVX-LABEL: test_x86_sse2_psubus_b:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsubusb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd8,0xc1]			; AVX-NEXT: vpsubusb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd8,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psubus_b:			; AVX512VL-LABEL: test_x86_sse2_psubus_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsubusb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd8,0xc1]			; AVX512VL-NEXT: vpsubusb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd8,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.psubus.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.psubus.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.psubus.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.psubus.b(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_sse2_psubus_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_psubus_w(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse2_psubus_w:			; AVX-LABEL: test_x86_sse2_psubus_w:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpsubusw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd9,0xc1]			; AVX-NEXT: vpsubusw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd9,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_psubus_w:			; AVX512VL-LABEL: test_x86_sse2_psubus_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsubusw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd9,0xc1]			; AVX512VL-NEXT: vpsubusw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd9,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psubus.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psubus.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psubus.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psubus.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <2 x double> @test_x86_sse2_sqrt_pd(<2 x double> %a0) {			define <2 x double> @test_x86_sse2_sqrt_pd(<2 x double> %a0) {
	[... 25 lines not shown ...]
	; AVX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; AVX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; AVX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; AVX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; AVX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; AVX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_ucomieq_sd:			; AVX512VL-LABEL: test_x86_sse2_ucomieq_sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc1]			; AVX512VL-NEXT: vucomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc1]
	; AVX512VL-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; AVX512VL-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; AVX512VL-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; AVX512VL-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; AVX512VL-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; AVX512VL-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.ucomieq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.ucomieq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.ucomieq.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.ucomieq.sd(<2 x double>, <2 x double>) nounwind readnone


	define i32 @test_x86_sse2_ucomige_sd(<2 x double> %a0, <2 x double> %a1) {			define i32 @test_x86_sse2_ucomige_sd(<2 x double> %a0, <2 x double> %a1) {
	; AVX-LABEL: test_x86_sse2_ucomige_sd:			; AVX-LABEL: test_x86_sse2_ucomige_sd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2e,0xc1]			; AVX-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2e,0xc1]
	; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_ucomige_sd:			; AVX512VL-LABEL: test_x86_sse2_ucomige_sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc1]			; AVX512VL-NEXT: vucomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc1]
	; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.ucomige.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.ucomige.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.ucomige.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.ucomige.sd(<2 x double>, <2 x double>) nounwind readnone


	define i32 @test_x86_sse2_ucomigt_sd(<2 x double> %a0, <2 x double> %a1) {			define i32 @test_x86_sse2_ucomigt_sd(<2 x double> %a0, <2 x double> %a1) {
	; AVX-LABEL: test_x86_sse2_ucomigt_sd:			; AVX-LABEL: test_x86_sse2_ucomigt_sd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2e,0xc1]			; AVX-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2e,0xc1]
	; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_ucomigt_sd:			; AVX512VL-LABEL: test_x86_sse2_ucomigt_sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc1]			; AVX512VL-NEXT: vucomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc1]
	; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.ucomigt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.ucomigt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.ucomigt.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.ucomigt.sd(<2 x double>, <2 x double>) nounwind readnone


	define i32 @test_x86_sse2_ucomile_sd(<2 x double> %a0, <2 x double> %a1) {			define i32 @test_x86_sse2_ucomile_sd(<2 x double> %a0, <2 x double> %a1) {
	; AVX-LABEL: test_x86_sse2_ucomile_sd:			; AVX-LABEL: test_x86_sse2_ucomile_sd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vucomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2e,0xc8]			; AVX-NEXT: vucomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2e,0xc8]
	; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_ucomile_sd:			; AVX512VL-LABEL: test_x86_sse2_ucomile_sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vucomisd %xmm0, %xmm1 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc8]			; AVX512VL-NEXT: vucomisd %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc8]
	; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.ucomile.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.ucomile.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.ucomile.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.ucomile.sd(<2 x double>, <2 x double>) nounwind readnone


	define i32 @test_x86_sse2_ucomilt_sd(<2 x double> %a0, <2 x double> %a1) {			define i32 @test_x86_sse2_ucomilt_sd(<2 x double> %a0, <2 x double> %a1) {
	; AVX-LABEL: test_x86_sse2_ucomilt_sd:			; AVX-LABEL: test_x86_sse2_ucomilt_sd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vucomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2e,0xc8]			; AVX-NEXT: vucomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2e,0xc8]
	; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_ucomilt_sd:			; AVX512VL-LABEL: test_x86_sse2_ucomilt_sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vucomisd %xmm0, %xmm1 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc8]			; AVX512VL-NEXT: vucomisd %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc8]
	; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.ucomilt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.ucomilt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.ucomilt.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.ucomilt.sd(<2 x double>, <2 x double>) nounwind readnone


	define i32 @test_x86_sse2_ucomineq_sd(<2 x double> %a0, <2 x double> %a1) {			define i32 @test_x86_sse2_ucomineq_sd(<2 x double> %a0, <2 x double> %a1) {
	; AVX-LABEL: test_x86_sse2_ucomineq_sd:			; AVX-LABEL: test_x86_sse2_ucomineq_sd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2e,0xc1]			; AVX-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2e,0xc1]
	; AVX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; AVX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; AVX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; AVX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; AVX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; AVX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse2_ucomineq_sd:			; AVX512VL-LABEL: test_x86_sse2_ucomineq_sd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc1]			; AVX512VL-NEXT: vucomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc1]
	; AVX512VL-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; AVX512VL-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; AVX512VL-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; AVX512VL-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; AVX512VL-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; AVX512VL-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.ucomineq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.ucomineq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	[... 155 lines not shown ...]
	define <8 x i16> @test_x86_sse41_packusdw(<4 x i32> %a0, <4 x i32> %a1) {			define <8 x i16> @test_x86_sse41_packusdw(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX-LABEL: test_x86_sse41_packusdw:			; AVX-LABEL: test_x86_sse41_packusdw:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x2b,0xc1]			; AVX-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x2b,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse41_packusdw:			; AVX512VL-LABEL: test_x86_sse41_packusdw:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x2b,0xc1]			; AVX512VL-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x2b,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse41.packusdw(<4 x i32> %a0, <4 x i32> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse41.packusdw(<4 x i32> %a0, <4 x i32> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse41.packusdw(<4 x i32>, <4 x i32>) nounwind readnone			declare <8 x i16> @llvm.x86.sse41.packusdw(<4 x i32>, <4 x i32>) nounwind readnone


	define <16 x i8> @test_x86_sse41_pblendvb(<16 x i8> %a0, <16 x i8> %a1, <16 x i8> %a2) {			define <16 x i8> @test_x86_sse41_pblendvb(<16 x i8> %a0, <16 x i8> %a1, <16 x i8> %a2) {
	[... 21 lines not shown ...]
	define <16 x i8> @test_x86_sse41_pmaxsb(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse41_pmaxsb(<16 x i8> %a0, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_sse41_pmaxsb:			; AVX-LABEL: test_x86_sse41_pmaxsb:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmaxsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3c,0xc1]			; AVX-NEXT: vpmaxsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3c,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse41_pmaxsb:			; AVX512VL-LABEL: test_x86_sse41_pmaxsb:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaxsb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x3c,0xc1]			; AVX512VL-NEXT: vpmaxsb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3c,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse41.pmaxsb(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse41.pmaxsb(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse41.pmaxsb(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse41.pmaxsb(<16 x i8>, <16 x i8>) nounwind readnone


	define <4 x i32> @test_x86_sse41_pmaxsd(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse41_pmaxsd(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX-LABEL: test_x86_sse41_pmaxsd:			; AVX-LABEL: test_x86_sse41_pmaxsd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3d,0xc1]			; AVX-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3d,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse41_pmaxsd:			; AVX512VL-LABEL: test_x86_sse41_pmaxsd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x3d,0xc1]			; AVX512VL-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3d,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse41.pmaxsd(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse41.pmaxsd(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse41.pmaxsd(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse41.pmaxsd(<4 x i32>, <4 x i32>) nounwind readnone


	define <4 x i32> @test_x86_sse41_pmaxud(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse41_pmaxud(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX-LABEL: test_x86_sse41_pmaxud:			; AVX-LABEL: test_x86_sse41_pmaxud:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmaxud %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3f,0xc1]			; AVX-NEXT: vpmaxud %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3f,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse41_pmaxud:			; AVX512VL-LABEL: test_x86_sse41_pmaxud:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaxud %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x3f,0xc1]			; AVX512VL-NEXT: vpmaxud %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3f,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse41.pmaxud(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse41.pmaxud(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse41.pmaxud(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse41.pmaxud(<4 x i32>, <4 x i32>) nounwind readnone


	define <8 x i16> @test_x86_sse41_pmaxuw(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse41_pmaxuw(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse41_pmaxuw:			; AVX-LABEL: test_x86_sse41_pmaxuw:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmaxuw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3e,0xc1]			; AVX-NEXT: vpmaxuw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3e,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse41_pmaxuw:			; AVX512VL-LABEL: test_x86_sse41_pmaxuw:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaxuw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x3e,0xc1]			; AVX512VL-NEXT: vpmaxuw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3e,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse41.pmaxuw(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse41.pmaxuw(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse41.pmaxuw(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse41.pmaxuw(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse41_pminsb(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse41_pminsb(<16 x i8> %a0, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_sse41_pminsb:			; AVX-LABEL: test_x86_sse41_pminsb:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpminsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x38,0xc1]			; AVX-NEXT: vpminsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x38,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse41_pminsb:			; AVX512VL-LABEL: test_x86_sse41_pminsb:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpminsb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x38,0xc1]			; AVX512VL-NEXT: vpminsb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x38,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse41.pminsb(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse41.pminsb(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse41.pminsb(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse41.pminsb(<16 x i8>, <16 x i8>) nounwind readnone


	define <4 x i32> @test_x86_sse41_pminsd(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse41_pminsd(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX-LABEL: test_x86_sse41_pminsd:			; AVX-LABEL: test_x86_sse41_pminsd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpminsd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x39,0xc1]			; AVX-NEXT: vpminsd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x39,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse41_pminsd:			; AVX512VL-LABEL: test_x86_sse41_pminsd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpminsd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x39,0xc1]			; AVX512VL-NEXT: vpminsd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x39,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse41.pminsd(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse41.pminsd(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse41.pminsd(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse41.pminsd(<4 x i32>, <4 x i32>) nounwind readnone


	define <4 x i32> @test_x86_sse41_pminud(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse41_pminud(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX-LABEL: test_x86_sse41_pminud:			; AVX-LABEL: test_x86_sse41_pminud:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpminud %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3b,0xc1]			; AVX-NEXT: vpminud %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3b,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse41_pminud:			; AVX512VL-LABEL: test_x86_sse41_pminud:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpminud %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x3b,0xc1]			; AVX512VL-NEXT: vpminud %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3b,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse41.pminud(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse41.pminud(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse41.pminud(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse41.pminud(<4 x i32>, <4 x i32>) nounwind readnone


	define <8 x i16> @test_x86_sse41_pminuw(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse41_pminuw(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_sse41_pminuw:			; AVX-LABEL: test_x86_sse41_pminuw:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpminuw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3a,0xc1]			; AVX-NEXT: vpminuw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3a,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse41_pminuw:			; AVX512VL-LABEL: test_x86_sse41_pminuw:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpminuw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x3a,0xc1]			; AVX512VL-NEXT: vpminuw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3a,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse41.pminuw(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse41.pminuw(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse41.pminuw(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse41.pminuw(<8 x i16>, <8 x i16>) nounwind readnone


	define <2 x i64> @test_x86_sse41_pmuldq(<4 x i32> %a0, <4 x i32> %a1) {			define <2 x i64> @test_x86_sse41_pmuldq(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX-LABEL: test_x86_sse41_pmuldq:			; AVX-LABEL: test_x86_sse41_pmuldq:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmuldq %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x28,0xc1]			; AVX-NEXT: vpmuldq %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x28,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse41_pmuldq:			; AVX512VL-LABEL: test_x86_sse41_pmuldq:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmuldq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x28,0xc1]			; AVX512VL-NEXT: vpmuldq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x28,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse41.pmuldq(<4 x i32> %a0, <4 x i32> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse41.pmuldq(<4 x i32> %a0, <4 x i32> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse41.pmuldq(<4 x i32>, <4 x i32>) nounwind readnone			declare <2 x i64> @llvm.x86.sse41.pmuldq(<4 x i32>, <4 x i32>) nounwind readnone


	define i32 @test_x86_sse41_ptestc(<2 x i64> %a0, <2 x i64> %a1) {			define i32 @test_x86_sse41_ptestc(<2 x i64> %a0, <2 x i64> %a1) {
	[... 104 lines not shown ...]
	; AVX-NEXT: vpcmpestri $7, (%ecx), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x61,0x01,0x07]			; AVX-NEXT: vpcmpestri $7, (%ecx), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x61,0x01,0x07]
	; AVX-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]			; AVX-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse42_pcmpestri128_load:			; AVX512VL-LABEL: test_x86_sse42_pcmpestri128_load:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %ecx ## encoding: [0x8b,0x4c,0x24,0x08]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %ecx ## encoding: [0x8b,0x4c,0x24,0x08]
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX512VL-NEXT: vmovdqu8 (%eax), %xmm0 ## encoding: [0x62,0xf1,0x7f,0x08,0x6f,0x00]			; AVX512VL-NEXT: vmovdqu (%eax), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x6f,0x00]
	; AVX512VL-NEXT: movl $7, %eax ## encoding: [0xb8,0x07,0x00,0x00,0x00]			; AVX512VL-NEXT: movl $7, %eax ## encoding: [0xb8,0x07,0x00,0x00,0x00]
	; AVX512VL-NEXT: movl $7, %edx ## encoding: [0xba,0x07,0x00,0x00,0x00]			; AVX512VL-NEXT: movl $7, %edx ## encoding: [0xba,0x07,0x00,0x00,0x00]
	; AVX512VL-NEXT: vpcmpestri $7, (%ecx), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x61,0x01,0x07]			; AVX512VL-NEXT: vpcmpestri $7, (%ecx), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x61,0x01,0x07]
	; AVX512VL-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]			; AVX512VL-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%1 = load <16 x i8>, <16 x i8>* %a0			%1 = load <16 x i8>, <16 x i8>* %a0
	%2 = load <16 x i8>, <16 x i8>* %a2			%2 = load <16 x i8>, <16 x i8>* %a2
	%res = call i32 @llvm.x86.sse42.pcmpestri128(<16 x i8> %1, i32 7, <16 x i8> %2, i32 7, i8 7) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse42.pcmpestri128(<16 x i8> %1, i32 7, <16 x i8> %2, i32 7, i8 7) ; <i32> [#uses=1]
	; AVX-NEXT: vpcmpistri $7, (%eax), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x63,0x00,0x07]			; AVX-NEXT: vpcmpistri $7, (%eax), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x63,0x00,0x07]
	; AVX-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]			; AVX-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse42_pcmpistri128_load:			; AVX512VL-LABEL: test_x86_sse42_pcmpistri128_load:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08]
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %ecx ## encoding: [0x8b,0x4c,0x24,0x04]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %ecx ## encoding: [0x8b,0x4c,0x24,0x04]
	; AVX512VL-NEXT: vmovdqu8 (%ecx), %xmm0 ## encoding: [0x62,0xf1,0x7f,0x08,0x6f,0x01]			; AVX512VL-NEXT: vmovdqu (%ecx), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x6f,0x01]
	; AVX512VL-NEXT: vpcmpistri $7, (%eax), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x63,0x00,0x07]			; AVX512VL-NEXT: vpcmpistri $7, (%eax), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x63,0x00,0x07]
	; AVX512VL-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]			; AVX512VL-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%1 = load <16 x i8>, <16 x i8>* %a0			%1 = load <16 x i8>, <16 x i8>* %a0
	%2 = load <16 x i8>, <16 x i8>* %a1			%2 = load <16 x i8>, <16 x i8>* %a1
	%res = call i32 @llvm.x86.sse42.pcmpistri128(<16 x i8> %1, <16 x i8> %2, i8 7) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse42.pcmpistri128(<16 x i8> %1, <16 x i8> %2, i8 7) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	; AVX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; AVX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; AVX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; AVX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; AVX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; AVX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_comieq_ss:			; AVX512VL-LABEL: test_x86_sse_comieq_ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc1]			; AVX512VL-NEXT: vcomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc1]
	; AVX512VL-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; AVX512VL-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; AVX512VL-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; AVX512VL-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; AVX512VL-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; AVX512VL-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.comieq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.comieq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.comieq.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.comieq.ss(<4 x float>, <4 x float>) nounwind readnone


	define i32 @test_x86_sse_comige_ss(<4 x float> %a0, <4 x float> %a1) {			define i32 @test_x86_sse_comige_ss(<4 x float> %a0, <4 x float> %a1) {
	; AVX-LABEL: test_x86_sse_comige_ss:			; AVX-LABEL: test_x86_sse_comige_ss:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2f,0xc1]			; AVX-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2f,0xc1]
	; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_comige_ss:			; AVX512VL-LABEL: test_x86_sse_comige_ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc1]			; AVX512VL-NEXT: vcomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc1]
	; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.comige.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.comige.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.comige.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.comige.ss(<4 x float>, <4 x float>) nounwind readnone


	define i32 @test_x86_sse_comigt_ss(<4 x float> %a0, <4 x float> %a1) {			define i32 @test_x86_sse_comigt_ss(<4 x float> %a0, <4 x float> %a1) {
	; AVX-LABEL: test_x86_sse_comigt_ss:			; AVX-LABEL: test_x86_sse_comigt_ss:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2f,0xc1]			; AVX-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2f,0xc1]
	; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_comigt_ss:			; AVX512VL-LABEL: test_x86_sse_comigt_ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc1]			; AVX512VL-NEXT: vcomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc1]
	; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.comigt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.comigt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.comigt.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.comigt.ss(<4 x float>, <4 x float>) nounwind readnone


	define i32 @test_x86_sse_comile_ss(<4 x float> %a0, <4 x float> %a1) {			define i32 @test_x86_sse_comile_ss(<4 x float> %a0, <4 x float> %a1) {
	; AVX-LABEL: test_x86_sse_comile_ss:			; AVX-LABEL: test_x86_sse_comile_ss:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vcomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2f,0xc8]			; AVX-NEXT: vcomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2f,0xc8]
	; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_comile_ss:			; AVX512VL-LABEL: test_x86_sse_comile_ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vcomiss %xmm0, %xmm1 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc8]			; AVX512VL-NEXT: vcomiss %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc8]
	; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.comile.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.comile.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.comile.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.comile.ss(<4 x float>, <4 x float>) nounwind readnone


	define i32 @test_x86_sse_comilt_ss(<4 x float> %a0, <4 x float> %a1) {			define i32 @test_x86_sse_comilt_ss(<4 x float> %a0, <4 x float> %a1) {
	; AVX-LABEL: test_x86_sse_comilt_ss:			; AVX-LABEL: test_x86_sse_comilt_ss:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vcomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2f,0xc8]			; AVX-NEXT: vcomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2f,0xc8]
	; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_comilt_ss:			; AVX512VL-LABEL: test_x86_sse_comilt_ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vcomiss %xmm0, %xmm1 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc8]			; AVX512VL-NEXT: vcomiss %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc8]
	; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.comilt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.comilt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.comilt.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.comilt.ss(<4 x float>, <4 x float>) nounwind readnone


	define i32 @test_x86_sse_comineq_ss(<4 x float> %a0, <4 x float> %a1) {			define i32 @test_x86_sse_comineq_ss(<4 x float> %a0, <4 x float> %a1) {
	; AVX-LABEL: test_x86_sse_comineq_ss:			; AVX-LABEL: test_x86_sse_comineq_ss:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2f,0xc1]			; AVX-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2f,0xc1]
	; AVX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; AVX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; AVX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; AVX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; AVX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; AVX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_comineq_ss:			; AVX512VL-LABEL: test_x86_sse_comineq_ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc1]			; AVX512VL-NEXT: vcomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc1]
	; AVX512VL-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; AVX512VL-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; AVX512VL-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; AVX512VL-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; AVX512VL-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; AVX512VL-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.comineq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.comineq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.comineq.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.comineq.ss(<4 x float>, <4 x float>) nounwind readnone


	define <4 x float> @test_x86_sse_cvtsi2ss(<4 x float> %a0) {			define <4 x float> @test_x86_sse_cvtsi2ss(<4 x float> %a0) {
	; AVX-LABEL: test_x86_sse_cvtsi2ss:			; AVX-LABEL: test_x86_sse_cvtsi2ss:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: movl $7, %eax ## encoding: [0xb8,0x07,0x00,0x00,0x00]			; AVX-NEXT: movl $7, %eax ## encoding: [0xb8,0x07,0x00,0x00,0x00]
	; AVX-NEXT: vcvtsi2ssl %eax, %xmm0, %xmm0 ## encoding: [0xc5,0xfa,0x2a,0xc0]			; AVX-NEXT: vcvtsi2ssl %eax, %xmm0, %xmm0 ## encoding: [0xc5,0xfa,0x2a,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_cvtsi2ss:			; AVX512VL-LABEL: test_x86_sse_cvtsi2ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: movl $7, %eax ## encoding: [0xb8,0x07,0x00,0x00,0x00]			; AVX512VL-NEXT: movl $7, %eax ## encoding: [0xb8,0x07,0x00,0x00,0x00]
	; AVX512VL-NEXT: vcvtsi2ssl %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7e,0x08,0x2a,0xc0]			; AVX512VL-NEXT: vcvtsi2ssl %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x2a,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.sse.cvtsi2ss(<4 x float> %a0, i32 7) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.sse.cvtsi2ss(<4 x float> %a0, i32 7) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	declare <4 x float> @llvm.x86.sse.cvtsi2ss(<4 x float>, i32) nounwind readnone			declare <4 x float> @llvm.x86.sse.cvtsi2ss(<4 x float>, i32) nounwind readnone


	define i32 @test_x86_sse_cvtss2si(<4 x float> %a0) {			define i32 @test_x86_sse_cvtss2si(<4 x float> %a0) {
	; AVX-LABEL: test_x86_sse_cvtss2si:			; AVX-LABEL: test_x86_sse_cvtss2si:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvtss2si %xmm0, %eax ## encoding: [0xc5,0xfa,0x2d,0xc0]			; AVX-NEXT: vcvtss2si %xmm0, %eax ## encoding: [0xc5,0xfa,0x2d,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_cvtss2si:			; AVX512VL-LABEL: test_x86_sse_cvtss2si:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvtss2si %xmm0, %eax ## encoding: [0x62,0xf1,0x7e,0x08,0x2d,0xc0]			; AVX512VL-NEXT: vcvtss2si %xmm0, %eax ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x2d,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.cvtss2si(<4 x float> %a0) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.cvtss2si(<4 x float> %a0) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.cvtss2si(<4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.cvtss2si(<4 x float>) nounwind readnone


	define i32 @test_x86_sse_cvttss2si(<4 x float> %a0) {			define i32 @test_x86_sse_cvttss2si(<4 x float> %a0) {
	; AVX-LABEL: test_x86_sse_cvttss2si:			; AVX-LABEL: test_x86_sse_cvttss2si:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvttss2si %xmm0, %eax ## encoding: [0xc5,0xfa,0x2c,0xc0]			; AVX-NEXT: vcvttss2si %xmm0, %eax ## encoding: [0xc5,0xfa,0x2c,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_cvttss2si:			; AVX512VL-LABEL: test_x86_sse_cvttss2si:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvttss2si %xmm0, %eax ## encoding: [0x62,0xf1,0x7e,0x08,0x2c,0xc0]			; AVX512VL-NEXT: vcvttss2si %xmm0, %eax ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x2c,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.cvttss2si(<4 x float> %a0) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.cvttss2si(<4 x float> %a0) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.cvttss2si(<4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.cvttss2si(<4 x float>) nounwind readnone


	define void @test_x86_sse_ldmxcsr(i8* %a0) {			define void @test_x86_sse_ldmxcsr(i8* %a0) {
	define <4 x float> @test_x86_sse_max_ps(<4 x float> %a0, <4 x float> %a1) {			define <4 x float> @test_x86_sse_max_ps(<4 x float> %a0, <4 x float> %a1) {
	; AVX-LABEL: test_x86_sse_max_ps:			; AVX-LABEL: test_x86_sse_max_ps:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vmaxps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x5f,0xc1]			; AVX-NEXT: vmaxps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x5f,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_max_ps:			; AVX512VL-LABEL: test_x86_sse_max_ps:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vmaxps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x5f,0xc1]			; AVX512VL-NEXT: vmaxps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5f,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.sse.max.ps(<4 x float> %a0, <4 x float> %a1) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.sse.max.ps(<4 x float> %a0, <4 x float> %a1) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	declare <4 x float> @llvm.x86.sse.max.ps(<4 x float>, <4 x float>) nounwind readnone			declare <4 x float> @llvm.x86.sse.max.ps(<4 x float>, <4 x float>) nounwind readnone


	define <4 x float> @test_x86_sse_max_ss(<4 x float> %a0, <4 x float> %a1) {			define <4 x float> @test_x86_sse_max_ss(<4 x float> %a0, <4 x float> %a1) {
	define <4 x float> @test_x86_sse_min_ps(<4 x float> %a0, <4 x float> %a1) {			define <4 x float> @test_x86_sse_min_ps(<4 x float> %a0, <4 x float> %a1) {
	; AVX-LABEL: test_x86_sse_min_ps:			; AVX-LABEL: test_x86_sse_min_ps:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vminps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x5d,0xc1]			; AVX-NEXT: vminps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x5d,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_min_ps:			; AVX512VL-LABEL: test_x86_sse_min_ps:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vminps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x5d,0xc1]			; AVX512VL-NEXT: vminps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5d,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.sse.min.ps(<4 x float> %a0, <4 x float> %a1) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.sse.min.ps(<4 x float> %a0, <4 x float> %a1) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	declare <4 x float> @llvm.x86.sse.min.ps(<4 x float>, <4 x float>) nounwind readnone			declare <4 x float> @llvm.x86.sse.min.ps(<4 x float>, <4 x float>) nounwind readnone


	define <4 x float> @test_x86_sse_min_ss(<4 x float> %a0, <4 x float> %a1) {			define <4 x float> @test_x86_sse_min_ss(<4 x float> %a0, <4 x float> %a1) {
	; AVX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; AVX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; AVX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; AVX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; AVX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; AVX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_ucomieq_ss:			; AVX512VL-LABEL: test_x86_sse_ucomieq_ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2e,0xc1]			; AVX512VL-NEXT: vucomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2e,0xc1]
	; AVX512VL-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; AVX512VL-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; AVX512VL-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; AVX512VL-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; AVX512VL-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; AVX512VL-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.ucomieq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.ucomieq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.ucomieq.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.ucomieq.ss(<4 x float>, <4 x float>) nounwind readnone


	define i32 @test_x86_sse_ucomige_ss(<4 x float> %a0, <4 x float> %a1) {			define i32 @test_x86_sse_ucomige_ss(<4 x float> %a0, <4 x float> %a1) {
	; AVX-LABEL: test_x86_sse_ucomige_ss:			; AVX-LABEL: test_x86_sse_ucomige_ss:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2e,0xc1]			; AVX-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2e,0xc1]
	; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_ucomige_ss:			; AVX512VL-LABEL: test_x86_sse_ucomige_ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2e,0xc1]			; AVX512VL-NEXT: vucomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2e,0xc1]
	; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.ucomige.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.ucomige.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.ucomige.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.ucomige.ss(<4 x float>, <4 x float>) nounwind readnone


	define i32 @test_x86_sse_ucomigt_ss(<4 x float> %a0, <4 x float> %a1) {			define i32 @test_x86_sse_ucomigt_ss(<4 x float> %a0, <4 x float> %a1) {
	; AVX-LABEL: test_x86_sse_ucomigt_ss:			; AVX-LABEL: test_x86_sse_ucomigt_ss:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2e,0xc1]			; AVX-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2e,0xc1]
	; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_ucomigt_ss:			; AVX512VL-LABEL: test_x86_sse_ucomigt_ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2e,0xc1]			; AVX512VL-NEXT: vucomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2e,0xc1]
	; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.ucomigt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.ucomigt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.ucomigt.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.ucomigt.ss(<4 x float>, <4 x float>) nounwind readnone


	define i32 @test_x86_sse_ucomile_ss(<4 x float> %a0, <4 x float> %a1) {			define i32 @test_x86_sse_ucomile_ss(<4 x float> %a0, <4 x float> %a1) {
	; AVX-LABEL: test_x86_sse_ucomile_ss:			; AVX-LABEL: test_x86_sse_ucomile_ss:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vucomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2e,0xc8]			; AVX-NEXT: vucomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2e,0xc8]
	; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_ucomile_ss:			; AVX512VL-LABEL: test_x86_sse_ucomile_ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vucomiss %xmm0, %xmm1 ## encoding: [0x62,0xf1,0x7c,0x08,0x2e,0xc8]			; AVX512VL-NEXT: vucomiss %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2e,0xc8]
	; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX512VL-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.ucomile.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.ucomile.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.ucomile.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.ucomile.ss(<4 x float>, <4 x float>) nounwind readnone


	define i32 @test_x86_sse_ucomilt_ss(<4 x float> %a0, <4 x float> %a1) {			define i32 @test_x86_sse_ucomilt_ss(<4 x float> %a0, <4 x float> %a1) {
	; AVX-LABEL: test_x86_sse_ucomilt_ss:			; AVX-LABEL: test_x86_sse_ucomilt_ss:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX-NEXT: vucomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2e,0xc8]			; AVX-NEXT: vucomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2e,0xc8]
	; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_ucomilt_ss:			; AVX512VL-LABEL: test_x86_sse_ucomilt_ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX512VL-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX512VL-NEXT: vucomiss %xmm0, %xmm1 ## encoding: [0x62,0xf1,0x7c,0x08,0x2e,0xc8]			; AVX512VL-NEXT: vucomiss %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2e,0xc8]
	; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX512VL-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.ucomilt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.ucomilt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.ucomilt.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.ucomilt.ss(<4 x float>, <4 x float>) nounwind readnone


	define i32 @test_x86_sse_ucomineq_ss(<4 x float> %a0, <4 x float> %a1) {			define i32 @test_x86_sse_ucomineq_ss(<4 x float> %a0, <4 x float> %a1) {
	; AVX-LABEL: test_x86_sse_ucomineq_ss:			; AVX-LABEL: test_x86_sse_ucomineq_ss:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2e,0xc1]			; AVX-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2e,0xc1]
	; AVX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; AVX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; AVX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; AVX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; AVX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; AVX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_sse_ucomineq_ss:			; AVX512VL-LABEL: test_x86_sse_ucomineq_ss:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2e,0xc1]			; AVX512VL-NEXT: vucomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2e,0xc1]
	; AVX512VL-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; AVX512VL-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; AVX512VL-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; AVX512VL-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; AVX512VL-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; AVX512VL-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX512VL-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.ucomineq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.ucomineq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.ucomineq.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.ucomineq.ss(<4 x float>, <4 x float>) nounwind readnone


	define <16 x i8> @test_x86_ssse3_pabs_b_128(<16 x i8> %a0) {			define <16 x i8> @test_x86_ssse3_pabs_b_128(<16 x i8> %a0) {
	; AVX-LABEL: test_x86_ssse3_pabs_b_128:			; AVX-LABEL: test_x86_ssse3_pabs_b_128:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpabsb %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x1c,0xc0]			; AVX-NEXT: vpabsb %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x1c,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_ssse3_pabs_b_128:			; AVX512VL-LABEL: test_x86_ssse3_pabs_b_128:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpabsb %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x1c,0xc0]			; AVX512VL-NEXT: vpabsb %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x1c,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.ssse3.pabs.b.128(<16 x i8> %a0) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.ssse3.pabs.b.128(<16 x i8> %a0) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.ssse3.pabs.b.128(<16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.ssse3.pabs.b.128(<16 x i8>) nounwind readnone


	define <4 x i32> @test_x86_ssse3_pabs_d_128(<4 x i32> %a0) {			define <4 x i32> @test_x86_ssse3_pabs_d_128(<4 x i32> %a0) {
	; AVX-LABEL: test_x86_ssse3_pabs_d_128:			; AVX-LABEL: test_x86_ssse3_pabs_d_128:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpabsd %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x1e,0xc0]			; AVX-NEXT: vpabsd %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x1e,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_ssse3_pabs_d_128:			; AVX512VL-LABEL: test_x86_ssse3_pabs_d_128:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpabsd %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x1e,0xc0]			; AVX512VL-NEXT: vpabsd %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x1e,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.ssse3.pabs.d.128(<4 x i32> %a0) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.ssse3.pabs.d.128(<4 x i32> %a0) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.ssse3.pabs.d.128(<4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.ssse3.pabs.d.128(<4 x i32>) nounwind readnone


	define <8 x i16> @test_x86_ssse3_pabs_w_128(<8 x i16> %a0) {			define <8 x i16> @test_x86_ssse3_pabs_w_128(<8 x i16> %a0) {
	; AVX-LABEL: test_x86_ssse3_pabs_w_128:			; AVX-LABEL: test_x86_ssse3_pabs_w_128:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpabsw %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x1d,0xc0]			; AVX-NEXT: vpabsw %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x1d,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_ssse3_pabs_w_128:			; AVX512VL-LABEL: test_x86_ssse3_pabs_w_128:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpabsw %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x1d,0xc0]			; AVX512VL-NEXT: vpabsw %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x1d,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.ssse3.pabs.w.128(<8 x i16> %a0) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.ssse3.pabs.w.128(<8 x i16> %a0) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.ssse3.pabs.w.128(<8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.ssse3.pabs.w.128(<8 x i16>) nounwind readnone


	define <4 x i32> @test_x86_ssse3_phadd_d_128(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_ssse3_phadd_d_128(<4 x i32> %a0, <4 x i32> %a1) {
	; ... 65 lines not shown ...
	define <8 x i16> @test_x86_ssse3_pmadd_ub_sw_128(<16 x i8> %a0, <16 x i8> %a1) {			define <8 x i16> @test_x86_ssse3_pmadd_ub_sw_128(<16 x i8> %a0, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_ssse3_pmadd_ub_sw_128:			; AVX-LABEL: test_x86_ssse3_pmadd_ub_sw_128:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmaddubsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x04,0xc1]			; AVX-NEXT: vpmaddubsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x04,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_ssse3_pmadd_ub_sw_128:			; AVX512VL-LABEL: test_x86_ssse3_pmadd_ub_sw_128:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaddubsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x04,0xc1]			; AVX512VL-NEXT: vpmaddubsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x04,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8> %a0, <16 x i8> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8> %a0, <16 x i8> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8>, <16 x i8>) nounwind readnone			declare <8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8>, <16 x i8>) nounwind readnone


	; Make sure we don't commute this operation.			; Make sure we don't commute this operation.
	define <8 x i16> @test_x86_ssse3_pmadd_ub_sw_128_load_op0(<16 x i8>* %ptr, <16 x i8> %a1) {			define <8 x i16> @test_x86_ssse3_pmadd_ub_sw_128_load_op0(<16 x i8>* %ptr, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_ssse3_pmadd_ub_sw_128_load_op0:			; AVX-LABEL: test_x86_ssse3_pmadd_ub_sw_128_load_op0:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX-NEXT: vmovdqa (%eax), %xmm1 ## encoding: [0xc5,0xf9,0x6f,0x08]			; AVX-NEXT: vmovdqa (%eax), %xmm1 ## encoding: [0xc5,0xf9,0x6f,0x08]
	; AVX-NEXT: vpmaddubsw %xmm0, %xmm1, %xmm0 ## encoding: [0xc4,0xe2,0x71,0x04,0xc0]			; AVX-NEXT: vpmaddubsw %xmm0, %xmm1, %xmm0 ## encoding: [0xc4,0xe2,0x71,0x04,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_ssse3_pmadd_ub_sw_128_load_op0:			; AVX512VL-LABEL: test_x86_ssse3_pmadd_ub_sw_128_load_op0:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX512VL-NEXT: vmovdqu8 (%eax), %xmm1 ## encoding: [0x62,0xf1,0x7f,0x08,0x6f,0x08]			; AVX512VL-NEXT: vmovdqu (%eax), %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x6f,0x08]
	; AVX512VL-NEXT: vpmaddubsw %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf2,0x75,0x08,0x04,0xc0]			; AVX512VL-NEXT: vpmaddubsw %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x71,0x04,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%a0 = load <16 x i8>, <16 x i8>* %ptr			%a0 = load <16 x i8>, <16 x i8>* %ptr
	%res = call <8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8> %a0, <16 x i8> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8> %a0, <16 x i8> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}


	define <8 x i16> @test_x86_ssse3_pmul_hr_sw_128(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_ssse3_pmul_hr_sw_128(<8 x i16> %a0, <8 x i16> %a1) {
	; AVX-LABEL: test_x86_ssse3_pmul_hr_sw_128:			; AVX-LABEL: test_x86_ssse3_pmul_hr_sw_128:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpmulhrsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x0b,0xc1]			; AVX-NEXT: vpmulhrsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x0b,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_ssse3_pmul_hr_sw_128:			; AVX512VL-LABEL: test_x86_ssse3_pmul_hr_sw_128:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmulhrsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x0b,0xc1]			; AVX512VL-NEXT: vpmulhrsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x0b,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.ssse3.pmul.hr.sw.128(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.ssse3.pmul.hr.sw.128(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.ssse3.pmul.hr.sw.128(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.ssse3.pmul.hr.sw.128(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_ssse3_pshuf_b_128(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_ssse3_pshuf_b_128(<16 x i8> %a0, <16 x i8> %a1) {
	; AVX-LABEL: test_x86_ssse3_pshuf_b_128:			; AVX-LABEL: test_x86_ssse3_pshuf_b_128:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpshufb %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x00,0xc1]			; AVX-NEXT: vpshufb %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x00,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_ssse3_pshuf_b_128:			; AVX512VL-LABEL: test_x86_ssse3_pshuf_b_128:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpshufb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x00,0xc1]			; AVX512VL-NEXT: vpshufb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x00,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>) nounwind readnone


	define <16 x i8> @test_x86_ssse3_psign_b_128(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_ssse3_psign_b_128(<16 x i8> %a0, <16 x i8> %a1) {
	; ... 170 lines not shown ...
	; AVX-LABEL: test_x86_avx_cvt_pd2_ps_256:			; AVX-LABEL: test_x86_avx_cvt_pd2_ps_256:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvtpd2ps %ymm0, %xmm0 ## encoding: [0xc5,0xfd,0x5a,0xc0]			; AVX-NEXT: vcvtpd2ps %ymm0, %xmm0 ## encoding: [0xc5,0xfd,0x5a,0xc0]
	; AVX-NEXT: vzeroupper ## encoding: [0xc5,0xf8,0x77]			; AVX-NEXT: vzeroupper ## encoding: [0xc5,0xf8,0x77]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_cvt_pd2_ps_256:			; AVX512VL-LABEL: test_x86_avx_cvt_pd2_ps_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvtpd2ps %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x5a,0xc0]			; AVX512VL-NEXT: vcvtpd2ps %ymm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x5a,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.avx.cvt.pd2.ps.256(<4 x double> %a0) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.avx.cvt.pd2.ps.256(<4 x double> %a0) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	declare <4 x float> @llvm.x86.avx.cvt.pd2.ps.256(<4 x double>) nounwind readnone			declare <4 x float> @llvm.x86.avx.cvt.pd2.ps.256(<4 x double>) nounwind readnone


	define <4 x i32> @test_x86_avx_cvt_pd2dq_256(<4 x double> %a0) {			define <4 x i32> @test_x86_avx_cvt_pd2dq_256(<4 x double> %a0) {
	; AVX-LABEL: test_x86_avx_cvt_pd2dq_256:			; AVX-LABEL: test_x86_avx_cvt_pd2dq_256:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvtpd2dq %ymm0, %xmm0 ## encoding: [0xc5,0xff,0xe6,0xc0]			; AVX-NEXT: vcvtpd2dq %ymm0, %xmm0 ## encoding: [0xc5,0xff,0xe6,0xc0]
	; AVX-NEXT: vzeroupper ## encoding: [0xc5,0xf8,0x77]			; AVX-NEXT: vzeroupper ## encoding: [0xc5,0xf8,0x77]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_cvt_pd2dq_256:			; AVX512VL-LABEL: test_x86_avx_cvt_pd2dq_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvtpd2dq %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x28,0xe6,0xc0]			; AVX512VL-NEXT: vcvtpd2dq %ymm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xff,0xe6,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.avx.cvt.pd2dq.256(<4 x double> %a0) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.avx.cvt.pd2dq.256(<4 x double> %a0) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.avx.cvt.pd2dq.256(<4 x double>) nounwind readnone			declare <4 x i32> @llvm.x86.avx.cvt.pd2dq.256(<4 x double>) nounwind readnone


	define <8 x i32> @test_x86_avx_cvt_ps2dq_256(<8 x float> %a0) {			define <8 x i32> @test_x86_avx_cvt_ps2dq_256(<8 x float> %a0) {
	; ... 10 lines not shown ...
	define <8 x float> @test_x86_avx_cvtdq2_ps_256(<8 x i32> %a0) {			define <8 x float> @test_x86_avx_cvtdq2_ps_256(<8 x i32> %a0) {
	; AVX-LABEL: test_x86_avx_cvtdq2_ps_256:			; AVX-LABEL: test_x86_avx_cvtdq2_ps_256:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvtdq2ps %ymm0, %ymm0 ## encoding: [0xc5,0xfc,0x5b,0xc0]			; AVX-NEXT: vcvtdq2ps %ymm0, %ymm0 ## encoding: [0xc5,0xfc,0x5b,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_cvtdq2_ps_256:			; AVX512VL-LABEL: test_x86_avx_cvtdq2_ps_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvtdq2ps %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x5b,0xc0]			; AVX512VL-NEXT: vcvtdq2ps %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x5b,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x float> @llvm.x86.avx.cvtdq2.ps.256(<8 x i32> %a0) ; <<8 x float>> [#uses=1]			%res = call <8 x float> @llvm.x86.avx.cvtdq2.ps.256(<8 x i32> %a0) ; <<8 x float>> [#uses=1]
	ret <8 x float> %res			ret <8 x float> %res
	}			}
	declare <8 x float> @llvm.x86.avx.cvtdq2.ps.256(<8 x i32>) nounwind readnone			declare <8 x float> @llvm.x86.avx.cvtdq2.ps.256(<8 x i32>) nounwind readnone


	define <4 x i32> @test_x86_avx_cvtt_pd2dq_256(<4 x double> %a0) {			define <4 x i32> @test_x86_avx_cvtt_pd2dq_256(<4 x double> %a0) {
	; AVX-LABEL: test_x86_avx_cvtt_pd2dq_256:			; AVX-LABEL: test_x86_avx_cvtt_pd2dq_256:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvttpd2dq %ymm0, %xmm0 ## encoding: [0xc5,0xfd,0xe6,0xc0]			; AVX-NEXT: vcvttpd2dq %ymm0, %xmm0 ## encoding: [0xc5,0xfd,0xe6,0xc0]
	; AVX-NEXT: vzeroupper ## encoding: [0xc5,0xf8,0x77]			; AVX-NEXT: vzeroupper ## encoding: [0xc5,0xf8,0x77]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_cvtt_pd2dq_256:			; AVX512VL-LABEL: test_x86_avx_cvtt_pd2dq_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvttpd2dq %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xe6,0xc0]			; AVX512VL-NEXT: vcvttpd2dq %ymm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe6,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.avx.cvtt.pd2dq.256(<4 x double> %a0) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.avx.cvtt.pd2dq.256(<4 x double> %a0) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.avx.cvtt.pd2dq.256(<4 x double>) nounwind readnone			declare <4 x i32> @llvm.x86.avx.cvtt.pd2dq.256(<4 x double>) nounwind readnone


	define <8 x i32> @test_x86_avx_cvtt_ps2dq_256(<8 x float> %a0) {			define <8 x i32> @test_x86_avx_cvtt_ps2dq_256(<8 x float> %a0) {
	; AVX-LABEL: test_x86_avx_cvtt_ps2dq_256:			; AVX-LABEL: test_x86_avx_cvtt_ps2dq_256:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vcvttps2dq %ymm0, %ymm0 ## encoding: [0xc5,0xfe,0x5b,0xc0]			; AVX-NEXT: vcvttps2dq %ymm0, %ymm0 ## encoding: [0xc5,0xfe,0x5b,0xc0]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_cvtt_ps2dq_256:			; AVX512VL-LABEL: test_x86_avx_cvtt_ps2dq_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vcvttps2dq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7e,0x28,0x5b,0xc0]			; AVX512VL-NEXT: vcvttps2dq %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x5b,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx.cvtt.ps2dq.256(<8 x float> %a0) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx.cvtt.ps2dq.256(<8 x float> %a0) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx.cvtt.ps2dq.256(<8 x float>) nounwind readnone			declare <8 x i32> @llvm.x86.avx.cvtt.ps2dq.256(<8 x float>) nounwind readnone


	define <8 x float> @test_x86_avx_dp_ps_256(<8 x float> %a0, <8 x float> %a1) {			define <8 x float> @test_x86_avx_dp_ps_256(<8 x float> %a0, <8 x float> %a1) {
	; ... 176 lines not shown ...
	define <4 x double> @test_x86_avx_max_pd_256(<4 x double> %a0, <4 x double> %a1) {			define <4 x double> @test_x86_avx_max_pd_256(<4 x double> %a0, <4 x double> %a1) {
	; AVX-LABEL: test_x86_avx_max_pd_256:			; AVX-LABEL: test_x86_avx_max_pd_256:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vmaxpd %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x5f,0xc1]			; AVX-NEXT: vmaxpd %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x5f,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_max_pd_256:			; AVX512VL-LABEL: test_x86_avx_max_pd_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vmaxpd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x5f,0xc1]			; AVX512VL-NEXT: vmaxpd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x5f,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x double> @llvm.x86.avx.max.pd.256(<4 x double> %a0, <4 x double> %a1) ; <<4 x double>> [#uses=1]			%res = call <4 x double> @llvm.x86.avx.max.pd.256(<4 x double> %a0, <4 x double> %a1) ; <<4 x double>> [#uses=1]
	ret <4 x double> %res			ret <4 x double> %res
	}			}
	declare <4 x double> @llvm.x86.avx.max.pd.256(<4 x double>, <4 x double>) nounwind readnone			declare <4 x double> @llvm.x86.avx.max.pd.256(<4 x double>, <4 x double>) nounwind readnone


	define <8 x float> @test_x86_avx_max_ps_256(<8 x float> %a0, <8 x float> %a1) {			define <8 x float> @test_x86_avx_max_ps_256(<8 x float> %a0, <8 x float> %a1) {
	; AVX-LABEL: test_x86_avx_max_ps_256:			; AVX-LABEL: test_x86_avx_max_ps_256:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vmaxps %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfc,0x5f,0xc1]			; AVX-NEXT: vmaxps %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfc,0x5f,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_max_ps_256:			; AVX512VL-LABEL: test_x86_avx_max_ps_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vmaxps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x5f,0xc1]			; AVX512VL-NEXT: vmaxps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x5f,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x float> @llvm.x86.avx.max.ps.256(<8 x float> %a0, <8 x float> %a1) ; <<8 x float>> [#uses=1]			%res = call <8 x float> @llvm.x86.avx.max.ps.256(<8 x float> %a0, <8 x float> %a1) ; <<8 x float>> [#uses=1]
	ret <8 x float> %res			ret <8 x float> %res
	}			}
	declare <8 x float> @llvm.x86.avx.max.ps.256(<8 x float>, <8 x float>) nounwind readnone			declare <8 x float> @llvm.x86.avx.max.ps.256(<8 x float>, <8 x float>) nounwind readnone


	define <4 x double> @test_x86_avx_min_pd_256(<4 x double> %a0, <4 x double> %a1) {			define <4 x double> @test_x86_avx_min_pd_256(<4 x double> %a0, <4 x double> %a1) {
	; AVX-LABEL: test_x86_avx_min_pd_256:			; AVX-LABEL: test_x86_avx_min_pd_256:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vminpd %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x5d,0xc1]			; AVX-NEXT: vminpd %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x5d,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_min_pd_256:			; AVX512VL-LABEL: test_x86_avx_min_pd_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vminpd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x5d,0xc1]			; AVX512VL-NEXT: vminpd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x5d,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x double> @llvm.x86.avx.min.pd.256(<4 x double> %a0, <4 x double> %a1) ; <<4 x double>> [#uses=1]			%res = call <4 x double> @llvm.x86.avx.min.pd.256(<4 x double> %a0, <4 x double> %a1) ; <<4 x double>> [#uses=1]
	ret <4 x double> %res			ret <4 x double> %res
	}			}
	declare <4 x double> @llvm.x86.avx.min.pd.256(<4 x double>, <4 x double>) nounwind readnone			declare <4 x double> @llvm.x86.avx.min.pd.256(<4 x double>, <4 x double>) nounwind readnone


	define <8 x float> @test_x86_avx_min_ps_256(<8 x float> %a0, <8 x float> %a1) {			define <8 x float> @test_x86_avx_min_ps_256(<8 x float> %a0, <8 x float> %a1) {
	; AVX-LABEL: test_x86_avx_min_ps_256:			; AVX-LABEL: test_x86_avx_min_ps_256:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vminps %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfc,0x5d,0xc1]			; AVX-NEXT: vminps %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfc,0x5d,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_min_ps_256:			; AVX512VL-LABEL: test_x86_avx_min_ps_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vminps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x5d,0xc1]			; AVX512VL-NEXT: vminps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x5d,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x float> @llvm.x86.avx.min.ps.256(<8 x float> %a0, <8 x float> %a1) ; <<8 x float>> [#uses=1]			%res = call <8 x float> @llvm.x86.avx.min.ps.256(<8 x float> %a0, <8 x float> %a1) ; <<8 x float>> [#uses=1]
	ret <8 x float> %res			ret <8 x float> %res
	}			}
	declare <8 x float> @llvm.x86.avx.min.ps.256(<8 x float>, <8 x float>) nounwind readnone			declare <8 x float> @llvm.x86.avx.min.ps.256(<8 x float>, <8 x float>) nounwind readnone


	define i32 @test_x86_avx_movmsk_pd_256(<4 x double> %a0) {			define i32 @test_x86_avx_movmsk_pd_256(<4 x double> %a0) {
	; ... 213 lines not shown ...
	define <2 x double> @test_x86_avx_vpermilvar_pd(<2 x double> %a0, <2 x i64> %a1) {			define <2 x double> @test_x86_avx_vpermilvar_pd(<2 x double> %a0, <2 x i64> %a1) {
	; AVX-LABEL: test_x86_avx_vpermilvar_pd:			; AVX-LABEL: test_x86_avx_vpermilvar_pd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpermilpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x0d,0xc1]			; AVX-NEXT: vpermilpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x0d,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_vpermilvar_pd:			; AVX512VL-LABEL: test_x86_avx_vpermilvar_pd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpermilpd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x0d,0xc1]			; AVX512VL-NEXT: vpermilpd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x0d,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x double> @llvm.x86.avx.vpermilvar.pd(<2 x double> %a0, <2 x i64> %a1) ; <<2 x double>> [#uses=1]			%res = call <2 x double> @llvm.x86.avx.vpermilvar.pd(<2 x double> %a0, <2 x i64> %a1) ; <<2 x double>> [#uses=1]
	ret <2 x double> %res			ret <2 x double> %res
	}			}
	declare <2 x double> @llvm.x86.avx.vpermilvar.pd(<2 x double>, <2 x i64>) nounwind readnone			declare <2 x double> @llvm.x86.avx.vpermilvar.pd(<2 x double>, <2 x i64>) nounwind readnone


	define <4 x double> @test_x86_avx_vpermilvar_pd_256(<4 x double> %a0, <4 x i64> %a1) {			define <4 x double> @test_x86_avx_vpermilvar_pd_256(<4 x double> %a0, <4 x i64> %a1) {
	; AVX-LABEL: test_x86_avx_vpermilvar_pd_256:			; AVX-LABEL: test_x86_avx_vpermilvar_pd_256:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpermilpd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x0d,0xc1]			; AVX-NEXT: vpermilpd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x0d,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_vpermilvar_pd_256:			; AVX512VL-LABEL: test_x86_avx_vpermilvar_pd_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpermilpd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x0d,0xc1]			; AVX512VL-NEXT: vpermilpd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x0d,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x double> @llvm.x86.avx.vpermilvar.pd.256(<4 x double> %a0, <4 x i64> %a1) ; <<4 x double>> [#uses=1]			%res = call <4 x double> @llvm.x86.avx.vpermilvar.pd.256(<4 x double> %a0, <4 x i64> %a1) ; <<4 x double>> [#uses=1]
	ret <4 x double> %res			ret <4 x double> %res
	}			}
	declare <4 x double> @llvm.x86.avx.vpermilvar.pd.256(<4 x double>, <4 x i64>) nounwind readnone			declare <4 x double> @llvm.x86.avx.vpermilvar.pd.256(<4 x double>, <4 x i64>) nounwind readnone

	define <4 x double> @test_x86_avx_vpermilvar_pd_256_2(<4 x double> %a0) {			define <4 x double> @test_x86_avx_vpermilvar_pd_256_2(<4 x double> %a0) {
	; AVX-LABEL: test_x86_avx_vpermilvar_pd_256_2:			; AVX-LABEL: test_x86_avx_vpermilvar_pd_256_2:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpermilpd $9, %ymm0, %ymm0 ## encoding: [0xc4,0xe3,0x7d,0x05,0xc0,0x09]			; AVX-NEXT: vpermilpd $9, %ymm0, %ymm0 ## encoding: [0xc4,0xe3,0x7d,0x05,0xc0,0x09]
	; AVX-NEXT: ## ymm0 = ymm0[1,0,2,3]			; AVX-NEXT: ## ymm0 = ymm0[1,0,2,3]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_vpermilvar_pd_256_2:			; AVX512VL-LABEL: test_x86_avx_vpermilvar_pd_256_2:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpermilpd $9, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x05,0xc0,0x09]			; AVX512VL-NEXT: vpermilpd $9, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x7d,0x05,0xc0,0x09]
	; AVX512VL-NEXT: ## ymm0 = ymm0[1,0,2,3]			; AVX512VL-NEXT: ## ymm0 = ymm0[1,0,2,3]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x double> @llvm.x86.avx.vpermilvar.pd.256(<4 x double> %a0, <4 x i64> <i64 2, i64 0, i64 0, i64 2>) ; <<4 x double>> [#uses=1]			%res = call <4 x double> @llvm.x86.avx.vpermilvar.pd.256(<4 x double> %a0, <4 x i64> <i64 2, i64 0, i64 0, i64 2>) ; <<4 x double>> [#uses=1]
	ret <4 x double> %res			ret <4 x double> %res
	}			}

	define <4 x float> @test_x86_avx_vpermilvar_ps(<4 x float> %a0, <4 x i32> %a1) {			define <4 x float> @test_x86_avx_vpermilvar_ps(<4 x float> %a0, <4 x i32> %a1) {
	; AVX-LABEL: test_x86_avx_vpermilvar_ps:			; AVX-LABEL: test_x86_avx_vpermilvar_ps:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpermilps %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x0c,0xc1]			; AVX-NEXT: vpermilps %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x0c,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_vpermilvar_ps:			; AVX512VL-LABEL: test_x86_avx_vpermilvar_ps:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpermilps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x0c,0xc1]			; AVX512VL-NEXT: vpermilps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x0c,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.avx.vpermilvar.ps(<4 x float> %a0, <4 x i32> %a1) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.avx.vpermilvar.ps(<4 x float> %a0, <4 x i32> %a1) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	define <4 x float> @test_x86_avx_vpermilvar_ps_load(<4 x float> %a0, <4 x i32>* %a1) {			define <4 x float> @test_x86_avx_vpermilvar_ps_load(<4 x float> %a0, <4 x i32>* %a1) {
	; AVX-LABEL: test_x86_avx_vpermilvar_ps_load:			; AVX-LABEL: test_x86_avx_vpermilvar_ps_load:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX-NEXT: vpermilps (%eax), %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x0c,0x00]			; AVX-NEXT: vpermilps (%eax), %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x0c,0x00]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_vpermilvar_ps_load:			; AVX512VL-LABEL: test_x86_avx_vpermilvar_ps_load:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX512VL-NEXT: vpermilps (%eax), %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x0c,0x00]			; AVX512VL-NEXT: vpermilps (%eax), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x0c,0x00]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%a2 = load <4 x i32>, <4 x i32>* %a1			%a2 = load <4 x i32>, <4 x i32>* %a1
	%res = call <4 x float> @llvm.x86.avx.vpermilvar.ps(<4 x float> %a0, <4 x i32> %a2) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.avx.vpermilvar.ps(<4 x float> %a0, <4 x i32> %a2) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	declare <4 x float> @llvm.x86.avx.vpermilvar.ps(<4 x float>, <4 x i32>) nounwind readnone			declare <4 x float> @llvm.x86.avx.vpermilvar.ps(<4 x float>, <4 x i32>) nounwind readnone


	define <8 x float> @test_x86_avx_vpermilvar_ps_256(<8 x float> %a0, <8 x i32> %a1) {			define <8 x float> @test_x86_avx_vpermilvar_ps_256(<8 x float> %a0, <8 x i32> %a1) {
	; AVX-LABEL: test_x86_avx_vpermilvar_ps_256:			; AVX-LABEL: test_x86_avx_vpermilvar_ps_256:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: vpermilps %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x0c,0xc1]			; AVX-NEXT: vpermilps %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x0c,0xc1]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx_vpermilvar_ps_256:			; AVX512VL-LABEL: test_x86_avx_vpermilvar_ps_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpermilps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x0c,0xc1]			; AVX512VL-NEXT: vpermilps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x0c,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x float> @llvm.x86.avx.vpermilvar.ps.256(<8 x float> %a0, <8 x i32> %a1) ; <<8 x float>> [#uses=1]			%res = call <8 x float> @llvm.x86.avx.vpermilvar.ps.256(<8 x float> %a0, <8 x i32> %a1) ; <<8 x float>> [#uses=1]
	ret <8 x float> %res			ret <8 x float> %res
	}			}
	declare <8 x float> @llvm.x86.avx.vpermilvar.ps.256(<8 x float>, <8 x i32>) nounwind readnone			declare <8 x float> @llvm.x86.avx.vpermilvar.ps.256(<8 x float>, <8 x i32>) nounwind readnone


	define i32 @test_x86_avx_vtestc_pd(<2 x double> %a0, <2 x double> %a1) {			define i32 @test_x86_avx_vtestc_pd(<2 x double> %a0, <2 x double> %a1) {
	[... 331 lines not shown ...]
	; AVX-NEXT: ## fixup A - offset: 4, value: LCPI247_0, kind: FK_Data_4			; AVX-NEXT: ## fixup A - offset: 4, value: LCPI247_0, kind: FK_Data_4
	; AVX-NEXT: vmovntdq %ymm0, (%eax) ## encoding: [0xc5,0xfd,0xe7,0x00]			; AVX-NEXT: vmovntdq %ymm0, (%eax) ## encoding: [0xc5,0xfd,0xe7,0x00]
	; AVX-NEXT: vzeroupper ## encoding: [0xc5,0xf8,0x77]			; AVX-NEXT: vzeroupper ## encoding: [0xc5,0xf8,0x77]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: movnt_dq:			; AVX512VL-LABEL: movnt_dq:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX512VL-NEXT: vpaddq LCPI247_0, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0x05,A,A,A,A]			; AVX512VL-NEXT: vpaddq LCPI247_0, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0x05,A,A,A,A]
	; AVX512VL-NEXT: ## fixup A - offset: 6, value: LCPI247_0, kind: FK_Data_4			; AVX512VL-NEXT: ## fixup A - offset: 4, value: LCPI247_0, kind: FK_Data_4
	; AVX512VL-NEXT: vmovntdq %ymm0, (%eax) ## encoding: [0x62,0xf1,0x7d,0x28,0xe7,0x00]			; AVX512VL-NEXT: vmovntdq %ymm0, (%eax) ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe7,0x00]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%a2 = add <2 x i64> %a1, <i64 1, i64 1>			%a2 = add <2 x i64> %a1, <i64 1, i64 1>
	%a3 = shufflevector <2 x i64> %a2, <2 x i64> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>			%a3 = shufflevector <2 x i64> %a2, <2 x i64> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
	tail call void @llvm.x86.avx.movnt.dq.256(i8* %p, <4 x i64> %a3) nounwind			tail call void @llvm.x86.avx.movnt.dq.256(i8* %p, <4 x i64> %a3) nounwind
	ret void			ret void
	}			}
	declare void @llvm.x86.avx.movnt.dq.256(i8*, <4 x i64>) nounwind			declare void @llvm.x86.avx.movnt.dq.256(i8*, <4 x i64>) nounwind

	define void @movnt_ps(i8* %p, <8 x float> %a) nounwind {			define void @movnt_ps(i8* %p, <8 x float> %a) nounwind {
	; AVX-LABEL: movnt_ps:			; AVX-LABEL: movnt_ps:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX-NEXT: vmovntps %ymm0, (%eax) ## encoding: [0xc5,0xfc,0x2b,0x00]			; AVX-NEXT: vmovntps %ymm0, (%eax) ## encoding: [0xc5,0xfc,0x2b,0x00]
	; AVX-NEXT: vzeroupper ## encoding: [0xc5,0xf8,0x77]			; AVX-NEXT: vzeroupper ## encoding: [0xc5,0xf8,0x77]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: movnt_ps:			; AVX512VL-LABEL: movnt_ps:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX512VL-NEXT: vmovntps %ymm0, (%eax) ## encoding: [0x62,0xf1,0x7c,0x28,0x2b,0x00]			; AVX512VL-NEXT: vmovntps %ymm0, (%eax) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x2b,0x00]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	tail call void @llvm.x86.avx.movnt.ps.256(i8* %p, <8 x float> %a) nounwind			tail call void @llvm.x86.avx.movnt.ps.256(i8* %p, <8 x float> %a) nounwind
	ret void			ret void
	}			}
	declare void @llvm.x86.avx.movnt.ps.256(i8*, <8 x float>) nounwind			declare void @llvm.x86.avx.movnt.ps.256(i8*, <8 x float>) nounwind

	define void @movnt_pd(i8* %p, <4 x double> %a1) nounwind {			define void @movnt_pd(i8* %p, <4 x double> %a1) nounwind {
	; add operation forces the execution domain.			; add operation forces the execution domain.
	; AVX-LABEL: movnt_pd:			; AVX-LABEL: movnt_pd:
	; AVX: ## BB#0:			; AVX: ## BB#0:
	; AVX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX-NEXT: vxorpd %ymm1, %ymm1, %ymm1 ## encoding: [0xc5,0xf5,0x57,0xc9]			; AVX-NEXT: vxorpd %ymm1, %ymm1, %ymm1 ## encoding: [0xc5,0xf5,0x57,0xc9]
	; AVX-NEXT: vaddpd %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x58,0xc1]			; AVX-NEXT: vaddpd %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x58,0xc1]
	; AVX-NEXT: vmovntpd %ymm0, (%eax) ## encoding: [0xc5,0xfd,0x2b,0x00]			; AVX-NEXT: vmovntpd %ymm0, (%eax) ## encoding: [0xc5,0xfd,0x2b,0x00]
	; AVX-NEXT: vzeroupper ## encoding: [0xc5,0xf8,0x77]			; AVX-NEXT: vzeroupper ## encoding: [0xc5,0xf8,0x77]
	; AVX-NEXT: retl ## encoding: [0xc3]			; AVX-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: movnt_pd:			; AVX512VL-LABEL: movnt_pd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX512VL-NEXT: vxorpd %ymm1, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0xf5,0x28,0x57,0xc9]			; AVX512VL-NEXT: vxorpd %ymm1, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x57,0xc9]
	; AVX512VL-NEXT: vaddpd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x58,0xc1]			; AVX512VL-NEXT: vaddpd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x58,0xc1]
	; AVX512VL-NEXT: vmovntpd %ymm0, (%eax) ## encoding: [0x62,0xf1,0xfd,0x28,0x2b,0x00]			; AVX512VL-NEXT: vmovntpd %ymm0, (%eax) ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x2b,0x00]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%a2 = fadd <4 x double> %a1, <double 0x0, double 0x0, double 0x0, double 0x0>			%a2 = fadd <4 x double> %a1, <double 0x0, double 0x0, double 0x0, double 0x0>
	tail call void @llvm.x86.avx.movnt.pd.256(i8* %p, <4 x double> %a2) nounwind			tail call void @llvm.x86.avx.movnt.pd.256(i8* %p, <4 x double> %a2) nounwind
	ret void			ret void
	}			}
	declare void @llvm.x86.avx.movnt.pd.256(i8*, <4 x double>) nounwind			declare void @llvm.x86.avx.movnt.pd.256(i8*, <4 x double>) nounwind


	[... 10 lines not shown ...]

llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=i686-apple-darwin -mattr=avx2 -show-mc-encoding | FileCheck %s --check-prefix=CHECK --check-prefix=AVX2			; RUN: llc < %s -mtriple=i686-apple-darwin -mattr=avx2 -show-mc-encoding | FileCheck %s --check-prefix=CHECK --check-prefix=AVX2
	; RUN: llc < %s -mtriple=i686-apple-darwin -mcpu=skx -show-mc-encoding | FileCheck %s --check-prefix=CHECK --check-prefix=AVX512VL			; RUN: llc < %s -mtriple=i686-apple-darwin -mcpu=skx -show-mc-encoding | FileCheck %s --check-prefix=CHECK --check-prefix=AVX512VL

	define <16 x i16> @test_x86_avx2_packssdw(<8 x i32> %a0, <8 x i32> %a1) {			define <16 x i16> @test_x86_avx2_packssdw(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_packssdw:			; AVX2-LABEL: test_x86_avx2_packssdw:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x6b,0xc1]			; AVX2-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x6b,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_packssdw:			; AVX512VL-LABEL: test_x86_avx2_packssdw:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x6b,0xc1]			; AVX512VL-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6b,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.packssdw(<8 x i32> %a0, <8 x i32> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.packssdw(<8 x i32> %a0, <8 x i32> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.packssdw(<8 x i32>, <8 x i32>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.packssdw(<8 x i32>, <8 x i32>) nounwind readnone


	define <32 x i8> @test_x86_avx2_packsswb(<16 x i16> %a0, <16 x i16> %a1) {			define <32 x i8> @test_x86_avx2_packsswb(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_packsswb:			; AVX2-LABEL: test_x86_avx2_packsswb:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x63,0xc1]			; AVX2-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x63,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_packsswb:			; AVX512VL-LABEL: test_x86_avx2_packsswb:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x63,0xc1]			; AVX512VL-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x63,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.packsswb(<16 x i16> %a0, <16 x i16> %a1) ; <<32 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.packsswb(<16 x i16> %a0, <16 x i16> %a1) ; <<32 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.packsswb(<16 x i16>, <16 x i16>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.packsswb(<16 x i16>, <16 x i16>) nounwind readnone


	define <32 x i8> @test_x86_avx2_packuswb(<16 x i16> %a0, <16 x i16> %a1) {			define <32 x i8> @test_x86_avx2_packuswb(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_packuswb:			; AVX2-LABEL: test_x86_avx2_packuswb:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x67,0xc1]			; AVX2-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x67,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_packuswb:			; AVX512VL-LABEL: test_x86_avx2_packuswb:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x67,0xc1]			; AVX512VL-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x67,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.packuswb(<16 x i16> %a0, <16 x i16> %a1) ; <<32 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.packuswb(<16 x i16> %a0, <16 x i16> %a1) ; <<32 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.packuswb(<16 x i16>, <16 x i16>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.packuswb(<16 x i16>, <16 x i16>) nounwind readnone


	define <32 x i8> @test_x86_avx2_padds_b(<32 x i8> %a0, <32 x i8> %a1) {			define <32 x i8> @test_x86_avx2_padds_b(<32 x i8> %a0, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_padds_b:			; AVX2-LABEL: test_x86_avx2_padds_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpaddsb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xec,0xc1]			; AVX2-NEXT: vpaddsb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xec,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_padds_b:			; AVX512VL-LABEL: test_x86_avx2_padds_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpaddsb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xec,0xc1]			; AVX512VL-NEXT: vpaddsb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xec,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.padds.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.padds.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.padds.b(<32 x i8>, <32 x i8>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.padds.b(<32 x i8>, <32 x i8>) nounwind readnone


	define <16 x i16> @test_x86_avx2_padds_w(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @test_x86_avx2_padds_w(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_padds_w:			; AVX2-LABEL: test_x86_avx2_padds_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpaddsw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xed,0xc1]			; AVX2-NEXT: vpaddsw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xed,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_padds_w:			; AVX512VL-LABEL: test_x86_avx2_padds_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpaddsw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xed,0xc1]			; AVX512VL-NEXT: vpaddsw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xed,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.padds.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.padds.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.padds.w(<16 x i16>, <16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.padds.w(<16 x i16>, <16 x i16>) nounwind readnone


	define <32 x i8> @test_x86_avx2_paddus_b(<32 x i8> %a0, <32 x i8> %a1) {			define <32 x i8> @test_x86_avx2_paddus_b(<32 x i8> %a0, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_paddus_b:			; AVX2-LABEL: test_x86_avx2_paddus_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpaddusb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xdc,0xc1]			; AVX2-NEXT: vpaddusb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xdc,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_paddus_b:			; AVX512VL-LABEL: test_x86_avx2_paddus_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpaddusb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xdc,0xc1]			; AVX512VL-NEXT: vpaddusb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xdc,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.paddus.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.paddus.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.paddus.b(<32 x i8>, <32 x i8>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.paddus.b(<32 x i8>, <32 x i8>) nounwind readnone


	define <16 x i16> @test_x86_avx2_paddus_w(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @test_x86_avx2_paddus_w(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_paddus_w:			; AVX2-LABEL: test_x86_avx2_paddus_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpaddusw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xdd,0xc1]			; AVX2-NEXT: vpaddusw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xdd,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_paddus_w:			; AVX512VL-LABEL: test_x86_avx2_paddus_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpaddusw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xdd,0xc1]			; AVX512VL-NEXT: vpaddusw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xdd,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.paddus.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.paddus.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.paddus.w(<16 x i16>, <16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.paddus.w(<16 x i16>, <16 x i16>) nounwind readnone


	define <32 x i8> @test_x86_avx2_pavg_b(<32 x i8> %a0, <32 x i8> %a1) {			define <32 x i8> @test_x86_avx2_pavg_b(<32 x i8> %a0, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_pavg_b:			; AVX2-LABEL: test_x86_avx2_pavg_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpavgb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe0,0xc1]			; AVX2-NEXT: vpavgb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe0,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pavg_b:			; AVX512VL-LABEL: test_x86_avx2_pavg_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpavgb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe0,0xc1]			; AVX512VL-NEXT: vpavgb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe0,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.pavg.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.pavg.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.pavg.b(<32 x i8>, <32 x i8>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.pavg.b(<32 x i8>, <32 x i8>) nounwind readnone


	define <16 x i16> @test_x86_avx2_pavg_w(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @test_x86_avx2_pavg_w(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_pavg_w:			; AVX2-LABEL: test_x86_avx2_pavg_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpavgw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe3,0xc1]			; AVX2-NEXT: vpavgw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe3,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pavg_w:			; AVX512VL-LABEL: test_x86_avx2_pavg_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpavgw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe3,0xc1]			; AVX512VL-NEXT: vpavgw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe3,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.pavg.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.pavg.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.pavg.w(<16 x i16>, <16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.pavg.w(<16 x i16>, <16 x i16>) nounwind readnone


	define <8 x i32> @test_x86_avx2_pmadd_wd(<16 x i16> %a0, <16 x i16> %a1) {			define <8 x i32> @test_x86_avx2_pmadd_wd(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmadd_wd:			; AVX2-LABEL: test_x86_avx2_pmadd_wd:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaddwd %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xf5,0xc1]			; AVX2-NEXT: vpmaddwd %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xf5,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmadd_wd:			; AVX512VL-LABEL: test_x86_avx2_pmadd_wd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaddwd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xf5,0xc1]			; AVX512VL-NEXT: vpmaddwd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf5,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16> %a0, <16 x i16> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16> %a0, <16 x i16> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16>, <16 x i16>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16>, <16 x i16>) nounwind readnone


	define <16 x i16> @test_x86_avx2_pmaxs_w(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @test_x86_avx2_pmaxs_w(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmaxs_w:			; AVX2-LABEL: test_x86_avx2_pmaxs_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaxsw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xee,0xc1]			; AVX2-NEXT: vpmaxsw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xee,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmaxs_w:			; AVX512VL-LABEL: test_x86_avx2_pmaxs_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaxsw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xee,0xc1]			; AVX512VL-NEXT: vpmaxsw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xee,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.pmaxs.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.pmaxs.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.pmaxs.w(<16 x i16>, <16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.pmaxs.w(<16 x i16>, <16 x i16>) nounwind readnone


	define <32 x i8> @test_x86_avx2_pmaxu_b(<32 x i8> %a0, <32 x i8> %a1) {			define <32 x i8> @test_x86_avx2_pmaxu_b(<32 x i8> %a0, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmaxu_b:			; AVX2-LABEL: test_x86_avx2_pmaxu_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaxub %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xde,0xc1]			; AVX2-NEXT: vpmaxub %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xde,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmaxu_b:			; AVX512VL-LABEL: test_x86_avx2_pmaxu_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaxub %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xde,0xc1]			; AVX512VL-NEXT: vpmaxub %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xde,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.pmaxu.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.pmaxu.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.pmaxu.b(<32 x i8>, <32 x i8>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.pmaxu.b(<32 x i8>, <32 x i8>) nounwind readnone


	define <16 x i16> @test_x86_avx2_pmins_w(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @test_x86_avx2_pmins_w(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmins_w:			; AVX2-LABEL: test_x86_avx2_pmins_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpminsw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xea,0xc1]			; AVX2-NEXT: vpminsw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xea,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmins_w:			; AVX512VL-LABEL: test_x86_avx2_pmins_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpminsw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xea,0xc1]			; AVX512VL-NEXT: vpminsw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xea,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.pmins.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.pmins.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.pmins.w(<16 x i16>, <16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.pmins.w(<16 x i16>, <16 x i16>) nounwind readnone


	define <32 x i8> @test_x86_avx2_pminu_b(<32 x i8> %a0, <32 x i8> %a1) {			define <32 x i8> @test_x86_avx2_pminu_b(<32 x i8> %a0, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_pminu_b:			; AVX2-LABEL: test_x86_avx2_pminu_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpminub %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xda,0xc1]			; AVX2-NEXT: vpminub %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xda,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pminu_b:			; AVX512VL-LABEL: test_x86_avx2_pminu_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpminub %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xda,0xc1]			; AVX512VL-NEXT: vpminub %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xda,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.pminu.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.pminu.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.pminu.b(<32 x i8>, <32 x i8>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.pminu.b(<32 x i8>, <32 x i8>) nounwind readnone


	define i32 @test_x86_avx2_pmovmskb(<32 x i8> %a0) {			define i32 @test_x86_avx2_pmovmskb(<32 x i8> %a0) {
	[... 16 lines not shown ...]
	define <16 x i16> @test_x86_avx2_pmulh_w(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @test_x86_avx2_pmulh_w(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmulh_w:			; AVX2-LABEL: test_x86_avx2_pmulh_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmulhw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe5,0xc1]			; AVX2-NEXT: vpmulhw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe5,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmulh_w:			; AVX512VL-LABEL: test_x86_avx2_pmulh_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmulhw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe5,0xc1]			; AVX512VL-NEXT: vpmulhw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe5,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.pmulh.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.pmulh.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.pmulh.w(<16 x i16>, <16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.pmulh.w(<16 x i16>, <16 x i16>) nounwind readnone


	define <16 x i16> @test_x86_avx2_pmulhu_w(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @test_x86_avx2_pmulhu_w(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmulhu_w:			; AVX2-LABEL: test_x86_avx2_pmulhu_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmulhuw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe4,0xc1]			; AVX2-NEXT: vpmulhuw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe4,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmulhu_w:			; AVX512VL-LABEL: test_x86_avx2_pmulhu_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmulhuw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe4,0xc1]			; AVX512VL-NEXT: vpmulhuw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe4,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.pmulhu.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.pmulhu.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.pmulhu.w(<16 x i16>, <16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.pmulhu.w(<16 x i16>, <16 x i16>) nounwind readnone


	define <4 x i64> @test_x86_avx2_pmulu_dq(<8 x i32> %a0, <8 x i32> %a1) {			define <4 x i64> @test_x86_avx2_pmulu_dq(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmulu_dq:			; AVX2-LABEL: test_x86_avx2_pmulu_dq:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmuludq %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xf4,0xc1]			; AVX2-NEXT: vpmuludq %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xf4,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmulu_dq:			; AVX512VL-LABEL: test_x86_avx2_pmulu_dq:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmuludq %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xf4,0xc1]			; AVX512VL-NEXT: vpmuludq %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf4,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx2.pmulu.dq(<8 x i32> %a0, <8 x i32> %a1) ; <<4 x i64>> [#uses=1]			%res = call <4 x i64> @llvm.x86.avx2.pmulu.dq(<8 x i32> %a0, <8 x i32> %a1) ; <<4 x i64>> [#uses=1]
	ret <4 x i64> %res			ret <4 x i64> %res
	}			}
	declare <4 x i64> @llvm.x86.avx2.pmulu.dq(<8 x i32>, <8 x i32>) nounwind readnone			declare <4 x i64> @llvm.x86.avx2.pmulu.dq(<8 x i32>, <8 x i32>) nounwind readnone


	define <4 x i64> @test_x86_avx2_psad_bw(<32 x i8> %a0, <32 x i8> %a1) {			define <4 x i64> @test_x86_avx2_psad_bw(<32 x i8> %a0, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_psad_bw:			; AVX2-LABEL: test_x86_avx2_psad_bw:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsadbw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xf6,0xc1]			; AVX2-NEXT: vpsadbw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xf6,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psad_bw:			; AVX512VL-LABEL: test_x86_avx2_psad_bw:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsadbw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xf6,0xc1]			; AVX512VL-NEXT: vpsadbw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf6,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx2.psad.bw(<32 x i8> %a0, <32 x i8> %a1) ; <<4 x i64>> [#uses=1]			%res = call <4 x i64> @llvm.x86.avx2.psad.bw(<32 x i8> %a0, <32 x i8> %a1) ; <<4 x i64>> [#uses=1]
	ret <4 x i64> %res			ret <4 x i64> %res
	}			}
	declare <4 x i64> @llvm.x86.avx2.psad.bw(<32 x i8>, <32 x i8>) nounwind readnone			declare <4 x i64> @llvm.x86.avx2.psad.bw(<32 x i8>, <32 x i8>) nounwind readnone


	define <8 x i32> @test_x86_avx2_psll_d(<8 x i32> %a0, <4 x i32> %a1) {			define <8 x i32> @test_x86_avx2_psll_d(<8 x i32> %a0, <4 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psll_d:			; AVX2-LABEL: test_x86_avx2_psll_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpslld %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xf2,0xc1]			; AVX2-NEXT: vpslld %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xf2,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psll_d:			; AVX512VL-LABEL: test_x86_avx2_psll_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpslld %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xf2,0xc1]			; AVX512VL-NEXT: vpslld %xmm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf2,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psll.d(<8 x i32> %a0, <4 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.psll.d(<8 x i32> %a0, <4 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.psll.d(<8 x i32>, <4 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.psll.d(<8 x i32>, <4 x i32>) nounwind readnone


	define <4 x i64> @test_x86_avx2_psll_q(<4 x i64> %a0, <2 x i64> %a1) {			define <4 x i64> @test_x86_avx2_psll_q(<4 x i64> %a0, <2 x i64> %a1) {
	; AVX2-LABEL: test_x86_avx2_psll_q:			; AVX2-LABEL: test_x86_avx2_psll_q:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsllq %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xf3,0xc1]			; AVX2-NEXT: vpsllq %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xf3,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psll_q:			; AVX512VL-LABEL: test_x86_avx2_psll_q:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsllq %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xf3,0xc1]			; AVX512VL-NEXT: vpsllq %xmm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf3,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx2.psll.q(<4 x i64> %a0, <2 x i64> %a1) ; <<4 x i64>> [#uses=1]			%res = call <4 x i64> @llvm.x86.avx2.psll.q(<4 x i64> %a0, <2 x i64> %a1) ; <<4 x i64>> [#uses=1]
	ret <4 x i64> %res			ret <4 x i64> %res
	}			}
	declare <4 x i64> @llvm.x86.avx2.psll.q(<4 x i64>, <2 x i64>) nounwind readnone			declare <4 x i64> @llvm.x86.avx2.psll.q(<4 x i64>, <2 x i64>) nounwind readnone


	define <16 x i16> @test_x86_avx2_psll_w(<16 x i16> %a0, <8 x i16> %a1) {			define <16 x i16> @test_x86_avx2_psll_w(<16 x i16> %a0, <8 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_psll_w:			; AVX2-LABEL: test_x86_avx2_psll_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsllw %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xf1,0xc1]			; AVX2-NEXT: vpsllw %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xf1,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psll_w:			; AVX512VL-LABEL: test_x86_avx2_psll_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsllw %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xf1,0xc1]			; AVX512VL-NEXT: vpsllw %xmm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf1,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.psll.w(<16 x i16> %a0, <8 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.psll.w(<16 x i16> %a0, <8 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.psll.w(<16 x i16>, <8 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.psll.w(<16 x i16>, <8 x i16>) nounwind readnone


	define <8 x i32> @test_x86_avx2_pslli_d(<8 x i32> %a0) {			define <8 x i32> @test_x86_avx2_pslli_d(<8 x i32> %a0) {
	; AVX2-LABEL: test_x86_avx2_pslli_d:			; AVX2-LABEL: test_x86_avx2_pslli_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpslld $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x72,0xf0,0x07]			; AVX2-NEXT: vpslld $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x72,0xf0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pslli_d:			; AVX512VL-LABEL: test_x86_avx2_pslli_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpslld $7, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x72,0xf0,0x07]			; AVX512VL-NEXT: vpslld $7, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x72,0xf0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.pslli.d(<8 x i32> %a0, i32 7) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.pslli.d(<8 x i32> %a0, i32 7) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.pslli.d(<8 x i32>, i32) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.pslli.d(<8 x i32>, i32) nounwind readnone


	define <4 x i64> @test_x86_avx2_pslli_q(<4 x i64> %a0) {			define <4 x i64> @test_x86_avx2_pslli_q(<4 x i64> %a0) {
	; AVX2-LABEL: test_x86_avx2_pslli_q:			; AVX2-LABEL: test_x86_avx2_pslli_q:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsllq $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x73,0xf0,0x07]			; AVX2-NEXT: vpsllq $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x73,0xf0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pslli_q:			; AVX512VL-LABEL: test_x86_avx2_pslli_q:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsllq $7, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x73,0xf0,0x07]			; AVX512VL-NEXT: vpsllq $7, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x73,0xf0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx2.pslli.q(<4 x i64> %a0, i32 7) ; <<4 x i64>> [#uses=1]			%res = call <4 x i64> @llvm.x86.avx2.pslli.q(<4 x i64> %a0, i32 7) ; <<4 x i64>> [#uses=1]
	ret <4 x i64> %res			ret <4 x i64> %res
	}			}
	declare <4 x i64> @llvm.x86.avx2.pslli.q(<4 x i64>, i32) nounwind readnone			declare <4 x i64> @llvm.x86.avx2.pslli.q(<4 x i64>, i32) nounwind readnone


	define <16 x i16> @test_x86_avx2_pslli_w(<16 x i16> %a0) {			define <16 x i16> @test_x86_avx2_pslli_w(<16 x i16> %a0) {
	; AVX2-LABEL: test_x86_avx2_pslli_w:			; AVX2-LABEL: test_x86_avx2_pslli_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsllw $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x71,0xf0,0x07]			; AVX2-NEXT: vpsllw $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x71,0xf0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pslli_w:			; AVX512VL-LABEL: test_x86_avx2_pslli_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsllw $7, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x71,0xf0,0x07]			; AVX512VL-NEXT: vpsllw $7, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x71,0xf0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.pslli.w(<16 x i16> %a0, i32 7) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.pslli.w(<16 x i16> %a0, i32 7) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.pslli.w(<16 x i16>, i32) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.pslli.w(<16 x i16>, i32) nounwind readnone


	define <8 x i32> @test_x86_avx2_psra_d(<8 x i32> %a0, <4 x i32> %a1) {			define <8 x i32> @test_x86_avx2_psra_d(<8 x i32> %a0, <4 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psra_d:			; AVX2-LABEL: test_x86_avx2_psra_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrad %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe2,0xc1]			; AVX2-NEXT: vpsrad %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe2,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psra_d:			; AVX512VL-LABEL: test_x86_avx2_psra_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrad %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe2,0xc1]			; AVX512VL-NEXT: vpsrad %xmm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe2,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psra.d(<8 x i32> %a0, <4 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.psra.d(<8 x i32> %a0, <4 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.psra.d(<8 x i32>, <4 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.psra.d(<8 x i32>, <4 x i32>) nounwind readnone


	define <16 x i16> @test_x86_avx2_psra_w(<16 x i16> %a0, <8 x i16> %a1) {			define <16 x i16> @test_x86_avx2_psra_w(<16 x i16> %a0, <8 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_psra_w:			; AVX2-LABEL: test_x86_avx2_psra_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsraw %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe1,0xc1]			; AVX2-NEXT: vpsraw %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe1,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psra_w:			; AVX512VL-LABEL: test_x86_avx2_psra_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsraw %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe1,0xc1]			; AVX512VL-NEXT: vpsraw %xmm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe1,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.psra.w(<16 x i16> %a0, <8 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.psra.w(<16 x i16> %a0, <8 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.psra.w(<16 x i16>, <8 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.psra.w(<16 x i16>, <8 x i16>) nounwind readnone


	define <8 x i32> @test_x86_avx2_psrai_d(<8 x i32> %a0) {			define <8 x i32> @test_x86_avx2_psrai_d(<8 x i32> %a0) {
	; AVX2-LABEL: test_x86_avx2_psrai_d:			; AVX2-LABEL: test_x86_avx2_psrai_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrad $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x72,0xe0,0x07]			; AVX2-NEXT: vpsrad $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x72,0xe0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrai_d:			; AVX512VL-LABEL: test_x86_avx2_psrai_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrad $7, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x72,0xe0,0x07]			; AVX512VL-NEXT: vpsrad $7, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x72,0xe0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psrai.d(<8 x i32> %a0, i32 7) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.psrai.d(<8 x i32> %a0, i32 7) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.psrai.d(<8 x i32>, i32) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.psrai.d(<8 x i32>, i32) nounwind readnone


	define <16 x i16> @test_x86_avx2_psrai_w(<16 x i16> %a0) {			define <16 x i16> @test_x86_avx2_psrai_w(<16 x i16> %a0) {
	; AVX2-LABEL: test_x86_avx2_psrai_w:			; AVX2-LABEL: test_x86_avx2_psrai_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsraw $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x71,0xe0,0x07]			; AVX2-NEXT: vpsraw $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x71,0xe0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrai_w:			; AVX512VL-LABEL: test_x86_avx2_psrai_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsraw $7, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x71,0xe0,0x07]			; AVX512VL-NEXT: vpsraw $7, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x71,0xe0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.psrai.w(<16 x i16> %a0, i32 7) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.psrai.w(<16 x i16> %a0, i32 7) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.psrai.w(<16 x i16>, i32) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.psrai.w(<16 x i16>, i32) nounwind readnone


	define <8 x i32> @test_x86_avx2_psrl_d(<8 x i32> %a0, <4 x i32> %a1) {			define <8 x i32> @test_x86_avx2_psrl_d(<8 x i32> %a0, <4 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrl_d:			; AVX2-LABEL: test_x86_avx2_psrl_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrld %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xd2,0xc1]			; AVX2-NEXT: vpsrld %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xd2,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrl_d:			; AVX512VL-LABEL: test_x86_avx2_psrl_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrld %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xd2,0xc1]			; AVX512VL-NEXT: vpsrld %xmm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd2,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psrl.d(<8 x i32> %a0, <4 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.psrl.d(<8 x i32> %a0, <4 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.psrl.d(<8 x i32>, <4 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.psrl.d(<8 x i32>, <4 x i32>) nounwind readnone


	define <4 x i64> @test_x86_avx2_psrl_q(<4 x i64> %a0, <2 x i64> %a1) {			define <4 x i64> @test_x86_avx2_psrl_q(<4 x i64> %a0, <2 x i64> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrl_q:			; AVX2-LABEL: test_x86_avx2_psrl_q:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrlq %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xd3,0xc1]			; AVX2-NEXT: vpsrlq %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xd3,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrl_q:			; AVX512VL-LABEL: test_x86_avx2_psrl_q:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrlq %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd3,0xc1]			; AVX512VL-NEXT: vpsrlq %xmm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd3,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx2.psrl.q(<4 x i64> %a0, <2 x i64> %a1) ; <<4 x i64>> [#uses=1]			%res = call <4 x i64> @llvm.x86.avx2.psrl.q(<4 x i64> %a0, <2 x i64> %a1) ; <<4 x i64>> [#uses=1]
	ret <4 x i64> %res			ret <4 x i64> %res
	}			}
	declare <4 x i64> @llvm.x86.avx2.psrl.q(<4 x i64>, <2 x i64>) nounwind readnone			declare <4 x i64> @llvm.x86.avx2.psrl.q(<4 x i64>, <2 x i64>) nounwind readnone


	define <16 x i16> @test_x86_avx2_psrl_w(<16 x i16> %a0, <8 x i16> %a1) {			define <16 x i16> @test_x86_avx2_psrl_w(<16 x i16> %a0, <8 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrl_w:			; AVX2-LABEL: test_x86_avx2_psrl_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrlw %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xd1,0xc1]			; AVX2-NEXT: vpsrlw %xmm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xd1,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrl_w:			; AVX512VL-LABEL: test_x86_avx2_psrl_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrlw %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xd1,0xc1]			; AVX512VL-NEXT: vpsrlw %xmm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd1,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.psrl.w(<16 x i16> %a0, <8 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.psrl.w(<16 x i16> %a0, <8 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.psrl.w(<16 x i16>, <8 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.psrl.w(<16 x i16>, <8 x i16>) nounwind readnone


	define <8 x i32> @test_x86_avx2_psrli_d(<8 x i32> %a0) {			define <8 x i32> @test_x86_avx2_psrli_d(<8 x i32> %a0) {
	; AVX2-LABEL: test_x86_avx2_psrli_d:			; AVX2-LABEL: test_x86_avx2_psrli_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrld $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x72,0xd0,0x07]			; AVX2-NEXT: vpsrld $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x72,0xd0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrli_d:			; AVX512VL-LABEL: test_x86_avx2_psrli_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrld $7, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x72,0xd0,0x07]			; AVX512VL-NEXT: vpsrld $7, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x72,0xd0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psrli.d(<8 x i32> %a0, i32 7) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.psrli.d(<8 x i32> %a0, i32 7) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.psrli.d(<8 x i32>, i32) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.psrli.d(<8 x i32>, i32) nounwind readnone


	define <4 x i64> @test_x86_avx2_psrli_q(<4 x i64> %a0) {			define <4 x i64> @test_x86_avx2_psrli_q(<4 x i64> %a0) {
	; AVX2-LABEL: test_x86_avx2_psrli_q:			; AVX2-LABEL: test_x86_avx2_psrli_q:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrlq $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x73,0xd0,0x07]			; AVX2-NEXT: vpsrlq $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x73,0xd0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrli_q:			; AVX512VL-LABEL: test_x86_avx2_psrli_q:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrlq $7, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x73,0xd0,0x07]			; AVX512VL-NEXT: vpsrlq $7, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x73,0xd0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx2.psrli.q(<4 x i64> %a0, i32 7) ; <<4 x i64>> [#uses=1]			%res = call <4 x i64> @llvm.x86.avx2.psrli.q(<4 x i64> %a0, i32 7) ; <<4 x i64>> [#uses=1]
	ret <4 x i64> %res			ret <4 x i64> %res
	}			}
	declare <4 x i64> @llvm.x86.avx2.psrli.q(<4 x i64>, i32) nounwind readnone			declare <4 x i64> @llvm.x86.avx2.psrli.q(<4 x i64>, i32) nounwind readnone


	define <16 x i16> @test_x86_avx2_psrli_w(<16 x i16> %a0) {			define <16 x i16> @test_x86_avx2_psrli_w(<16 x i16> %a0) {
	; AVX2-LABEL: test_x86_avx2_psrli_w:			; AVX2-LABEL: test_x86_avx2_psrli_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrlw $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x71,0xd0,0x07]			; AVX2-NEXT: vpsrlw $7, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0x71,0xd0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrli_w:			; AVX512VL-LABEL: test_x86_avx2_psrli_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrlw $7, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x71,0xd0,0x07]			; AVX512VL-NEXT: vpsrlw $7, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x71,0xd0,0x07]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.psrli.w(<16 x i16> %a0, i32 7) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.psrli.w(<16 x i16> %a0, i32 7) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.psrli.w(<16 x i16>, i32) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.psrli.w(<16 x i16>, i32) nounwind readnone


	define <32 x i8> @test_x86_avx2_psubs_b(<32 x i8> %a0, <32 x i8> %a1) {			define <32 x i8> @test_x86_avx2_psubs_b(<32 x i8> %a0, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_psubs_b:			; AVX2-LABEL: test_x86_avx2_psubs_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsubsb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe8,0xc1]			; AVX2-NEXT: vpsubsb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe8,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psubs_b:			; AVX512VL-LABEL: test_x86_avx2_psubs_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsubsb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe8,0xc1]			; AVX512VL-NEXT: vpsubsb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe8,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.psubs.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.psubs.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.psubs.b(<32 x i8>, <32 x i8>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.psubs.b(<32 x i8>, <32 x i8>) nounwind readnone


	define <16 x i16> @test_x86_avx2_psubs_w(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @test_x86_avx2_psubs_w(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_psubs_w:			; AVX2-LABEL: test_x86_avx2_psubs_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsubsw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe9,0xc1]			; AVX2-NEXT: vpsubsw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xe9,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psubs_w:			; AVX512VL-LABEL: test_x86_avx2_psubs_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsubsw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe9,0xc1]			; AVX512VL-NEXT: vpsubsw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe9,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.psubs.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.psubs.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.psubs.w(<16 x i16>, <16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.psubs.w(<16 x i16>, <16 x i16>) nounwind readnone


	define <32 x i8> @test_x86_avx2_psubus_b(<32 x i8> %a0, <32 x i8> %a1) {			define <32 x i8> @test_x86_avx2_psubus_b(<32 x i8> %a0, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_psubus_b:			; AVX2-LABEL: test_x86_avx2_psubus_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsubusb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xd8,0xc1]			; AVX2-NEXT: vpsubusb %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xd8,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psubus_b:			; AVX512VL-LABEL: test_x86_avx2_psubus_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsubusb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xd8,0xc1]			; AVX512VL-NEXT: vpsubusb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd8,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.psubus.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.psubus.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.psubus.b(<32 x i8>, <32 x i8>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.psubus.b(<32 x i8>, <32 x i8>) nounwind readnone


	define <16 x i16> @test_x86_avx2_psubus_w(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @test_x86_avx2_psubus_w(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_psubus_w:			; AVX2-LABEL: test_x86_avx2_psubus_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsubusw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xd9,0xc1]			; AVX2-NEXT: vpsubusw %ymm1, %ymm0, %ymm0 ## encoding: [0xc5,0xfd,0xd9,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psubus_w:			; AVX512VL-LABEL: test_x86_avx2_psubus_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsubusw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xd9,0xc1]			; AVX512VL-NEXT: vpsubusw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd9,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.psubus.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.psubus.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.psubus.w(<16 x i16>, <16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.psubus.w(<16 x i16>, <16 x i16>) nounwind readnone


	define <32 x i8> @test_x86_avx2_pabs_b(<32 x i8> %a0) {			define <32 x i8> @test_x86_avx2_pabs_b(<32 x i8> %a0) {
	; AVX2-LABEL: test_x86_avx2_pabs_b:			; AVX2-LABEL: test_x86_avx2_pabs_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpabsb %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x1c,0xc0]			; AVX2-NEXT: vpabsb %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x1c,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pabs_b:			; AVX512VL-LABEL: test_x86_avx2_pabs_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpabsb %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x1c,0xc0]			; AVX512VL-NEXT: vpabsb %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x1c,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.pabs.b(<32 x i8> %a0) ; <<32 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.pabs.b(<32 x i8> %a0) ; <<32 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.pabs.b(<32 x i8>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.pabs.b(<32 x i8>) nounwind readnone


	define <8 x i32> @test_x86_avx2_pabs_d(<8 x i32> %a0) {			define <8 x i32> @test_x86_avx2_pabs_d(<8 x i32> %a0) {
	; AVX2-LABEL: test_x86_avx2_pabs_d:			; AVX2-LABEL: test_x86_avx2_pabs_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpabsd %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x1e,0xc0]			; AVX2-NEXT: vpabsd %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x1e,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pabs_d:			; AVX512VL-LABEL: test_x86_avx2_pabs_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpabsd %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x1e,0xc0]			; AVX512VL-NEXT: vpabsd %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x1e,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.pabs.d(<8 x i32> %a0) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.pabs.d(<8 x i32> %a0) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.pabs.d(<8 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.pabs.d(<8 x i32>) nounwind readnone


	define <16 x i16> @test_x86_avx2_pabs_w(<16 x i16> %a0) {			define <16 x i16> @test_x86_avx2_pabs_w(<16 x i16> %a0) {
	; AVX2-LABEL: test_x86_avx2_pabs_w:			; AVX2-LABEL: test_x86_avx2_pabs_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpabsw %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x1d,0xc0]			; AVX2-NEXT: vpabsw %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x1d,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pabs_w:			; AVX512VL-LABEL: test_x86_avx2_pabs_w:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpabsw %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x1d,0xc0]			; AVX512VL-NEXT: vpabsw %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x1d,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.pabs.w(<16 x i16> %a0) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.pabs.w(<16 x i16> %a0) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.pabs.w(<16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.pabs.w(<16 x i16>) nounwind readnone


	define <8 x i32> @test_x86_avx2_phadd_d(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_phadd_d(<8 x i32> %a0, <8 x i32> %a1) {
	... (65 lines not shown) ...
	define <16 x i16> @test_x86_avx2_pmadd_ub_sw(<32 x i8> %a0, <32 x i8> %a1) {			define <16 x i16> @test_x86_avx2_pmadd_ub_sw(<32 x i8> %a0, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmadd_ub_sw:			; AVX2-LABEL: test_x86_avx2_pmadd_ub_sw:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaddubsw %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x04,0xc1]			; AVX2-NEXT: vpmaddubsw %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x04,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmadd_ub_sw:			; AVX512VL-LABEL: test_x86_avx2_pmadd_ub_sw:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaddubsw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x04,0xc1]			; AVX512VL-NEXT: vpmaddubsw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x04,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.pmadd.ub.sw(<32 x i8> %a0, <32 x i8> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.pmadd.ub.sw(<32 x i8> %a0, <32 x i8> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.pmadd.ub.sw(<32 x i8>, <32 x i8>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.pmadd.ub.sw(<32 x i8>, <32 x i8>) nounwind readnone

	; Make sure we don't commute this operation.			; Make sure we don't commute this operation.
	define <16 x i16> @test_x86_avx2_pmadd_ub_sw_load_op0(<32 x i8>* %ptr, <32 x i8> %a1) {			define <16 x i16> @test_x86_avx2_pmadd_ub_sw_load_op0(<32 x i8>* %ptr, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmadd_ub_sw_load_op0:			; AVX2-LABEL: test_x86_avx2_pmadd_ub_sw_load_op0:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX2-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX2-NEXT: vmovdqa (%eax), %ymm1 ## encoding: [0xc5,0xfd,0x6f,0x08]			; AVX2-NEXT: vmovdqa (%eax), %ymm1 ## encoding: [0xc5,0xfd,0x6f,0x08]
	; AVX2-NEXT: vpmaddubsw %ymm0, %ymm1, %ymm0 ## encoding: [0xc4,0xe2,0x75,0x04,0xc0]			; AVX2-NEXT: vpmaddubsw %ymm0, %ymm1, %ymm0 ## encoding: [0xc4,0xe2,0x75,0x04,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmadd_ub_sw_load_op0:			; AVX512VL-LABEL: test_x86_avx2_pmadd_ub_sw_load_op0:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX512VL-NEXT: vmovdqu8 (%eax), %ymm1 ## encoding: [0x62,0xf1,0x7f,0x28,0x6f,0x08]			; AVX512VL-NEXT: vmovdqu (%eax), %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x6f,0x08]
	; AVX512VL-NEXT: vpmaddubsw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0x75,0x28,0x04,0xc0]			; AVX512VL-NEXT: vpmaddubsw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x75,0x04,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%a0 = load <32 x i8>, <32 x i8>* %ptr			%a0 = load <32 x i8>, <32 x i8>* %ptr
	%res = call <16 x i16> @llvm.x86.avx2.pmadd.ub.sw(<32 x i8> %a0, <32 x i8> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.pmadd.ub.sw(<32 x i8> %a0, <32 x i8> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}

	define <16 x i16> @test_x86_avx2_pmul_hr_sw(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @test_x86_avx2_pmul_hr_sw(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmul_hr_sw:			; AVX2-LABEL: test_x86_avx2_pmul_hr_sw:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmulhrsw %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x0b,0xc1]			; AVX2-NEXT: vpmulhrsw %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x0b,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmul_hr_sw:			; AVX512VL-LABEL: test_x86_avx2_pmul_hr_sw:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmulhrsw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x0b,0xc1]			; AVX512VL-NEXT: vpmulhrsw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x0b,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.pmul.hr.sw(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.pmul.hr.sw(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.pmul.hr.sw(<16 x i16>, <16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.pmul.hr.sw(<16 x i16>, <16 x i16>) nounwind readnone


	define <32 x i8> @test_x86_avx2_pshuf_b(<32 x i8> %a0, <32 x i8> %a1) {			define <32 x i8> @test_x86_avx2_pshuf_b(<32 x i8> %a0, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_pshuf_b:			; AVX2-LABEL: test_x86_avx2_pshuf_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpshufb %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x00,0xc1]			; AVX2-NEXT: vpshufb %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x00,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pshuf_b:			; AVX512VL-LABEL: test_x86_avx2_pshuf_b:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpshufb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x00,0xc1]			; AVX512VL-NEXT: vpshufb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x00,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.pshuf.b(<32 x i8> %a0, <32 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.pshuf.b(<32 x i8> %a0, <32 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.pshuf.b(<32 x i8>, <32 x i8>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.pshuf.b(<32 x i8>, <32 x i8>) nounwind readnone


	define <32 x i8> @test_x86_avx2_psign_b(<32 x i8> %a0, <32 x i8> %a1) {			define <32 x i8> @test_x86_avx2_psign_b(<32 x i8> %a0, <32 x i8> %a1) {
	... (34 lines not shown) ...
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX2-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX2-NEXT: vmovntdqa (%eax), %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x2a,0x00]			; AVX2-NEXT: vmovntdqa (%eax), %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x2a,0x00]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_movntdqa:			; AVX512VL-LABEL: test_x86_avx2_movntdqa:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX512VL-NEXT: vmovntdqa (%eax), %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x2a,0x00]			; AVX512VL-NEXT: vmovntdqa (%eax), %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x2a,0x00]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx2.movntdqa(i8* %a0) ; <<4 x i64>> [#uses=1]			%res = call <4 x i64> @llvm.x86.avx2.movntdqa(i8* %a0) ; <<4 x i64>> [#uses=1]
	ret <4 x i64> %res			ret <4 x i64> %res
	}			}
	declare <4 x i64> @llvm.x86.avx2.movntdqa(i8*) nounwind readonly			declare <4 x i64> @llvm.x86.avx2.movntdqa(i8*) nounwind readonly


	define <16 x i16> @test_x86_avx2_mpsadbw(<32 x i8> %a0, <32 x i8> %a1) {			define <16 x i16> @test_x86_avx2_mpsadbw(<32 x i8> %a0, <32 x i8> %a1) {
	... (10 lines not shown) ...
	define <16 x i16> @test_x86_avx2_packusdw(<8 x i32> %a0, <8 x i32> %a1) {			define <16 x i16> @test_x86_avx2_packusdw(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_packusdw:			; AVX2-LABEL: test_x86_avx2_packusdw:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x2b,0xc1]			; AVX2-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x2b,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_packusdw:			; AVX512VL-LABEL: test_x86_avx2_packusdw:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x2b,0xc1]			; AVX512VL-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x2b,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.packusdw(<8 x i32> %a0, <8 x i32> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.packusdw(<8 x i32> %a0, <8 x i32> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.packusdw(<8 x i32>, <8 x i32>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.packusdw(<8 x i32>, <8 x i32>) nounwind readnone


	define <32 x i8> @test_x86_avx2_pblendvb(<32 x i8> %a0, <32 x i8> %a1, <32 x i8> %a2) {			define <32 x i8> @test_x86_avx2_pblendvb(<32 x i8> %a0, <32 x i8> %a1, <32 x i8> %a2) {
	... (22 lines not shown) ...
	define <32 x i8> @test_x86_avx2_pmaxsb(<32 x i8> %a0, <32 x i8> %a1) {			define <32 x i8> @test_x86_avx2_pmaxsb(<32 x i8> %a0, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmaxsb:			; AVX2-LABEL: test_x86_avx2_pmaxsb:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaxsb %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x3c,0xc1]			; AVX2-NEXT: vpmaxsb %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x3c,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmaxsb:			; AVX512VL-LABEL: test_x86_avx2_pmaxsb:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaxsb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x3c,0xc1]			; AVX512VL-NEXT: vpmaxsb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x3c,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.pmaxs.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.pmaxs.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.pmaxs.b(<32 x i8>, <32 x i8>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.pmaxs.b(<32 x i8>, <32 x i8>) nounwind readnone


	define <8 x i32> @test_x86_avx2_pmaxsd(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_pmaxsd(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmaxsd:			; AVX2-LABEL: test_x86_avx2_pmaxsd:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x3d,0xc1]			; AVX2-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x3d,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmaxsd:			; AVX512VL-LABEL: test_x86_avx2_pmaxsd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x3d,0xc1]			; AVX512VL-NEXT: vpmaxsd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x3d,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.pmaxs.d(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.pmaxs.d(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.pmaxs.d(<8 x i32>, <8 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.pmaxs.d(<8 x i32>, <8 x i32>) nounwind readnone


	define <8 x i32> @test_x86_avx2_pmaxud(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_pmaxud(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmaxud:			; AVX2-LABEL: test_x86_avx2_pmaxud:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaxud %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x3f,0xc1]			; AVX2-NEXT: vpmaxud %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x3f,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmaxud:			; AVX512VL-LABEL: test_x86_avx2_pmaxud:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaxud %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x3f,0xc1]			; AVX512VL-NEXT: vpmaxud %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x3f,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.pmaxu.d(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.pmaxu.d(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.pmaxu.d(<8 x i32>, <8 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.pmaxu.d(<8 x i32>, <8 x i32>) nounwind readnone


	define <16 x i16> @test_x86_avx2_pmaxuw(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @test_x86_avx2_pmaxuw(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_pmaxuw:			; AVX2-LABEL: test_x86_avx2_pmaxuw:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaxuw %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x3e,0xc1]			; AVX2-NEXT: vpmaxuw %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x3e,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pmaxuw:			; AVX512VL-LABEL: test_x86_avx2_pmaxuw:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpmaxuw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x3e,0xc1]			; AVX512VL-NEXT: vpmaxuw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x3e,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.pmaxu.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.pmaxu.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.pmaxu.w(<16 x i16>, <16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.pmaxu.w(<16 x i16>, <16 x i16>) nounwind readnone


	define <32 x i8> @test_x86_avx2_pminsb(<32 x i8> %a0, <32 x i8> %a1) {			define <32 x i8> @test_x86_avx2_pminsb(<32 x i8> %a0, <32 x i8> %a1) {
	; AVX2-LABEL: test_x86_avx2_pminsb:			; AVX2-LABEL: test_x86_avx2_pminsb:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpminsb %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x38,0xc1]			; AVX2-NEXT: vpminsb %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x38,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pminsb:			; AVX512VL-LABEL: test_x86_avx2_pminsb:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpminsb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x38,0xc1]			; AVX512VL-NEXT: vpminsb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x38,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx2.pmins.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]			%res = call <32 x i8> @llvm.x86.avx2.pmins.b(<32 x i8> %a0, <32 x i8> %a1) ; <<32 x i8>> [#uses=1]
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.x86.avx2.pmins.b(<32 x i8>, <32 x i8>) nounwind readnone			declare <32 x i8> @llvm.x86.avx2.pmins.b(<32 x i8>, <32 x i8>) nounwind readnone


	define <8 x i32> @test_x86_avx2_pminsd(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_pminsd(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_pminsd:			; AVX2-LABEL: test_x86_avx2_pminsd:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpminsd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x39,0xc1]			; AVX2-NEXT: vpminsd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x39,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pminsd:			; AVX512VL-LABEL: test_x86_avx2_pminsd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpminsd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x39,0xc1]			; AVX512VL-NEXT: vpminsd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x39,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.pmins.d(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.pmins.d(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.pmins.d(<8 x i32>, <8 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.pmins.d(<8 x i32>, <8 x i32>) nounwind readnone


	define <8 x i32> @test_x86_avx2_pminud(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_pminud(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_pminud:			; AVX2-LABEL: test_x86_avx2_pminud:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpminud %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x3b,0xc1]			; AVX2-NEXT: vpminud %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x3b,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pminud:			; AVX512VL-LABEL: test_x86_avx2_pminud:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpminud %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x3b,0xc1]			; AVX512VL-NEXT: vpminud %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x3b,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.pminu.d(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.pminu.d(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.pminu.d(<8 x i32>, <8 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.pminu.d(<8 x i32>, <8 x i32>) nounwind readnone


	define <16 x i16> @test_x86_avx2_pminuw(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @test_x86_avx2_pminuw(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX2-LABEL: test_x86_avx2_pminuw:			; AVX2-LABEL: test_x86_avx2_pminuw:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpminuw %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x3a,0xc1]			; AVX2-NEXT: vpminuw %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x3a,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_pminuw:			; AVX512VL-LABEL: test_x86_avx2_pminuw:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpminuw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x3a,0xc1]			; AVX512VL-NEXT: vpminuw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x3a,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i16> @llvm.x86.avx2.pminu.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]			%res = call <16 x i16> @llvm.x86.avx2.pminu.w(<16 x i16> %a0, <16 x i16> %a1) ; <<16 x i16>> [#uses=1]
	ret <16 x i16> %res			ret <16 x i16> %res
	}			}
	declare <16 x i16> @llvm.x86.avx2.pminu.w(<16 x i16>, <16 x i16>) nounwind readnone			declare <16 x i16> @llvm.x86.avx2.pminu.w(<16 x i16>, <16 x i16>) nounwind readnone


	define <4 x i64> @test_x86_avx2_pmul.dq(<8 x i32> %a0, <8 x i32> %a1) {			define <4 x i64> @test_x86_avx2_pmul.dq(<8 x i32> %a0, <8 x i32> %a1) {
	... (33 lines not shown) ...
	define <8 x i32> @test_x86_avx2_permd(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_permd(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_permd:			; AVX2-LABEL: test_x86_avx2_permd:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0 ## encoding: [0xc4,0xe2,0x75,0x36,0xc0]			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0 ## encoding: [0xc4,0xe2,0x75,0x36,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_permd:			; AVX512VL-LABEL: test_x86_avx2_permd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0x75,0x28,0x36,0xc0]			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x75,0x36,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.permd(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.permd(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.permd(<8 x i32>, <8 x i32>) nounwind readonly			declare <8 x i32> @llvm.x86.avx2.permd(<8 x i32>, <8 x i32>) nounwind readonly


	; Check that the arguments are swapped between the intrinsic definition			; Check that the arguments are swapped between the intrinsic definition
	; and its lowering. Indeed, the offsets are the first source in			; and its lowering. Indeed, the offsets are the first source in
	; the instruction.			; the instruction.
	define <8 x float> @test_x86_avx2_permps(<8 x float> %a0, <8 x i32> %a1) {			define <8 x float> @test_x86_avx2_permps(<8 x float> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_permps:			; AVX2-LABEL: test_x86_avx2_permps:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpermps %ymm0, %ymm1, %ymm0 ## encoding: [0xc4,0xe2,0x75,0x16,0xc0]			; AVX2-NEXT: vpermps %ymm0, %ymm1, %ymm0 ## encoding: [0xc4,0xe2,0x75,0x16,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_permps:			; AVX512VL-LABEL: test_x86_avx2_permps:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpermps %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0x75,0x28,0x16,0xc0]			; AVX512VL-NEXT: vpermps %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x75,0x16,0xc0]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x float> @llvm.x86.avx2.permps(<8 x float> %a0, <8 x i32> %a1) ; <<8 x float>> [#uses=1]			%res = call <8 x float> @llvm.x86.avx2.permps(<8 x float> %a0, <8 x i32> %a1) ; <<8 x float>> [#uses=1]
	ret <8 x float> %res			ret <8 x float> %res
	}			}
	declare <8 x float> @llvm.x86.avx2.permps(<8 x float>, <8 x i32>) nounwind readonly			declare <8 x float> @llvm.x86.avx2.permps(<8 x float>, <8 x i32>) nounwind readonly


	define <4 x i64> @test_x86_avx2_vperm2i128(<4 x i64> %a0, <4 x i64> %a1) {			define <4 x i64> @test_x86_avx2_vperm2i128(<4 x i64> %a0, <4 x i64> %a1) {
	... (121 lines not shown) ...
	define <4 x i32> @test_x86_avx2_psllv_d(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_avx2_psllv_d(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psllv_d:			; AVX2-LABEL: test_x86_avx2_psllv_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsllvd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x47,0xc1]			; AVX2-NEXT: vpsllvd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x47,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psllv_d:			; AVX512VL-LABEL: test_x86_avx2_psllv_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsllvd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x47,0xc1]			; AVX512VL-NEXT: vpsllvd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x47,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.avx2.psllv.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.avx2.psllv.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.avx2.psllv.d(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.avx2.psllv.d(<4 x i32>, <4 x i32>) nounwind readnone


	define <8 x i32> @test_x86_avx2_psllv_d_256(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_psllv_d_256(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psllv_d_256:			; AVX2-LABEL: test_x86_avx2_psllv_d_256:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsllvd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x47,0xc1]			; AVX2-NEXT: vpsllvd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x47,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psllv_d_256:			; AVX512VL-LABEL: test_x86_avx2_psllv_d_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsllvd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x47,0xc1]			; AVX512VL-NEXT: vpsllvd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x47,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psllv.d.256(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.psllv.d.256(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.psllv.d.256(<8 x i32>, <8 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.psllv.d.256(<8 x i32>, <8 x i32>) nounwind readnone


	define <2 x i64> @test_x86_avx2_psllv_q(<2 x i64> %a0, <2 x i64> %a1) {			define <2 x i64> @test_x86_avx2_psllv_q(<2 x i64> %a0, <2 x i64> %a1) {
	; AVX2-LABEL: test_x86_avx2_psllv_q:			; AVX2-LABEL: test_x86_avx2_psllv_q:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsllvq %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0xf9,0x47,0xc1]			; AVX2-NEXT: vpsllvq %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0xf9,0x47,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psllv_q:			; AVX512VL-LABEL: test_x86_avx2_psllv_q:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsllvq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x47,0xc1]			; AVX512VL-NEXT: vpsllvq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0x47,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.avx2.psllv.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.avx2.psllv.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.avx2.psllv.q(<2 x i64>, <2 x i64>) nounwind readnone			declare <2 x i64> @llvm.x86.avx2.psllv.q(<2 x i64>, <2 x i64>) nounwind readnone


	define <4 x i64> @test_x86_avx2_psllv_q_256(<4 x i64> %a0, <4 x i64> %a1) {			define <4 x i64> @test_x86_avx2_psllv_q_256(<4 x i64> %a0, <4 x i64> %a1) {
	; AVX2-LABEL: test_x86_avx2_psllv_q_256:			; AVX2-LABEL: test_x86_avx2_psllv_q_256:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsllvq %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0xfd,0x47,0xc1]			; AVX2-NEXT: vpsllvq %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0xfd,0x47,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psllv_q_256:			; AVX512VL-LABEL: test_x86_avx2_psllv_q_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsllvq %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x47,0xc1]			; AVX512VL-NEXT: vpsllvq %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0x47,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx2.psllv.q.256(<4 x i64> %a0, <4 x i64> %a1) ; <<4 x i64>> [#uses=1]			%res = call <4 x i64> @llvm.x86.avx2.psllv.q.256(<4 x i64> %a0, <4 x i64> %a1) ; <<4 x i64>> [#uses=1]
	ret <4 x i64> %res			ret <4 x i64> %res
	}			}
	declare <4 x i64> @llvm.x86.avx2.psllv.q.256(<4 x i64>, <4 x i64>) nounwind readnone			declare <4 x i64> @llvm.x86.avx2.psllv.q.256(<4 x i64>, <4 x i64>) nounwind readnone


	define <4 x i32> @test_x86_avx2_psrlv_d(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_avx2_psrlv_d(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrlv_d:			; AVX2-LABEL: test_x86_avx2_psrlv_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrlvd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x45,0xc1]			; AVX2-NEXT: vpsrlvd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x45,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrlv_d:			; AVX512VL-LABEL: test_x86_avx2_psrlv_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrlvd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x45,0xc1]			; AVX512VL-NEXT: vpsrlvd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x45,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.avx2.psrlv.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.avx2.psrlv.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.avx2.psrlv.d(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.avx2.psrlv.d(<4 x i32>, <4 x i32>) nounwind readnone


	define <8 x i32> @test_x86_avx2_psrlv_d_256(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_psrlv_d_256(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrlv_d_256:			; AVX2-LABEL: test_x86_avx2_psrlv_d_256:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrlvd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x45,0xc1]			; AVX2-NEXT: vpsrlvd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x45,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrlv_d_256:			; AVX512VL-LABEL: test_x86_avx2_psrlv_d_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrlvd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x45,0xc1]			; AVX512VL-NEXT: vpsrlvd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x45,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psrlv.d.256(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.psrlv.d.256(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.psrlv.d.256(<8 x i32>, <8 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.psrlv.d.256(<8 x i32>, <8 x i32>) nounwind readnone


	define <2 x i64> @test_x86_avx2_psrlv_q(<2 x i64> %a0, <2 x i64> %a1) {			define <2 x i64> @test_x86_avx2_psrlv_q(<2 x i64> %a0, <2 x i64> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrlv_q:			; AVX2-LABEL: test_x86_avx2_psrlv_q:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrlvq %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0xf9,0x45,0xc1]			; AVX2-NEXT: vpsrlvq %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0xf9,0x45,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrlv_q:			; AVX512VL-LABEL: test_x86_avx2_psrlv_q:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrlvq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x45,0xc1]			; AVX512VL-NEXT: vpsrlvq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0x45,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.avx2.psrlv.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.avx2.psrlv.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.avx2.psrlv.q(<2 x i64>, <2 x i64>) nounwind readnone			declare <2 x i64> @llvm.x86.avx2.psrlv.q(<2 x i64>, <2 x i64>) nounwind readnone


	define <4 x i64> @test_x86_avx2_psrlv_q_256(<4 x i64> %a0, <4 x i64> %a1) {			define <4 x i64> @test_x86_avx2_psrlv_q_256(<4 x i64> %a0, <4 x i64> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrlv_q_256:			; AVX2-LABEL: test_x86_avx2_psrlv_q_256:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrlvq %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0xfd,0x45,0xc1]			; AVX2-NEXT: vpsrlvq %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0xfd,0x45,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrlv_q_256:			; AVX512VL-LABEL: test_x86_avx2_psrlv_q_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsrlvq %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x45,0xc1]			; AVX512VL-NEXT: vpsrlvq %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0x45,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx2.psrlv.q.256(<4 x i64> %a0, <4 x i64> %a1) ; <<4 x i64>> [#uses=1]			%res = call <4 x i64> @llvm.x86.avx2.psrlv.q.256(<4 x i64> %a0, <4 x i64> %a1) ; <<4 x i64>> [#uses=1]
	ret <4 x i64> %res			ret <4 x i64> %res
	}			}
	declare <4 x i64> @llvm.x86.avx2.psrlv.q.256(<4 x i64>, <4 x i64>) nounwind readnone			declare <4 x i64> @llvm.x86.avx2.psrlv.q.256(<4 x i64>, <4 x i64>) nounwind readnone


	define <4 x i32> @test_x86_avx2_psrav_d(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_avx2_psrav_d(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrav_d:			; AVX2-LABEL: test_x86_avx2_psrav_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsravd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x46,0xc1]			; AVX2-NEXT: vpsravd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x46,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrav_d:			; AVX512VL-LABEL: test_x86_avx2_psrav_d:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsravd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x46,0xc1]			; AVX512VL-NEXT: vpsravd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x46,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}

	define <4 x i32> @test_x86_avx2_psrav_d_const(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_avx2_psrav_d_const(<4 x i32> %a0, <4 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrav_d_const:			; AVX2-LABEL: test_x86_avx2_psrav_d_const:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} xmm0 = [2,9,4294967284,23]			; AVX2-NEXT: vmovdqa {{.*#+}} xmm0 = [2,9,4294967284,23]
	; AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]			; AVX2-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
	; AVX2-NEXT: ## fixup A - offset: 4, value: LCPI91_0, kind: FK_Data_4			; AVX2-NEXT: ## fixup A - offset: 4, value: LCPI91_0, kind: FK_Data_4
	; AVX2-NEXT: vpsravd LCPI91_1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x46,0x05,A,A,A,A]			; AVX2-NEXT: vpsravd LCPI91_1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x46,0x05,A,A,A,A]
	; AVX2-NEXT: ## fixup A - offset: 5, value: LCPI91_1, kind: FK_Data_4			; AVX2-NEXT: ## fixup A - offset: 5, value: LCPI91_1, kind: FK_Data_4
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrav_d_const:			; AVX512VL-LABEL: test_x86_avx2_psrav_d_const:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} xmm0 = [2,9,4294967284,23]			; AVX512VL-NEXT: vmovdqa LCPI91_0, %xmm0 ## EVEX TO VEX Compression xmm0 = [2,9,4294967284,23]
	; AVX512VL-NEXT: ## encoding: [0x62,0xf1,0x7d,0x08,0x6f,0x05,A,A,A,A]			; AVX512VL-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
	; AVX512VL-NEXT: ## fixup A - offset: 6, value: LCPI91_0, kind: FK_Data_4			; AVX512VL-NEXT: ## fixup A - offset: 4, value: LCPI91_0, kind: FK_Data_4
	; AVX512VL-NEXT: vpsravd LCPI91_1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x46,0x05,A,A,A,A]			; AVX512VL-NEXT: vpsravd LCPI91_1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x46,0x05,A,A,A,A]
	; AVX512VL-NEXT: ## fixup A - offset: 6, value: LCPI91_1, kind: FK_Data_4			; AVX512VL-NEXT: ## fixup A - offset: 5, value: LCPI91_1, kind: FK_Data_4
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32> <i32 2, i32 9, i32 -12, i32 23>, <4 x i32> <i32 1, i32 18, i32 35, i32 52>)			%res = call <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32> <i32 2, i32 9, i32 -12, i32 23>, <4 x i32> <i32 1, i32 18, i32 35, i32 52>)
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32>, <4 x i32>) nounwind readnone

	define <8 x i32> @test_x86_avx2_psrav_d_256(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_psrav_d_256(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrav_d_256:			; AVX2-LABEL: test_x86_avx2_psrav_d_256:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsravd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x46,0xc1]			; AVX2-NEXT: vpsravd %ymm1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x46,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrav_d_256:			; AVX512VL-LABEL: test_x86_avx2_psrav_d_256:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpsravd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x46,0xc1]			; AVX512VL-NEXT: vpsravd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x46,0xc1]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psrav.d.256(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]			%res = call <8 x i32> @llvm.x86.avx2.psrav.d.256(<8 x i32> %a0, <8 x i32> %a1) ; <<8 x i32>> [#uses=1]
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}

	define <8 x i32> @test_x86_avx2_psrav_d_256_const(<8 x i32> %a0, <8 x i32> %a1) {			define <8 x i32> @test_x86_avx2_psrav_d_256_const(<8 x i32> %a0, <8 x i32> %a1) {
	; AVX2-LABEL: test_x86_avx2_psrav_d_256_const:			; AVX2-LABEL: test_x86_avx2_psrav_d_256_const:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]
	; AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]			; AVX2-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
	; AVX2-NEXT: ## fixup A - offset: 4, value: LCPI93_0, kind: FK_Data_4			; AVX2-NEXT: ## fixup A - offset: 4, value: LCPI93_0, kind: FK_Data_4
	; AVX2-NEXT: vpsravd LCPI93_1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x46,0x05,A,A,A,A]			; AVX2-NEXT: vpsravd LCPI93_1, %ymm0, %ymm0 ## encoding: [0xc4,0xe2,0x7d,0x46,0x05,A,A,A,A]
	; AVX2-NEXT: ## fixup A - offset: 5, value: LCPI93_1, kind: FK_Data_4			; AVX2-NEXT: ## fixup A - offset: 5, value: LCPI93_1, kind: FK_Data_4
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_x86_avx2_psrav_d_256_const:			; AVX512VL-LABEL: test_x86_avx2_psrav_d_256_const:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]			; AVX512VL-NEXT: vmovdqa LCPI93_0, %ymm0 ## EVEX TO VEX Compression ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]
	; AVX512VL-NEXT: ## encoding: [0x62,0xf1,0x7d,0x28,0x6f,0x05,A,A,A,A]			; AVX512VL-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
	; AVX512VL-NEXT: ## fixup A - offset: 6, value: LCPI93_0, kind: FK_Data_4			; AVX512VL-NEXT: ## fixup A - offset: 4, value: LCPI93_0, kind: FK_Data_4
	; AVX512VL-NEXT: vpsravd LCPI93_1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x46,0x05,A,A,A,A]			; AVX512VL-NEXT: vpsravd LCPI93_1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x46,0x05,A,A,A,A]
	; AVX512VL-NEXT: ## fixup A - offset: 6, value: LCPI93_1, kind: FK_Data_4			; AVX512VL-NEXT: ## fixup A - offset: 5, value: LCPI93_1, kind: FK_Data_4
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i32> @llvm.x86.avx2.psrav.d.256(<8 x i32> <i32 2, i32 9, i32 -12, i32 23, i32 -26, i32 37, i32 -40, i32 51>, <8 x i32> <i32 1, i32 18, i32 35, i32 52, i32 69, i32 15, i32 32, i32 49>)			%res = call <8 x i32> @llvm.x86.avx2.psrav.d.256(<8 x i32> <i32 2, i32 9, i32 -12, i32 23, i32 -26, i32 37, i32 -40, i32 51>, <8 x i32> <i32 1, i32 18, i32 35, i32 52, i32 69, i32 15, i32 32, i32 49>)
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	declare <8 x i32> @llvm.x86.avx2.psrav.d.256(<8 x i32>, <8 x i32>) nounwind readnone			declare <8 x i32> @llvm.x86.avx2.psrav.d.256(<8 x i32>, <8 x i32>) nounwind readnone

	define <2 x double> @test_x86_avx2_gather_d_pd(<2 x double> %a0, i8* %a1, <4 x i32> %idx, <2 x double> %mask) {			define <2 x double> @test_x86_avx2_gather_d_pd(<2 x double> %a0, i8* %a1, <4 x i32> %idx, <2 x double> %mask) {
	; CHECK-LABEL: test_x86_avx2_gather_d_pd:			; CHECK-LABEL: test_x86_avx2_gather_d_pd:
	[227 lines not shown]
	; AVX2-NEXT: vmovaps %ymm2, %ymm3 ## encoding: [0xc5,0xfc,0x28,0xda]			; AVX2-NEXT: vmovaps %ymm2, %ymm3 ## encoding: [0xc5,0xfc,0x28,0xda]
	; AVX2-NEXT: vgatherdps %ymm3, (%ecx,%ymm1,4), %ymm0 ## encoding: [0xc4,0xe2,0x65,0x92,0x04,0x89]			; AVX2-NEXT: vgatherdps %ymm3, (%ecx,%ymm1,4), %ymm0 ## encoding: [0xc4,0xe2,0x65,0x92,0x04,0x89]
	; AVX2-NEXT: vmovups %ymm2, (%eax) ## encoding: [0xc5,0xfc,0x11,0x10]			; AVX2-NEXT: vmovups %ymm2, (%eax) ## encoding: [0xc5,0xfc,0x11,0x10]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX512VL-LABEL: test_gather_mask:			; AVX512VL-LABEL: test_gather_mask:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX512VL-NEXT: vmovaps %ymm2, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xda]			; AVX512VL-NEXT: vmovaps %ymm2, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xda]
	; AVX512VL-NEXT: vgatherdps %ymm3, (%eax,%ymm1,4), %ymm0 ## encoding: [0xc4,0xe2,0x65,0x92,0x04,0x88]			; AVX512VL-NEXT: vgatherdps %ymm3, (%eax,%ymm1,4), %ymm0 ## encoding: [0xc4,0xe2,0x65,0x92,0x04,0x88]
	; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08]			; AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08]
	; AVX512VL-NEXT: vmovups %ymm2, (%eax) ## encoding: [0x62,0xf1,0x7c,0x28,0x11,0x10]			; AVX512VL-NEXT: vmovups %ymm2, (%eax) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x11,0x10]
	; AVX512VL-NEXT: retl ## encoding: [0xc3]			; AVX512VL-NEXT: retl ## encoding: [0xc3]
	%a_i8 = bitcast float* %a to i8*			%a_i8 = bitcast float* %a to i8*
	%res = call <8 x float> @llvm.x86.avx2.gather.d.ps.256(<8 x float> %a0,			%res = call <8 x float> @llvm.x86.avx2.gather.d.ps.256(<8 x float> %a0,
	i8* %a_i8, <8 x i32> %idx, <8 x float> %mask, i8 4) ;			i8* %a_i8, <8 x i32> %idx, <8 x float> %mask, i8 4) ;

	;; for debugging, we'll just dump out the mask			;; for debugging, we'll just dump out the mask
	%out_ptr = bitcast float * %out to <8 x float> *			%out_ptr = bitcast float * %out to <8 x float> *
	store <8 x float> %mask, <8 x float> * %out_ptr, align 4			store <8 x float> %mask, <8 x float> * %out_ptr, align 4

	ret <8 x float> %res			ret <8 x float> %res
	}			}

llvm/trunk/test/CodeGen/X86/avx2-vbroadcast.ll

	[1,134 lines not shown]
	; X32-AVX512VL-NEXT: subl $60, %esp			; X32-AVX512VL-NEXT: subl $60, %esp
	; X32-AVX512VL-NEXT: Lcfi0:			; X32-AVX512VL-NEXT: Lcfi0:
	; X32-AVX512VL-NEXT: .cfi_def_cfa_offset 64			; X32-AVX512VL-NEXT: .cfi_def_cfa_offset 64
	; X32-AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512VL-NEXT: vxorps %xmm0, %xmm0, %xmm0			; X32-AVX512VL-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; X32-AVX512VL-NEXT: vmovaps %xmm0, (%esp)			; X32-AVX512VL-NEXT: vmovaps %xmm0, (%esp)
	; X32-AVX512VL-NEXT: vpbroadcastb (%eax), %xmm1			; X32-AVX512VL-NEXT: vpbroadcastb (%eax), %xmm1
	; X32-AVX512VL-NEXT: vmovaps %xmm0, {{[0-9]+}}(%esp)			; X32-AVX512VL-NEXT: vmovaps %xmm0, {{[0-9]+}}(%esp)
	; X32-AVX512VL-NEXT: vmovdqa32 %xmm1, {{[0-9]+}}(%esp)			; X32-AVX512VL-NEXT: vmovdqa %xmm1, {{[0-9]+}}(%esp)
	; X32-AVX512VL-NEXT: addl $60, %esp			; X32-AVX512VL-NEXT: addl $60, %esp
	; X32-AVX512VL-NEXT: retl			; X32-AVX512VL-NEXT: retl
	;			;
	; X64-AVX512VL-LABEL: isel_crash_16b:			; X64-AVX512VL-LABEL: isel_crash_16b:
	; X64-AVX512VL: ## BB#0: ## %eintry			; X64-AVX512VL: ## BB#0: ## %eintry
	; X64-AVX512VL-NEXT: vxorps %xmm0, %xmm0, %xmm0			; X64-AVX512VL-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; X64-AVX512VL-NEXT: vmovaps %xmm0, -{{[0-9]+}}(%rsp)			; X64-AVX512VL-NEXT: vmovaps %xmm0, -{{[0-9]+}}(%rsp)
	; X64-AVX512VL-NEXT: movb (%rdi), %al			; X64-AVX512VL-NEXT: movb (%rdi), %al
	; X64-AVX512VL-NEXT: vmovd %eax, %xmm1			; X64-AVX512VL-NEXT: vmovd %eax, %xmm1
	; X64-AVX512VL-NEXT: vpbroadcastb %xmm1, %xmm1			; X64-AVX512VL-NEXT: vpbroadcastb %xmm1, %xmm1
	; X64-AVX512VL-NEXT: vmovaps %xmm0, -{{[0-9]+}}(%rsp)			; X64-AVX512VL-NEXT: vmovaps %xmm0, -{{[0-9]+}}(%rsp)
	; X64-AVX512VL-NEXT: vmovdqa32 %xmm1, -{{[0-9]+}}(%rsp)			; X64-AVX512VL-NEXT: vmovdqa %xmm1, -{{[0-9]+}}(%rsp)
	; X64-AVX512VL-NEXT: retq			; X64-AVX512VL-NEXT: retq
	eintry:			eintry:
	%__a.addr.i = alloca <2 x i64>, align 16			%__a.addr.i = alloca <2 x i64>, align 16
	%__b.addr.i = alloca <2 x i64>, align 16			%__b.addr.i = alloca <2 x i64>, align 16
	%vCr = alloca <2 x i64>, align 16			%vCr = alloca <2 x i64>, align 16
	store <2 x i64> zeroinitializer, <2 x i64>* %vCr, align 16			store <2 x i64> zeroinitializer, <2 x i64>* %vCr, align 16
	%tmp = load <2 x i64>, <2 x i64>* %vCr, align 16			%tmp = load <2 x i64>, <2 x i64>* %vCr, align 16
	%tmp2 = load i8, i8* %cV_R.addr, align 4			%tmp2 = load i8, i8* %cV_R.addr, align 4
	[65 lines not shown]
	; X32-AVX512VL-NEXT: .cfi_def_cfa_register %ebp			; X32-AVX512VL-NEXT: .cfi_def_cfa_register %ebp
	; X32-AVX512VL-NEXT: andl $-32, %esp			; X32-AVX512VL-NEXT: andl $-32, %esp
	; X32-AVX512VL-NEXT: subl $128, %esp			; X32-AVX512VL-NEXT: subl $128, %esp
	; X32-AVX512VL-NEXT: movl 8(%ebp), %eax			; X32-AVX512VL-NEXT: movl 8(%ebp), %eax
	; X32-AVX512VL-NEXT: vxorps %ymm0, %ymm0, %ymm0			; X32-AVX512VL-NEXT: vxorps %ymm0, %ymm0, %ymm0
	; X32-AVX512VL-NEXT: vmovaps %ymm0, (%esp)			; X32-AVX512VL-NEXT: vmovaps %ymm0, (%esp)
	; X32-AVX512VL-NEXT: vpbroadcastb (%eax), %ymm1			; X32-AVX512VL-NEXT: vpbroadcastb (%eax), %ymm1
	; X32-AVX512VL-NEXT: vmovaps %ymm0, {{[0-9]+}}(%esp)			; X32-AVX512VL-NEXT: vmovaps %ymm0, {{[0-9]+}}(%esp)
	; X32-AVX512VL-NEXT: vmovdqa32 %ymm1, {{[0-9]+}}(%esp)			; X32-AVX512VL-NEXT: vmovdqa %ymm1, {{[0-9]+}}(%esp)
	; X32-AVX512VL-NEXT: movl %ebp, %esp			; X32-AVX512VL-NEXT: movl %ebp, %esp
	; X32-AVX512VL-NEXT: popl %ebp			; X32-AVX512VL-NEXT: popl %ebp
	; X32-AVX512VL-NEXT: retl			; X32-AVX512VL-NEXT: retl
	;			;
	; X64-AVX512VL-LABEL: isel_crash_32b:			; X64-AVX512VL-LABEL: isel_crash_32b:
	; X64-AVX512VL: ## BB#0: ## %eintry			; X64-AVX512VL: ## BB#0: ## %eintry
	; X64-AVX512VL-NEXT: pushq %rbp			; X64-AVX512VL-NEXT: pushq %rbp
	; X64-AVX512VL-NEXT: Lcfi0:			; X64-AVX512VL-NEXT: Lcfi0:
	; X64-AVX512VL-NEXT: .cfi_def_cfa_offset 16			; X64-AVX512VL-NEXT: .cfi_def_cfa_offset 16
	; X64-AVX512VL-NEXT: Lcfi1:			; X64-AVX512VL-NEXT: Lcfi1:
	; X64-AVX512VL-NEXT: .cfi_offset %rbp, -16			; X64-AVX512VL-NEXT: .cfi_offset %rbp, -16
	; X64-AVX512VL-NEXT: movq %rsp, %rbp			; X64-AVX512VL-NEXT: movq %rsp, %rbp
	; X64-AVX512VL-NEXT: Lcfi2:			; X64-AVX512VL-NEXT: Lcfi2:
	; X64-AVX512VL-NEXT: .cfi_def_cfa_register %rbp			; X64-AVX512VL-NEXT: .cfi_def_cfa_register %rbp
	; X64-AVX512VL-NEXT: andq $-32, %rsp			; X64-AVX512VL-NEXT: andq $-32, %rsp
	; X64-AVX512VL-NEXT: subq $128, %rsp			; X64-AVX512VL-NEXT: subq $128, %rsp
	; X64-AVX512VL-NEXT: vxorps %ymm0, %ymm0, %ymm0			; X64-AVX512VL-NEXT: vxorps %ymm0, %ymm0, %ymm0
	; X64-AVX512VL-NEXT: vmovaps %ymm0, (%rsp)			; X64-AVX512VL-NEXT: vmovaps %ymm0, (%rsp)
	; X64-AVX512VL-NEXT: movb (%rdi), %al			; X64-AVX512VL-NEXT: movb (%rdi), %al
	; X64-AVX512VL-NEXT: vmovd %eax, %xmm1			; X64-AVX512VL-NEXT: vmovd %eax, %xmm1
	; X64-AVX512VL-NEXT: vpbroadcastb %xmm1, %ymm1			; X64-AVX512VL-NEXT: vpbroadcastb %xmm1, %ymm1
	; X64-AVX512VL-NEXT: vmovaps %ymm0, {{[0-9]+}}(%rsp)			; X64-AVX512VL-NEXT: vmovaps %ymm0, {{[0-9]+}}(%rsp)
	; X64-AVX512VL-NEXT: vmovdqa32 %ymm1, {{[0-9]+}}(%rsp)			; X64-AVX512VL-NEXT: vmovdqa %ymm1, {{[0-9]+}}(%rsp)
	; X64-AVX512VL-NEXT: movq %rbp, %rsp			; X64-AVX512VL-NEXT: movq %rbp, %rsp
	; X64-AVX512VL-NEXT: popq %rbp			; X64-AVX512VL-NEXT: popq %rbp
	; X64-AVX512VL-NEXT: retq			; X64-AVX512VL-NEXT: retq
	eintry:			eintry:
	%__a.addr.i = alloca <4 x i64>, align 16			%__a.addr.i = alloca <4 x i64>, align 16
	%__b.addr.i = alloca <4 x i64>, align 16			%__b.addr.i = alloca <4 x i64>, align 16
	%vCr = alloca <4 x i64>, align 16			%vCr = alloca <4 x i64>, align 16
	store <4 x i64> zeroinitializer, <4 x i64>* %vCr, align 16			store <4 x i64> zeroinitializer, <4 x i64>* %vCr, align 16
	[38 lines not shown]
	; X32-AVX512VL-NEXT: subl $60, %esp			; X32-AVX512VL-NEXT: subl $60, %esp
	; X32-AVX512VL-NEXT: Lcfi4:			; X32-AVX512VL-NEXT: Lcfi4:
	; X32-AVX512VL-NEXT: .cfi_def_cfa_offset 64			; X32-AVX512VL-NEXT: .cfi_def_cfa_offset 64
	; X32-AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512VL-NEXT: vxorps %xmm0, %xmm0, %xmm0			; X32-AVX512VL-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; X32-AVX512VL-NEXT: vmovaps %xmm0, (%esp)			; X32-AVX512VL-NEXT: vmovaps %xmm0, (%esp)
	; X32-AVX512VL-NEXT: vpbroadcastw (%eax), %xmm1			; X32-AVX512VL-NEXT: vpbroadcastw (%eax), %xmm1
	; X32-AVX512VL-NEXT: vmovaps %xmm0, {{[0-9]+}}(%esp)			; X32-AVX512VL-NEXT: vmovaps %xmm0, {{[0-9]+}}(%esp)
	; X32-AVX512VL-NEXT: vmovdqa32 %xmm1, {{[0-9]+}}(%esp)			; X32-AVX512VL-NEXT: vmovdqa %xmm1, {{[0-9]+}}(%esp)
	; X32-AVX512VL-NEXT: addl $60, %esp			; X32-AVX512VL-NEXT: addl $60, %esp
	; X32-AVX512VL-NEXT: retl			; X32-AVX512VL-NEXT: retl
	;			;
	; X64-AVX512VL-LABEL: isel_crash_8w:			; X64-AVX512VL-LABEL: isel_crash_8w:
	; X64-AVX512VL: ## BB#0: ## %entry			; X64-AVX512VL: ## BB#0: ## %entry
	; X64-AVX512VL-NEXT: vxorps %xmm0, %xmm0, %xmm0			; X64-AVX512VL-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; X64-AVX512VL-NEXT: vmovaps %xmm0, -{{[0-9]+}}(%rsp)			; X64-AVX512VL-NEXT: vmovaps %xmm0, -{{[0-9]+}}(%rsp)
	; X64-AVX512VL-NEXT: movw (%rdi), %ax			; X64-AVX512VL-NEXT: movw (%rdi), %ax
	; X64-AVX512VL-NEXT: vmovd %eax, %xmm1			; X64-AVX512VL-NEXT: vmovd %eax, %xmm1
	; X64-AVX512VL-NEXT: vpbroadcastw %xmm1, %xmm1			; X64-AVX512VL-NEXT: vpbroadcastw %xmm1, %xmm1
	; X64-AVX512VL-NEXT: vmovaps %xmm0, -{{[0-9]+}}(%rsp)			; X64-AVX512VL-NEXT: vmovaps %xmm0, -{{[0-9]+}}(%rsp)
	; X64-AVX512VL-NEXT: vmovdqa32 %xmm1, -{{[0-9]+}}(%rsp)			; X64-AVX512VL-NEXT: vmovdqa %xmm1, -{{[0-9]+}}(%rsp)
	; X64-AVX512VL-NEXT: retq			; X64-AVX512VL-NEXT: retq
	entry:			entry:
	%__a.addr.i = alloca <2 x i64>, align 16			%__a.addr.i = alloca <2 x i64>, align 16
	%__b.addr.i = alloca <2 x i64>, align 16			%__b.addr.i = alloca <2 x i64>, align 16
	%vCr = alloca <2 x i64>, align 16			%vCr = alloca <2 x i64>, align 16
	store <2 x i64> zeroinitializer, <2 x i64>* %vCr, align 16			store <2 x i64> zeroinitializer, <2 x i64>* %vCr, align 16
	%tmp = load <2 x i64>, <2 x i64>* %vCr, align 16			%tmp = load <2 x i64>, <2 x i64>* %vCr, align 16
	%tmp2 = load i16, i16* %cV_R.addr, align 4			%tmp2 = load i16, i16* %cV_R.addr, align 4
	[65 lines not shown]
	; X32-AVX512VL-NEXT: .cfi_def_cfa_register %ebp			; X32-AVX512VL-NEXT: .cfi_def_cfa_register %ebp
	; X32-AVX512VL-NEXT: andl $-32, %esp			; X32-AVX512VL-NEXT: andl $-32, %esp
	; X32-AVX512VL-NEXT: subl $128, %esp			; X32-AVX512VL-NEXT: subl $128, %esp
	; X32-AVX512VL-NEXT: movl 8(%ebp), %eax			; X32-AVX512VL-NEXT: movl 8(%ebp), %eax
	; X32-AVX512VL-NEXT: vxorps %ymm0, %ymm0, %ymm0			; X32-AVX512VL-NEXT: vxorps %ymm0, %ymm0, %ymm0
	; X32-AVX512VL-NEXT: vmovaps %ymm0, (%esp)			; X32-AVX512VL-NEXT: vmovaps %ymm0, (%esp)
	; X32-AVX512VL-NEXT: vpbroadcastw (%eax), %ymm1			; X32-AVX512VL-NEXT: vpbroadcastw (%eax), %ymm1
	; X32-AVX512VL-NEXT: vmovaps %ymm0, {{[0-9]+}}(%esp)			; X32-AVX512VL-NEXT: vmovaps %ymm0, {{[0-9]+}}(%esp)
	; X32-AVX512VL-NEXT: vmovdqa32 %ymm1, {{[0-9]+}}(%esp)			; X32-AVX512VL-NEXT: vmovdqa %ymm1, {{[0-9]+}}(%esp)
	; X32-AVX512VL-NEXT: movl %ebp, %esp			; X32-AVX512VL-NEXT: movl %ebp, %esp
	; X32-AVX512VL-NEXT: popl %ebp			; X32-AVX512VL-NEXT: popl %ebp
	; X32-AVX512VL-NEXT: retl			; X32-AVX512VL-NEXT: retl
	;			;
	; X64-AVX512VL-LABEL: isel_crash_16w:			; X64-AVX512VL-LABEL: isel_crash_16w:
	; X64-AVX512VL: ## BB#0: ## %eintry			; X64-AVX512VL: ## BB#0: ## %eintry
	; X64-AVX512VL-NEXT: pushq %rbp			; X64-AVX512VL-NEXT: pushq %rbp
	; X64-AVX512VL-NEXT: Lcfi3:			; X64-AVX512VL-NEXT: Lcfi3:
	; X64-AVX512VL-NEXT: .cfi_def_cfa_offset 16			; X64-AVX512VL-NEXT: .cfi_def_cfa_offset 16
	; X64-AVX512VL-NEXT: Lcfi4:			; X64-AVX512VL-NEXT: Lcfi4:
	; X64-AVX512VL-NEXT: .cfi_offset %rbp, -16			; X64-AVX512VL-NEXT: .cfi_offset %rbp, -16
	; X64-AVX512VL-NEXT: movq %rsp, %rbp			; X64-AVX512VL-NEXT: movq %rsp, %rbp
	; X64-AVX512VL-NEXT: Lcfi5:			; X64-AVX512VL-NEXT: Lcfi5:
	; X64-AVX512VL-NEXT: .cfi_def_cfa_register %rbp			; X64-AVX512VL-NEXT: .cfi_def_cfa_register %rbp
	; X64-AVX512VL-NEXT: andq $-32, %rsp			; X64-AVX512VL-NEXT: andq $-32, %rsp
	; X64-AVX512VL-NEXT: subq $128, %rsp			; X64-AVX512VL-NEXT: subq $128, %rsp
	; X64-AVX512VL-NEXT: vxorps %ymm0, %ymm0, %ymm0			; X64-AVX512VL-NEXT: vxorps %ymm0, %ymm0, %ymm0
	; X64-AVX512VL-NEXT: vmovaps %ymm0, (%rsp)			; X64-AVX512VL-NEXT: vmovaps %ymm0, (%rsp)
	; X64-AVX512VL-NEXT: movw (%rdi), %ax			; X64-AVX512VL-NEXT: movw (%rdi), %ax
	; X64-AVX512VL-NEXT: vmovd %eax, %xmm1			; X64-AVX512VL-NEXT: vmovd %eax, %xmm1
	; X64-AVX512VL-NEXT: vpbroadcastw %xmm1, %ymm1			; X64-AVX512VL-NEXT: vpbroadcastw %xmm1, %ymm1
	; X64-AVX512VL-NEXT: vmovaps %ymm0, {{[0-9]+}}(%rsp)			; X64-AVX512VL-NEXT: vmovaps %ymm0, {{[0-9]+}}(%rsp)
	; X64-AVX512VL-NEXT: vmovdqa32 %ymm1, {{[0-9]+}}(%rsp)			; X64-AVX512VL-NEXT: vmovdqa %ymm1, {{[0-9]+}}(%rsp)
	; X64-AVX512VL-NEXT: movq %rbp, %rsp			; X64-AVX512VL-NEXT: movq %rbp, %rsp
	; X64-AVX512VL-NEXT: popq %rbp			; X64-AVX512VL-NEXT: popq %rbp
	; X64-AVX512VL-NEXT: retq			; X64-AVX512VL-NEXT: retq
	eintry:			eintry:
	%__a.addr.i = alloca <4 x i64>, align 16			%__a.addr.i = alloca <4 x i64>, align 16
	%__b.addr.i = alloca <4 x i64>, align 16			%__b.addr.i = alloca <4 x i64>, align 16
	%vCr = alloca <4 x i64>, align 16			%vCr = alloca <4 x i64>, align 16
	store <4 x i64> zeroinitializer, <4 x i64>* %vCr, align 16			store <4 x i64> zeroinitializer, <4 x i64>* %vCr, align 16
	[204 lines not shown]
	; X32-AVX512VL-NEXT: vmovaps %xmm0, (%esp)			; X32-AVX512VL-NEXT: vmovaps %xmm0, (%esp)
	; X32-AVX512VL-NEXT: movl (%eax), %ecx			; X32-AVX512VL-NEXT: movl (%eax), %ecx
	; X32-AVX512VL-NEXT: movl 4(%eax), %eax			; X32-AVX512VL-NEXT: movl 4(%eax), %eax
	; X32-AVX512VL-NEXT: vmovd %ecx, %xmm1			; X32-AVX512VL-NEXT: vmovd %ecx, %xmm1
	; X32-AVX512VL-NEXT: vpinsrd $1, %eax, %xmm1, %xmm1			; X32-AVX512VL-NEXT: vpinsrd $1, %eax, %xmm1, %xmm1
	; X32-AVX512VL-NEXT: vpinsrd $2, %ecx, %xmm1, %xmm1			; X32-AVX512VL-NEXT: vpinsrd $2, %ecx, %xmm1, %xmm1
	; X32-AVX512VL-NEXT: vpinsrd $3, %eax, %xmm1, %xmm1			; X32-AVX512VL-NEXT: vpinsrd $3, %eax, %xmm1, %xmm1
	; X32-AVX512VL-NEXT: vmovaps %xmm0, {{[0-9]+}}(%esp)			; X32-AVX512VL-NEXT: vmovaps %xmm0, {{[0-9]+}}(%esp)
	; X32-AVX512VL-NEXT: vmovdqa32 %xmm1, {{[0-9]+}}(%esp)			; X32-AVX512VL-NEXT: vmovdqa %xmm1, {{[0-9]+}}(%esp)
	; X32-AVX512VL-NEXT: addl $60, %esp			; X32-AVX512VL-NEXT: addl $60, %esp
	; X32-AVX512VL-NEXT: retl			; X32-AVX512VL-NEXT: retl
	;			;
	; X64-AVX512VL-LABEL: isel_crash_2q:			; X64-AVX512VL-LABEL: isel_crash_2q:
	; X64-AVX512VL: ## BB#0: ## %entry			; X64-AVX512VL: ## BB#0: ## %entry
	; X64-AVX512VL-NEXT: vxorps %xmm0, %xmm0, %xmm0			; X64-AVX512VL-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; X64-AVX512VL-NEXT: vmovaps %xmm0, -{{[0-9]+}}(%rsp)			; X64-AVX512VL-NEXT: vmovaps %xmm0, -{{[0-9]+}}(%rsp)
	; X64-AVX512VL-NEXT: movq (%rdi), %rax			; X64-AVX512VL-NEXT: movq (%rdi), %rax
	[87 lines not shown]
	; X32-AVX512VL-NEXT: movl (%eax), %ecx			; X32-AVX512VL-NEXT: movl (%eax), %ecx
	; X32-AVX512VL-NEXT: movl 4(%eax), %eax			; X32-AVX512VL-NEXT: movl 4(%eax), %eax
	; X32-AVX512VL-NEXT: vmovd %ecx, %xmm1			; X32-AVX512VL-NEXT: vmovd %ecx, %xmm1
	; X32-AVX512VL-NEXT: vpinsrd $1, %eax, %xmm1, %xmm1			; X32-AVX512VL-NEXT: vpinsrd $1, %eax, %xmm1, %xmm1
	; X32-AVX512VL-NEXT: vpinsrd $2, %ecx, %xmm1, %xmm1			; X32-AVX512VL-NEXT: vpinsrd $2, %ecx, %xmm1, %xmm1
	; X32-AVX512VL-NEXT: vpinsrd $3, %eax, %xmm1, %xmm1			; X32-AVX512VL-NEXT: vpinsrd $3, %eax, %xmm1, %xmm1
	; X32-AVX512VL-NEXT: vinserti32x4 $1, %xmm1, %ymm1, %ymm1			; X32-AVX512VL-NEXT: vinserti32x4 $1, %xmm1, %ymm1, %ymm1
	; X32-AVX512VL-NEXT: vmovaps %ymm0, {{[0-9]+}}(%esp)			; X32-AVX512VL-NEXT: vmovaps %ymm0, {{[0-9]+}}(%esp)
	; X32-AVX512VL-NEXT: vmovdqa32 %ymm1, {{[0-9]+}}(%esp)			; X32-AVX512VL-NEXT: vmovdqa %ymm1, {{[0-9]+}}(%esp)
	; X32-AVX512VL-NEXT: movl %ebp, %esp			; X32-AVX512VL-NEXT: movl %ebp, %esp
	; X32-AVX512VL-NEXT: popl %ebp			; X32-AVX512VL-NEXT: popl %ebp
	; X32-AVX512VL-NEXT: retl			; X32-AVX512VL-NEXT: retl
	;			;
	; X64-AVX512VL-LABEL: isel_crash_4q:			; X64-AVX512VL-LABEL: isel_crash_4q:
	; X64-AVX512VL: ## BB#0: ## %eintry			; X64-AVX512VL: ## BB#0: ## %eintry
	; X64-AVX512VL-NEXT: pushq %rbp			; X64-AVX512VL-NEXT: pushq %rbp
	; X64-AVX512VL-NEXT: Lcfi9:			; X64-AVX512VL-NEXT: Lcfi9:
	[30 lines not shown]

llvm/trunk/test/CodeGen/X86/avx512-arith.ll

	[718 lines not shown]
	; AVX512F-NEXT: ## kill: %YMM3<def> %YMM3<kill> %ZMM3<def>			; AVX512F-NEXT: ## kill: %YMM3<def> %YMM3<kill> %ZMM3<def>
	; AVX512F-NEXT: vpxor %ymm4, %ymm4, %ymm4			; AVX512F-NEXT: vpxor %ymm4, %ymm4, %ymm4
	; AVX512F-NEXT: vpcmpneqd %zmm4, %zmm3, %k1			; AVX512F-NEXT: vpcmpneqd %zmm4, %zmm3, %k1
	; AVX512F-NEXT: vminpd %zmm2, %zmm1, %zmm0 {%k1}			; AVX512F-NEXT: vminpd %zmm2, %zmm1, %zmm0 {%k1}
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; AVX512VL-LABEL: test_mask_vminpd:			; AVX512VL-LABEL: test_mask_vminpd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpxord %ymm4, %ymm4, %ymm4			; AVX512VL-NEXT: vpxor %ymm4, %ymm4, %ymm4
	; AVX512VL-NEXT: vpcmpneqd %ymm4, %ymm3, %k1			; AVX512VL-NEXT: vpcmpneqd %ymm4, %ymm3, %k1
	; AVX512VL-NEXT: vminpd %zmm2, %zmm1, %zmm0 {%k1}			; AVX512VL-NEXT: vminpd %zmm2, %zmm1, %zmm0 {%k1}
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512BW-LABEL: test_mask_vminpd:			; AVX512BW-LABEL: test_mask_vminpd:
	; AVX512BW: ## BB#0:			; AVX512BW: ## BB#0:
	; AVX512BW-NEXT: ## kill: %YMM3<def> %YMM3<kill> %ZMM3<def>			; AVX512BW-NEXT: ## kill: %YMM3<def> %YMM3<kill> %ZMM3<def>
	; AVX512BW-NEXT: vpxor %ymm4, %ymm4, %ymm4			; AVX512BW-NEXT: vpxor %ymm4, %ymm4, %ymm4
	; AVX512BW-NEXT: vpcmpneqd %zmm4, %zmm3, %k1			; AVX512BW-NEXT: vpcmpneqd %zmm4, %zmm3, %k1
	; AVX512BW-NEXT: vminpd %zmm2, %zmm1, %zmm0 {%k1}			; AVX512BW-NEXT: vminpd %zmm2, %zmm1, %zmm0 {%k1}
	; AVX512BW-NEXT: retq			; AVX512BW-NEXT: retq
	;			;
	; AVX512DQ-LABEL: test_mask_vminpd:			; AVX512DQ-LABEL: test_mask_vminpd:
	; AVX512DQ: ## BB#0:			; AVX512DQ: ## BB#0:
	; AVX512DQ-NEXT: ## kill: %YMM3<def> %YMM3<kill> %ZMM3<def>			; AVX512DQ-NEXT: ## kill: %YMM3<def> %YMM3<kill> %ZMM3<def>
	; AVX512DQ-NEXT: vpxor %ymm4, %ymm4, %ymm4			; AVX512DQ-NEXT: vpxor %ymm4, %ymm4, %ymm4
	; AVX512DQ-NEXT: vpcmpneqd %zmm4, %zmm3, %k1			; AVX512DQ-NEXT: vpcmpneqd %zmm4, %zmm3, %k1
	; AVX512DQ-NEXT: vminpd %zmm2, %zmm1, %zmm0 {%k1}			; AVX512DQ-NEXT: vminpd %zmm2, %zmm1, %zmm0 {%k1}
	; AVX512DQ-NEXT: retq			; AVX512DQ-NEXT: retq
	;			;
	; SKX-LABEL: test_mask_vminpd:			; SKX-LABEL: test_mask_vminpd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %ymm4, %ymm4, %ymm4			; SKX-NEXT: vpxor %ymm4, %ymm4, %ymm4
	; SKX-NEXT: vpcmpneqd %ymm4, %ymm3, %k1			; SKX-NEXT: vpcmpneqd %ymm4, %ymm3, %k1
	; SKX-NEXT: vminpd %zmm2, %zmm1, %zmm0 {%k1}			; SKX-NEXT: vminpd %zmm2, %zmm1, %zmm0 {%k1}
	; SKX-NEXT: retq			; SKX-NEXT: retq
	<8 x double> %j, <8 x i32> %mask1)			<8 x double> %j, <8 x i32> %mask1)
	nounwind readnone {			nounwind readnone {
	%mask = icmp ne <8 x i32> %mask1, zeroinitializer			%mask = icmp ne <8 x i32> %mask1, zeroinitializer
	%cmp_res = fcmp olt <8 x double> %i, %j			%cmp_res = fcmp olt <8 x double> %i, %j
	%min = select <8 x i1> %cmp_res, <8 x double> %i, <8 x double> %j			%min = select <8 x i1> %cmp_res, <8 x double> %i, <8 x double> %j
	[23 lines not shown]
	; AVX512F-NEXT: ## kill: %YMM3<def> %YMM3<kill> %ZMM3<def>			; AVX512F-NEXT: ## kill: %YMM3<def> %YMM3<kill> %ZMM3<def>
	; AVX512F-NEXT: vpxor %ymm4, %ymm4, %ymm4			; AVX512F-NEXT: vpxor %ymm4, %ymm4, %ymm4
	; AVX512F-NEXT: vpcmpneqd %zmm4, %zmm3, %k1			; AVX512F-NEXT: vpcmpneqd %zmm4, %zmm3, %k1
	; AVX512F-NEXT: vmaxpd %zmm2, %zmm1, %zmm0 {%k1}			; AVX512F-NEXT: vmaxpd %zmm2, %zmm1, %zmm0 {%k1}
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; AVX512VL-LABEL: test_mask_vmaxpd:			; AVX512VL-LABEL: test_mask_vmaxpd:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpxord %ymm4, %ymm4, %ymm4			; AVX512VL-NEXT: vpxor %ymm4, %ymm4, %ymm4
	; AVX512VL-NEXT: vpcmpneqd %ymm4, %ymm3, %k1			; AVX512VL-NEXT: vpcmpneqd %ymm4, %ymm3, %k1
	; AVX512VL-NEXT: vmaxpd %zmm2, %zmm1, %zmm0 {%k1}			; AVX512VL-NEXT: vmaxpd %zmm2, %zmm1, %zmm0 {%k1}
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512BW-LABEL: test_mask_vmaxpd:			; AVX512BW-LABEL: test_mask_vmaxpd:
	; AVX512BW: ## BB#0:			; AVX512BW: ## BB#0:
	; AVX512BW-NEXT: ## kill: %YMM3<def> %YMM3<kill> %ZMM3<def>			; AVX512BW-NEXT: ## kill: %YMM3<def> %YMM3<kill> %ZMM3<def>
	; AVX512BW-NEXT: vpxor %ymm4, %ymm4, %ymm4			; AVX512BW-NEXT: vpxor %ymm4, %ymm4, %ymm4
	; AVX512BW-NEXT: vpcmpneqd %zmm4, %zmm3, %k1			; AVX512BW-NEXT: vpcmpneqd %zmm4, %zmm3, %k1
	; AVX512BW-NEXT: vmaxpd %zmm2, %zmm1, %zmm0 {%k1}			; AVX512BW-NEXT: vmaxpd %zmm2, %zmm1, %zmm0 {%k1}
	; AVX512BW-NEXT: retq			; AVX512BW-NEXT: retq
	;			;
	; AVX512DQ-LABEL: test_mask_vmaxpd:			; AVX512DQ-LABEL: test_mask_vmaxpd:
	; AVX512DQ: ## BB#0:			; AVX512DQ: ## BB#0:
	; AVX512DQ-NEXT: ## kill: %YMM3<def> %YMM3<kill> %ZMM3<def>			; AVX512DQ-NEXT: ## kill: %YMM3<def> %YMM3<kill> %ZMM3<def>
	; AVX512DQ-NEXT: vpxor %ymm4, %ymm4, %ymm4			; AVX512DQ-NEXT: vpxor %ymm4, %ymm4, %ymm4
	; AVX512DQ-NEXT: vpcmpneqd %zmm4, %zmm3, %k1			; AVX512DQ-NEXT: vpcmpneqd %zmm4, %zmm3, %k1
	; AVX512DQ-NEXT: vmaxpd %zmm2, %zmm1, %zmm0 {%k1}			; AVX512DQ-NEXT: vmaxpd %zmm2, %zmm1, %zmm0 {%k1}
	; AVX512DQ-NEXT: retq			; AVX512DQ-NEXT: retq
	;			;
	; SKX-LABEL: test_mask_vmaxpd:			; SKX-LABEL: test_mask_vmaxpd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %ymm4, %ymm4, %ymm4			; SKX-NEXT: vpxor %ymm4, %ymm4, %ymm4
	; SKX-NEXT: vpcmpneqd %ymm4, %ymm3, %k1			; SKX-NEXT: vpcmpneqd %ymm4, %ymm3, %k1
	; SKX-NEXT: vmaxpd %zmm2, %zmm1, %zmm0 {%k1}			; SKX-NEXT: vmaxpd %zmm2, %zmm1, %zmm0 {%k1}
	; SKX-NEXT: retq			; SKX-NEXT: retq
	<8 x double> %j, <8 x i32> %mask1)			<8 x double> %j, <8 x i32> %mask1)
	nounwind readnone {			nounwind readnone {
	%mask = icmp ne <8 x i32> %mask1, zeroinitializer			%mask = icmp ne <8 x i32> %mask1, zeroinitializer
	%cmp_res = fcmp ogt <8 x double> %i, %j			%cmp_res = fcmp ogt <8 x double> %i, %j
	%max = select <8 x i1> %cmp_res, <8 x double> %i, <8 x double> %j			%max = select <8 x i1> %cmp_res, <8 x double> %i, <8 x double> %j
	[267 lines not shown]

llvm/trunk/test/CodeGen/X86/avx512-cvt.ll

	; KNL-NEXT: vpbroadcastq {{.*}}(%rip), %zmm0 {%k1} {z}			; KNL-NEXT: vpbroadcastq {{.*}}(%rip), %zmm0 {%k1} {z}
	; KNL-NEXT: vpmovqd %zmm0, %ymm0			; KNL-NEXT: vpmovqd %zmm0, %ymm0
	; KNL-NEXT: vcvtudq2ps %zmm0, %zmm0			; KNL-NEXT: vcvtudq2ps %zmm0, %zmm0
	; KNL-NEXT: ## kill: %YMM0<def> %YMM0<kill> %ZMM0<kill>			; KNL-NEXT: ## kill: %YMM0<def> %YMM0<kill> %ZMM0<kill>
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	; SKX-LABEL: uitofp_8i1_float:			; SKX-LABEL: uitofp_8i1_float:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %ymm1, %ymm1, %ymm1			; SKX-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; SKX-NEXT: vpcmpgtd %ymm0, %ymm1, %k1			; SKX-NEXT: vpcmpgtd %ymm0, %ymm1, %k1
	; SKX-NEXT: vpbroadcastd {{.*}}(%rip), %ymm0 {%k1} {z}			; SKX-NEXT: vpbroadcastd {{.*}}(%rip), %ymm0 {%k1} {z}
	; SKX-NEXT: vcvtudq2ps %ymm0, %ymm0			; SKX-NEXT: vcvtudq2ps %ymm0, %ymm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp slt <8 x i32> %a, zeroinitializer			%mask = icmp slt <8 x i32> %a, zeroinitializer
	%1 = uitofp <8 x i1> %mask to <8 x float>			%1 = uitofp <8 x i1> %mask to <8 x float>
	ret <8 x float> %1			ret <8 x float> %1
	}			}

	define <8 x double> @uitofp_8i1_double(<8 x i32> %a) {			define <8 x double> @uitofp_8i1_double(<8 x i32> %a) {
	; KNL-LABEL: uitofp_8i1_double:			; KNL-LABEL: uitofp_8i1_double:
	; KNL: ## BB#0:			; KNL: ## BB#0:
	; KNL-NEXT: ## kill: %YMM0<def> %YMM0<kill> %ZMM0<def>			; KNL-NEXT: ## kill: %YMM0<def> %YMM0<kill> %ZMM0<def>
	; KNL-NEXT: vpxor %ymm1, %ymm1, %ymm1			; KNL-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; KNL-NEXT: vpcmpgtd %zmm0, %zmm1, %k1			; KNL-NEXT: vpcmpgtd %zmm0, %zmm1, %k1
	; KNL-NEXT: vpbroadcastq {{.*}}(%rip), %zmm0 {%k1} {z}			; KNL-NEXT: vpbroadcastq {{.*}}(%rip), %zmm0 {%k1} {z}
	; KNL-NEXT: vpmovqd %zmm0, %ymm0			; KNL-NEXT: vpmovqd %zmm0, %ymm0
	; KNL-NEXT: vcvtudq2pd %ymm0, %zmm0			; KNL-NEXT: vcvtudq2pd %ymm0, %zmm0
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	; SKX-LABEL: uitofp_8i1_double:			; SKX-LABEL: uitofp_8i1_double:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %ymm1, %ymm1, %ymm1			; SKX-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; SKX-NEXT: vpcmpgtd %ymm0, %ymm1, %k1			; SKX-NEXT: vpcmpgtd %ymm0, %ymm1, %k1
	; SKX-NEXT: vpbroadcastd {{.*}}(%rip), %ymm0 {%k1} {z}			; SKX-NEXT: vpbroadcastd {{.*}}(%rip), %ymm0 {%k1} {z}
	; SKX-NEXT: vcvtudq2pd %ymm0, %zmm0			; SKX-NEXT: vcvtudq2pd %ymm0, %zmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp slt <8 x i32> %a, zeroinitializer			%mask = icmp slt <8 x i32> %a, zeroinitializer
	%1 = uitofp <8 x i1> %mask to <8 x double>			%1 = uitofp <8 x i1> %mask to <8 x double>
	ret <8 x double> %1			ret <8 x double> %1
	}			}

	define <4 x float> @uitofp_4i1_float(<4 x i32> %a) {			define <4 x float> @uitofp_4i1_float(<4 x i32> %a) {
	; KNL-LABEL: uitofp_4i1_float:			; KNL-LABEL: uitofp_4i1_float:
	; KNL: ## BB#0:			; KNL: ## BB#0:
	; KNL-NEXT: vpxor %xmm1, %xmm1, %xmm1			; KNL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; KNL-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0			; KNL-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
	; KNL-NEXT: vpbroadcastd {{.*}}(%rip), %xmm1			; KNL-NEXT: vpbroadcastd {{.*}}(%rip), %xmm1
	; KNL-NEXT: vpand %xmm1, %xmm0, %xmm0			; KNL-NEXT: vpand %xmm1, %xmm0, %xmm0
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	; SKX-LABEL: uitofp_4i1_float:			; SKX-LABEL: uitofp_4i1_float:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm1, %xmm1, %xmm1			; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; SKX-NEXT: vpcmpgtd %xmm0, %xmm1, %k1			; SKX-NEXT: vpcmpgtd %xmm0, %xmm1, %k1
	; SKX-NEXT: vpbroadcastd {{.*}}(%rip), %xmm0 {%k1} {z}			; SKX-NEXT: vpbroadcastd {{.*}}(%rip), %xmm0 {%k1} {z}
	; SKX-NEXT: vcvtudq2ps %xmm0, %xmm0			; SKX-NEXT: vcvtudq2ps %xmm0, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp slt <4 x i32> %a, zeroinitializer			%mask = icmp slt <4 x i32> %a, zeroinitializer
	%1 = uitofp <4 x i1> %mask to <4 x float>			%1 = uitofp <4 x i1> %mask to <4 x float>
	ret <4 x float> %1			ret <4 x float> %1
	}			}

	define <4 x double> @uitofp_4i1_double(<4 x i32> %a) {			define <4 x double> @uitofp_4i1_double(<4 x i32> %a) {
	; KNL-LABEL: uitofp_4i1_double:			; KNL-LABEL: uitofp_4i1_double:
	; KNL: ## BB#0:			; KNL: ## BB#0:
	; KNL-NEXT: vpxor %xmm1, %xmm1, %xmm1			; KNL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; KNL-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0			; KNL-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
	; KNL-NEXT: vpsrld $31, %xmm0, %xmm0			; KNL-NEXT: vpsrld $31, %xmm0, %xmm0
	; KNL-NEXT: vcvtdq2pd %xmm0, %ymm0			; KNL-NEXT: vcvtdq2pd %xmm0, %ymm0
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	; SKX-LABEL: uitofp_4i1_double:			; SKX-LABEL: uitofp_4i1_double:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm1, %xmm1, %xmm1			; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; SKX-NEXT: vpcmpgtd %xmm0, %xmm1, %k1			; SKX-NEXT: vpcmpgtd %xmm0, %xmm1, %k1
	; SKX-NEXT: vpbroadcastd {{.*}}(%rip), %xmm0 {%k1} {z}			; SKX-NEXT: vpbroadcastd {{.*}}(%rip), %xmm0 {%k1} {z}
	; SKX-NEXT: vcvtudq2pd %xmm0, %ymm0			; SKX-NEXT: vcvtudq2pd %xmm0, %ymm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp slt <4 x i32> %a, zeroinitializer			%mask = icmp slt <4 x i32> %a, zeroinitializer
	%1 = uitofp <4 x i1> %mask to <4 x double>			%1 = uitofp <4 x i1> %mask to <4 x double>
	ret <4 x double> %1			ret <4 x double> %1
	}			}
	; KNL-NEXT: vmovq %xmm0, %rax			; KNL-NEXT: vmovq %xmm0, %rax
	; KNL-NEXT: andl $1, %eax			; KNL-NEXT: andl $1, %eax
	; KNL-NEXT: vcvtsi2ssl %eax, %xmm2, %xmm0			; KNL-NEXT: vcvtsi2ssl %eax, %xmm2, %xmm0
	; KNL-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[2,3]			; KNL-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[2,3]
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	; SKX-LABEL: uitofp_2i1_float:			; SKX-LABEL: uitofp_2i1_float:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm1, %xmm1, %xmm1			; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]			; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]
	; SKX-NEXT: vpcmpltuq %xmm1, %xmm0, %k1			; SKX-NEXT: vpcmpltuq %xmm1, %xmm0, %k1
	; SKX-NEXT: vpbroadcastd {{.*}}(%rip), %xmm0 {%k1} {z}			; SKX-NEXT: vpbroadcastd {{.*}}(%rip), %xmm0 {%k1} {z}
	; SKX-NEXT: vcvtudq2ps %xmm0, %xmm0			; SKX-NEXT: vcvtudq2ps %xmm0, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp ult <2 x i32> %a, zeroinitializer			%mask = icmp ult <2 x i32> %a, zeroinitializer
	%1 = uitofp <2 x i1> %mask to <2 x float>			%1 = uitofp <2 x i1> %mask to <2 x float>
	ret <2 x float> %1			ret <2 x float> %1
	}			}

	define <2 x double> @uitofp_2i1_double(<2 x i32> %a) {			define <2 x double> @uitofp_2i1_double(<2 x i32> %a) {
	; KNL-LABEL: uitofp_2i1_double:			; KNL-LABEL: uitofp_2i1_double:
	; KNL: ## BB#0:			; KNL: ## BB#0:
	; KNL-NEXT: vpxor %xmm1, %xmm1, %xmm1			; KNL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; KNL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]			; KNL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]
	; KNL-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808]			; KNL-NEXT: vmovdqa {{.*#+}} xmm1 = [9223372036854775808,9223372036854775808]
	; KNL-NEXT: vpxor %xmm1, %xmm0, %xmm0			; KNL-NEXT: vpxor %xmm1, %xmm0, %xmm0
	; KNL-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0			; KNL-NEXT: vpcmpgtq %xmm0, %xmm1, %xmm0
	; KNL-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm0			; KNL-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm0
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	; SKX-LABEL: uitofp_2i1_double:			; SKX-LABEL: uitofp_2i1_double:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm1, %xmm1, %xmm1			; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]			; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]
	; SKX-NEXT: vpcmpltuq %xmm1, %xmm0, %k1			; SKX-NEXT: vpcmpltuq %xmm1, %xmm0, %k1
	; SKX-NEXT: vmovdqa64 {{.*}}(%rip), %xmm0 {%k1} {z}			; SKX-NEXT: vmovdqa64 {{.*}}(%rip), %xmm0 {%k1} {z}
	; SKX-NEXT: vcvtuqq2pd %xmm0, %xmm0			; SKX-NEXT: vcvtuqq2pd %xmm0, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp ult <2 x i32> %a, zeroinitializer			%mask = icmp ult <2 x i32> %a, zeroinitializer
	%1 = uitofp <2 x i1> %mask to <2 x double>			%1 = uitofp <2 x i1> %mask to <2 x double>
	ret <2 x double> %1			ret <2 x double> %1
	}			}

llvm/trunk/test/CodeGen/X86/avx512-ext.ll

	; KNL-NEXT: vpand %xmm2, %xmm1, %xmm1			; KNL-NEXT: vpand %xmm2, %xmm1, %xmm1
	; KNL-NEXT: vpand %xmm2, %xmm0, %xmm0			; KNL-NEXT: vpand %xmm2, %xmm0, %xmm0
	; KNL-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0			; KNL-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
	; KNL-NEXT: vpsrld $31, %xmm0, %xmm0			; KNL-NEXT: vpsrld $31, %xmm0, %xmm0
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	; SKX-LABEL: zext_4xi1_to_4x32:			; SKX-LABEL: zext_4xi1_to_4x32:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vmovdqa64 {{.*#+}} xmm2 = [255,0,0,0,255,0,0,0,255,0,0,0,255,0,0,0]			; SKX-NEXT: vmovdqa {{.*#+}} xmm2 = [255,0,0,0,255,0,0,0,255,0,0,0,255,0,0,0]
	; SKX-NEXT: vpandq %xmm2, %xmm1, %xmm1			; SKX-NEXT: vpand %xmm2, %xmm1, %xmm1
	; SKX-NEXT: vpandq %xmm2, %xmm0, %xmm0			; SKX-NEXT: vpand %xmm2, %xmm0, %xmm0
	; SKX-NEXT: vpcmpeqd %xmm1, %xmm0, %k1			; SKX-NEXT: vpcmpeqd %xmm1, %xmm0, %k1
	; SKX-NEXT: vpbroadcastd {{.*}}(%rip), %xmm0 {%k1} {z}			; SKX-NEXT: vpbroadcastd {{.*}}(%rip), %xmm0 {%k1} {z}
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <4 x i8> %x, %y			%mask = icmp eq <4 x i8> %x, %y
	%1 = zext <4 x i1> %mask to <4 x i32>			%1 = zext <4 x i1> %mask to <4 x i32>
	ret <4 x i32> %1			ret <4 x i32> %1
	}			}

	define <2 x i64> @zext_2xi1_to_2xi64(<2 x i8> %x, <2 x i8> %y) #0 {			define <2 x i64> @zext_2xi1_to_2xi64(<2 x i8> %x, <2 x i8> %y) #0 {
	; KNL-LABEL: zext_2xi1_to_2xi64:			; KNL-LABEL: zext_2xi1_to_2xi64:
	; KNL: ## BB#0:			; KNL: ## BB#0:
	; KNL-NEXT: vmovdqa {{.*#+}} xmm2 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0]			; KNL-NEXT: vmovdqa {{.*#+}} xmm2 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0]
	; KNL-NEXT: vpand %xmm2, %xmm1, %xmm1			; KNL-NEXT: vpand %xmm2, %xmm1, %xmm1
	; KNL-NEXT: vpand %xmm2, %xmm0, %xmm0			; KNL-NEXT: vpand %xmm2, %xmm0, %xmm0
	; KNL-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0			; KNL-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
	; KNL-NEXT: vpsrlq $63, %xmm0, %xmm0			; KNL-NEXT: vpsrlq $63, %xmm0, %xmm0
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	; SKX-LABEL: zext_2xi1_to_2xi64:			; SKX-LABEL: zext_2xi1_to_2xi64:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vmovdqa64 {{.*#+}} xmm2 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0]			; SKX-NEXT: vmovdqa {{.*#+}} xmm2 = [255,0,0,0,0,0,0,0,255,0,0,0,0,0,0,0]
	; SKX-NEXT: vpandq %xmm2, %xmm1, %xmm1			; SKX-NEXT: vpand %xmm2, %xmm1, %xmm1
	; SKX-NEXT: vpandq %xmm2, %xmm0, %xmm0			; SKX-NEXT: vpand %xmm2, %xmm0, %xmm0
	; SKX-NEXT: vpcmpeqq %xmm1, %xmm0, %k1			; SKX-NEXT: vpcmpeqq %xmm1, %xmm0, %k1
	; SKX-NEXT: vmovdqa64 {{.*}}(%rip), %xmm0 {%k1} {z}			; SKX-NEXT: vmovdqa64 {{.*}}(%rip), %xmm0 {%k1} {z}
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <2 x i8> %x, %y			%mask = icmp eq <2 x i8> %x, %y
	%1 = zext <2 x i1> %mask to <2 x i64>			%1 = zext <2 x i1> %mask to <2 x i64>
	ret <2 x i64> %1			ret <2 x i64> %1
	}			}
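
The SKX check lines above change only in mnemonic (`vpxord` to `vpxor`, `vmovdqa64` to `vmovdqa`, `vpandq` to `vpand`), while instructions that use a `%k` mask or a `%zmm` register are left untouched. The following is a minimal Python sketch of that selection rule as the patch summary describes it. The table `EVEX_TO_VEX`, the function `compress`, and the string-based operand model are all hypothetical illustrations; the actual pass works on machine opcodes and its table is far larger.

```python
import re

# Hypothetical excerpt: only mnemonic pairs that are visible in the test diffs.
EVEX_TO_VEX = {
    "vpxord": "vpxor",
    "vmovdqa64": "vmovdqa",
    "vpandq": "vpand",
    "vmovdqu8": "vmovdqu",
}

# Matches AT&T-syntax vector registers such as %xmm0, %ymm15, %zmm3.
_VEC_REG = re.compile(r"%([xyz])mm(\d+)")


def compress(mnemonic, operands):
    """Return the VEX mnemonic when the simplified rule allows it.

    The rule modelled here (an assumption based on the patch summary):
    no mask register, no zmm register, and only registers 0-15.
    Otherwise the original mnemonic is returned unchanged.
    """
    text = " ".join(operands)
    if "%k" in text:  # a mask register forces the EVEX form
        return mnemonic
    for width, index in _VEC_REG.findall(text):
        if width == "z" or int(index) > 15:
            return mnemonic
    return EVEX_TO_VEX.get(mnemonic, mnemonic)
```

For example, `compress("vpandq", ["%xmm2", "%xmm1", "%xmm1"])` yields `"vpand"`, whereas the masked load `vmovdqa64 {{.*}}(%rip), %xmm0 {%k1} {z}` keeps its EVEX mnemonic.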

llvm/trunk/test/CodeGen/X86/avx512-gather-scatter-intrin.ll

	}			}

	declare <4 x i64> @llvm.x86.avx512.gather3div4.di(<4 x i64>, i8*, <4 x i64>, i8, i32)			declare <4 x i64> @llvm.x86.avx512.gather3div4.di(<4 x i64>, i8*, <4 x i64>, i8, i32)

	define <4 x i64>@test_int_x86_avx512_gather3div4_di(<4 x i64> %x0, i8* %x1, <4 x i64> %x2, i8 %x3) {			define <4 x i64>@test_int_x86_avx512_gather3div4_di(<4 x i64> %x0, i8* %x1, <4 x i64> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_gather3div4_di:			; CHECK-LABEL: test_int_x86_avx512_gather3div4_di:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %esi, %k1			; CHECK-NEXT: kmovb %esi, %k1
	; CHECK-NEXT: vmovdqa64 %ymm0, %ymm2			; CHECK-NEXT: vmovdqa %ymm0, %ymm2
	; CHECK-NEXT: vpgatherqq (%rdi,%ymm1,8), %ymm2 {%k1}			; CHECK-NEXT: vpgatherqq (%rdi,%ymm1,8), %ymm2 {%k1}
	; CHECK-NEXT: kxnorw %k0, %k0, %k1			; CHECK-NEXT: kxnorw %k0, %k0, %k1
	; CHECK-NEXT: vpgatherqq (%rdi,%ymm1,8), %ymm0 {%k1}			; CHECK-NEXT: vpgatherqq (%rdi,%ymm1,8), %ymm0 {%k1}
	; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0			; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <4 x i64> @llvm.x86.avx512.gather3div4.di(<4 x i64> %x0, i8* %x1, <4 x i64> %x2, i8 %x3, i32 8)			%res = call <4 x i64> @llvm.x86.avx512.gather3div4.di(<4 x i64> %x0, i8* %x1, <4 x i64> %x2, i8 %x3, i32 8)
	%res1 = call <4 x i64> @llvm.x86.avx512.gather3div4.di(<4 x i64> %x0, i8* %x1, <4 x i64> %x2, i8 -1, i32 8)			%res1 = call <4 x i64> @llvm.x86.avx512.gather3div4.di(<4 x i64> %x0, i8* %x1, <4 x i64> %x2, i8 -1, i32 8)
	%res2 = add <4 x i64> %res, %res1			%res2 = add <4 x i64> %res, %res1

	declare <4 x i32> @llvm.x86.avx512.gather3div4.si(<4 x i32>, i8*, <2 x i64>, i8, i32)			declare <4 x i32> @llvm.x86.avx512.gather3div4.si(<4 x i32>, i8*, <2 x i64>, i8, i32)

	define <4 x i32>@test_int_x86_avx512_gather3div4_si(<4 x i32> %x0, i8* %x1, <2 x i64> %x2, i8 %x3) {			define <4 x i32>@test_int_x86_avx512_gather3div4_si(<4 x i32> %x0, i8* %x1, <2 x i64> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_gather3div4_si:			; CHECK-LABEL: test_int_x86_avx512_gather3div4_si:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %esi, %k1			; CHECK-NEXT: kmovb %esi, %k1
	; CHECK-NEXT: kxnorw %k0, %k0, %k2			; CHECK-NEXT: kxnorw %k0, %k0, %k2
	; CHECK-NEXT: vmovdqa64 %xmm0, %xmm2			; CHECK-NEXT: vmovdqa %xmm0, %xmm2
	; CHECK-NEXT: vpgatherqd (%rdi,%xmm1,4), %xmm2 {%k2}			; CHECK-NEXT: vpgatherqd (%rdi,%xmm1,4), %xmm2 {%k2}
	; CHECK-NEXT: vpgatherqd (%rdi,%xmm1,4), %xmm0 {%k1}			; CHECK-NEXT: vpgatherqd (%rdi,%xmm1,4), %xmm0 {%k1}
	; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0			; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <4 x i32> @llvm.x86.avx512.gather3div4.si(<4 x i32> %x0, i8* %x1, <2 x i64> %x2, i8 -1, i32 4)			%res = call <4 x i32> @llvm.x86.avx512.gather3div4.si(<4 x i32> %x0, i8* %x1, <2 x i64> %x2, i8 -1, i32 4)
	%res1 = call <4 x i32> @llvm.x86.avx512.gather3div4.si(<4 x i32> %x0, i8* %x1, <2 x i64> %x2, i8 %x3, i32 4)			%res1 = call <4 x i32> @llvm.x86.avx512.gather3div4.si(<4 x i32> %x0, i8* %x1, <2 x i64> %x2, i8 %x3, i32 4)
	%res2 = add <4 x i32> %res, %res1			%res2 = add <4 x i32> %res, %res1
	ret <4 x i32> %res2			ret <4 x i32> %res2
	}			}

	declare <4 x i32> @llvm.x86.avx512.gather3div8.si(<4 x i32>, i8*, <4 x i64>, i8, i32)			declare <4 x i32> @llvm.x86.avx512.gather3div8.si(<4 x i32>, i8*, <4 x i64>, i8, i32)

	define <4 x i32>@test_int_x86_avx512_gather3div8_si(<4 x i32> %x0, i8* %x1, <4 x i64> %x2, i8 %x3) {			define <4 x i32>@test_int_x86_avx512_gather3div8_si(<4 x i32> %x0, i8* %x1, <4 x i64> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_gather3div8_si:			; CHECK-LABEL: test_int_x86_avx512_gather3div8_si:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %esi, %k1			; CHECK-NEXT: kmovb %esi, %k1
	; CHECK-NEXT: vmovdqa64 %xmm0, %xmm2			; CHECK-NEXT: vmovdqa %xmm0, %xmm2
	; CHECK-NEXT: kmovq %k1, %k2			; CHECK-NEXT: kmovq %k1, %k2
	; CHECK-NEXT: vpgatherqd (%rdi,%ymm1,4), %xmm2 {%k2}			; CHECK-NEXT: vpgatherqd (%rdi,%ymm1,4), %xmm2 {%k2}
	; CHECK-NEXT: vpgatherqd (%rdi,%ymm1,2), %xmm0 {%k1}			; CHECK-NEXT: vpgatherqd (%rdi,%ymm1,2), %xmm0 {%k1}
	; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0			; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <4 x i32> @llvm.x86.avx512.gather3div8.si(<4 x i32> %x0, i8* %x1, <4 x i64> %x2, i8 %x3, i32 4)			%res = call <4 x i32> @llvm.x86.avx512.gather3div8.si(<4 x i32> %x0, i8* %x1, <4 x i64> %x2, i8 %x3, i32 4)
	%res1 = call <4 x i32> @llvm.x86.avx512.gather3div8.si(<4 x i32> %x0, i8* %x1, <4 x i64> %x2, i8 %x3, i32 2)			%res1 = call <4 x i32> @llvm.x86.avx512.gather3div8.si(<4 x i32> %x0, i8* %x1, <4 x i64> %x2, i8 %x3, i32 2)
	%res2 = add <4 x i32> %res, %res1			%res2 = add <4 x i32> %res, %res1

	declare <4 x i32> @llvm.x86.avx512.gather3siv4.si(<4 x i32>, i8*, <4 x i32>, i8, i32)			declare <4 x i32> @llvm.x86.avx512.gather3siv4.si(<4 x i32>, i8*, <4 x i32>, i8, i32)

	define <4 x i32>@test_int_x86_avx512_gather3siv4_si(<4 x i32> %x0, i8* %x1, <4 x i32> %x2, i8 %x3) {			define <4 x i32>@test_int_x86_avx512_gather3siv4_si(<4 x i32> %x0, i8* %x1, <4 x i32> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_gather3siv4_si:			; CHECK-LABEL: test_int_x86_avx512_gather3siv4_si:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %esi, %k1			; CHECK-NEXT: kmovb %esi, %k1
	; CHECK-NEXT: kxnorw %k0, %k0, %k2			; CHECK-NEXT: kxnorw %k0, %k0, %k2
	; CHECK-NEXT: vmovdqa64 %xmm0, %xmm2			; CHECK-NEXT: vmovdqa %xmm0, %xmm2
	; CHECK-NEXT: vpgatherdd (%rdi,%xmm1,4), %xmm2 {%k2}			; CHECK-NEXT: vpgatherdd (%rdi,%xmm1,4), %xmm2 {%k2}
	; CHECK-NEXT: vpgatherdd (%rdi,%xmm1,2), %xmm0 {%k1}			; CHECK-NEXT: vpgatherdd (%rdi,%xmm1,2), %xmm0 {%k1}
	; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0			; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <4 x i32> @llvm.x86.avx512.gather3siv4.si(<4 x i32> %x0, i8* %x1, <4 x i32> %x2, i8 -1, i32 4)			%res = call <4 x i32> @llvm.x86.avx512.gather3siv4.si(<4 x i32> %x0, i8* %x1, <4 x i32> %x2, i8 -1, i32 4)
	%res1 = call <4 x i32> @llvm.x86.avx512.gather3siv4.si(<4 x i32> %x0, i8* %x1, <4 x i32> %x2, i8 %x3, i32 2)			%res1 = call <4 x i32> @llvm.x86.avx512.gather3siv4.si(<4 x i32> %x0, i8* %x1, <4 x i32> %x2, i8 %x3, i32 2)
	%res2 = add <4 x i32> %res, %res1			%res2 = add <4 x i32> %res, %res1
	ret <4 x i32> %res2			ret <4 x i32> %res2
	}			}

	declare <8 x i32> @llvm.x86.avx512.gather3siv8.si(<8 x i32>, i8*, <8 x i32>, i8, i32)			declare <8 x i32> @llvm.x86.avx512.gather3siv8.si(<8 x i32>, i8*, <8 x i32>, i8, i32)

	define <8 x i32>@test_int_x86_avx512_gather3siv8_si(<8 x i32> %x0, i8* %x1, <8 x i32> %x2, i8 %x3) {			define <8 x i32>@test_int_x86_avx512_gather3siv8_si(<8 x i32> %x0, i8* %x1, <8 x i32> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_gather3siv8_si:			; CHECK-LABEL: test_int_x86_avx512_gather3siv8_si:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %esi, %k1			; CHECK-NEXT: kmovb %esi, %k1
	; CHECK-NEXT: vmovdqa64 %ymm0, %ymm2			; CHECK-NEXT: vmovdqa %ymm0, %ymm2
	; CHECK-NEXT: kmovq %k1, %k2			; CHECK-NEXT: kmovq %k1, %k2
	; CHECK-NEXT: vpgatherdd (%rdi,%ymm1,4), %ymm2 {%k2}			; CHECK-NEXT: vpgatherdd (%rdi,%ymm1,4), %ymm2 {%k2}
	; CHECK-NEXT: vpgatherdd (%rdi,%ymm1,2), %ymm0 {%k1}			; CHECK-NEXT: vpgatherdd (%rdi,%ymm1,2), %ymm0 {%k1}
	; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0			; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <8 x i32> @llvm.x86.avx512.gather3siv8.si(<8 x i32> %x0, i8* %x1, <8 x i32> %x2, i8 %x3, i32 4)			%res = call <8 x i32> @llvm.x86.avx512.gather3siv8.si(<8 x i32> %x0, i8* %x1, <8 x i32> %x2, i8 %x3, i32 4)
	%res1 = call <8 x i32> @llvm.x86.avx512.gather3siv8.si(<8 x i32> %x0, i8* %x1, <8 x i32> %x2, i8 %x3, i32 2)			%res1 = call <8 x i32> @llvm.x86.avx512.gather3siv8.si(<8 x i32> %x0, i8* %x1, <8 x i32> %x2, i8 %x3, i32 2)
	%res2 = add <8 x i32> %res, %res1			%res2 = add <8 x i32> %res, %res1

llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll

	; SKX-LABEL: test16:			; SKX-LABEL: test16:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: kmovq %rdi, %k0			; SKX-NEXT: kmovq %rdi, %k0
	; SKX-NEXT: kxnorw %k0, %k0, %k1			; SKX-NEXT: kxnorw %k0, %k0, %k1
	; SKX-NEXT: kshiftrw $15, %k1, %k1			; SKX-NEXT: kshiftrw $15, %k1, %k1
	; SKX-NEXT: vpmovm2b %k1, %zmm0			; SKX-NEXT: vpmovm2b %k1, %zmm0
	; SKX-NEXT: vpsllq $40, %xmm0, %xmm0			; SKX-NEXT: vpsllq $40, %xmm0, %xmm0
	; SKX-NEXT: vpmovm2b %k0, %zmm1			; SKX-NEXT: vpmovm2b %k0, %zmm1
	; SKX-NEXT: vmovdqu8 {{.*#+}} ymm2 = [255,255,255,255,255,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]			; SKX-NEXT: vmovdqu {{.*#+}} ymm2 = [255,255,255,255,255,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
	; SKX-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; SKX-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; SKX-NEXT: vextracti64x4 $1, %zmm1, %ymm1			; SKX-NEXT: vextracti64x4 $1, %zmm1, %ymm1
	; SKX-NEXT: vinserti64x4 $1, %ymm1, %zmm0, %zmm0			; SKX-NEXT: vinserti64x4 $1, %ymm1, %zmm0, %zmm0
	; SKX-NEXT: vpmovb2m %zmm0, %k0			; SKX-NEXT: vpmovb2m %zmm0, %k0
	; SKX-NEXT: vpmovm2b %k0, %zmm0			; SKX-NEXT: vpmovm2b %k0, %zmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%a = bitcast i64 %x to <64 x i1>			%a = bitcast i64 %x to <64 x i1>
	%b = insertelement <64 x i1>%a, i1 true, i32 5			%b = insertelement <64 x i1>%a, i1 true, i32 5
	; SKX-NEXT: kmovq %rdi, %k0			; SKX-NEXT: kmovq %rdi, %k0
	; SKX-NEXT: cmpl %edx, %esi			; SKX-NEXT: cmpl %edx, %esi
	; SKX-NEXT: setg %al			; SKX-NEXT: setg %al
	; SKX-NEXT: andl $1, %eax			; SKX-NEXT: andl $1, %eax
	; SKX-NEXT: kmovw %eax, %k1			; SKX-NEXT: kmovw %eax, %k1
	; SKX-NEXT: vpmovm2b %k1, %zmm0			; SKX-NEXT: vpmovm2b %k1, %zmm0
	; SKX-NEXT: vpsllq $40, %xmm0, %xmm0			; SKX-NEXT: vpsllq $40, %xmm0, %xmm0
	; SKX-NEXT: vpmovm2b %k0, %zmm1			; SKX-NEXT: vpmovm2b %k0, %zmm1
	; SKX-NEXT: vmovdqu8 {{.*#+}} ymm2 = [255,255,255,255,255,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]			; SKX-NEXT: vmovdqu {{.*#+}} ymm2 = [255,255,255,255,255,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
	; SKX-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; SKX-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; SKX-NEXT: vextracti64x4 $1, %zmm1, %ymm1			; SKX-NEXT: vextracti64x4 $1, %zmm1, %ymm1
	; SKX-NEXT: vinserti64x4 $1, %ymm1, %zmm0, %zmm0			; SKX-NEXT: vinserti64x4 $1, %ymm1, %zmm0, %zmm0
	; SKX-NEXT: vpmovb2m %zmm0, %k0			; SKX-NEXT: vpmovb2m %zmm0, %k0
	; SKX-NEXT: vpmovm2b %k0, %zmm0			; SKX-NEXT: vpmovm2b %k0, %zmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%a = bitcast i64 %x to <64 x i1>			%a = bitcast i64 %x to <64 x i1>
	%b = icmp sgt i32 %y, %z			%b = icmp sgt i32 %y, %z

llvm/trunk/test/CodeGen/X86/avx512-masked_memop-16-8.ll

	declare <16 x i8> @llvm.masked.load.v16i8.p0v16i8(<16 x i8>*, i32, <16 x i1>, <16 x i8>)			declare <16 x i8> @llvm.masked.load.v16i8.p0v16i8(<16 x i8>*, i32, <16 x i1>, <16 x i8>)

	define <32 x i8> @test_mask_load_32xi8(<32 x i1> %mask, <32 x i8>* %addr, <32 x i8> %val) {			define <32 x i8> @test_mask_load_32xi8(<32 x i1> %mask, <32 x i8>* %addr, <32 x i8> %val) {
	; CHECK-LABEL: test_mask_load_32xi8:			; CHECK-LABEL: test_mask_load_32xi8:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpsllw $7, %ymm0, %ymm0			; CHECK-NEXT: vpsllw $7, %ymm0, %ymm0
	; CHECK-NEXT: vpmovb2m %ymm0, %k1			; CHECK-NEXT: vpmovb2m %ymm0, %k1
	; CHECK-NEXT: vmovdqu8 (%rdi), %ymm1 {%k1}			; CHECK-NEXT: vmovdqu8 (%rdi), %ymm1 {%k1}
	; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0			; CHECK-NEXT: vmovdqa %ymm1, %ymm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%res = call <32 x i8> @llvm.masked.load.v32i8.p0v32i8(<32 x i8>* %addr, i32 4, <32 x i1>%mask, <32 x i8> %val)			%res = call <32 x i8> @llvm.masked.load.v32i8.p0v32i8(<32 x i8>* %addr, i32 4, <32 x i1>%mask, <32 x i8> %val)
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}
	declare <32 x i8> @llvm.masked.load.v32i8.p0v32i8(<32 x i8>*, i32, <32 x i1>, <32 x i8>)			declare <32 x i8> @llvm.masked.load.v32i8.p0v32i8(<32 x i8>*, i32, <32 x i1>, <32 x i8>)

	define <64 x i8> @test_mask_load_64xi8(<64 x i1> %mask, <64 x i8>* %addr, <64 x i8> %val) {			define <64 x i8> @test_mask_load_64xi8(<64 x i1> %mask, <64 x i8>* %addr, <64 x i8> %val) {
	; CHECK-LABEL: test_mask_load_64xi8:			; CHECK-LABEL: test_mask_load_64xi8:

llvm/trunk/test/CodeGen/X86/avx512-mov.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -march=x86-64 -mtriple=x86_64-apple-darwin -mcpu=knl --show-mc-encoding| FileCheck %s			; RUN: llc < %s -march=x86-64 -mtriple=x86_64-apple-darwin -mcpu=knl --show-mc-encoding| FileCheck %s

	define i32 @test1(float %x) {			define i32 @test1(float %x) {
	; CHECK-LABEL: test1:			; CHECK-LABEL: test1:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovd %xmm0, %eax ## encoding: [0x62,0xf1,0x7d,0x08,0x7e,0xc0]			; CHECK-NEXT: vmovd %xmm0, %eax ## encoding: [0x62,0xf1,0x7d,0x08,0x7e,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = bitcast float %x to i32			%res = bitcast float %x to i32
	ret i32 %res			ret i32 %res
	}			}

	define <4 x i32> @test2(i32 %x) {			define <4 x i32> @test2(i32 %x) {
	; CHECK-LABEL: test2:			; CHECK-LABEL: test2:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovd %edi, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc7]			; CHECK-NEXT: vmovd %edi, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc7]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = insertelement <4 x i32>undef, i32 %x, i32 0			%res = insertelement <4 x i32>undef, i32 %x, i32 0
	ret <4 x i32>%res			ret <4 x i32>%res
	}			}

	define <2 x i64> @test3(i64 %x) {			define <2 x i64> @test3(i64 %x) {
	; CHECK-LABEL: test3:			; CHECK-LABEL: test3:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovq %rdi, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6e,0xc7]			; CHECK-NEXT: vmovq %rdi, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe1,0xf9,0x6e,0xc7]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = insertelement <2 x i64>undef, i64 %x, i32 0			%res = insertelement <2 x i64>undef, i64 %x, i32 0
	ret <2 x i64>%res			ret <2 x i64>%res
	}			}
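
The `test2` and `test3` check lines show that the saving depends on which VEX form is usable: `vmovd` shrinks to the two-byte `0xc5` prefix, while `vmovq` needs the W bit and therefore the three-byte `0xc4` prefix. The Python sketch below simply counts bytes for those two encodings, copied from the check lines; the names `CASES`, `vex_prefix_len`, and `report` are illustrative only.

```python
# (EVEX bytes, VEX bytes) copied from the test2 and test3 check lines.
CASES = {
    "vmovd %edi, %xmm0": (
        [0x62, 0xF1, 0x7D, 0x08, 0x6E, 0xC7],
        [0xC5, 0xF9, 0x6E, 0xC7],
    ),
    "vmovq %rdi, %xmm0": (
        [0x62, 0xF1, 0xFD, 0x08, 0x6E, 0xC7],
        [0xC4, 0xE1, 0xF9, 0x6E, 0xC7],
    ),
}


def vex_prefix_len(encoding):
    """Length of the VEX prefix, judged by its first byte."""
    if encoding[0] == 0xC5:
        return 2
    if encoding[0] == 0xC4:
        return 3
    raise ValueError("not a VEX-encoded instruction")


def report():
    """Map each instruction to (VEX prefix length, bytes saved vs. EVEX)."""
    return {
        asm: (vex_prefix_len(vex), len(evex) - len(vex))
        for asm, (evex, vex) in CASES.items()
    }
```

Under these inputs `vmovd` saves two bytes and `vmovq` saves one, which is consistent with the "~2 bytes" estimate in the summary being an upper bound for this pair.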

	define <4 x i32> @test4(i32* %x) {			define <4 x i32> @test4(i32* %x) {
	; CHECK-LABEL: test4:			; CHECK-LABEL: test4:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovss (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7e,0x08,0x10,0x07]			; CHECK-NEXT: vmovss (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x10,0x07]
	; CHECK-NEXT: ## xmm0 = mem[0],zero,zero,zero			; CHECK-NEXT: ## xmm0 = mem[0],zero,zero,zero
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%y = load i32, i32* %x			%y = load i32, i32* %x
	%res = insertelement <4 x i32>undef, i32 %y, i32 0			%res = insertelement <4 x i32>undef, i32 %y, i32 0
	ret <4 x i32>%res			ret <4 x i32>%res
	}			}

	define void @test5(float %x, float* %y) {			define void @test5(float %x, float* %y) {
	; CHECK-LABEL: test5:			; CHECK-LABEL: test5:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovss %xmm0, (%rdi) ## encoding: [0x62,0xf1,0x7e,0x08,0x11,0x07]			; CHECK-NEXT: vmovss %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x11,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	store float %x, float* %y, align 4			store float %x, float* %y, align 4
	ret void			ret void
	}			}

	define void @test6(double %x, double* %y) {			define void @test6(double %x, double* %y) {
	; CHECK-LABEL: test6:			; CHECK-LABEL: test6:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovsd %xmm0, (%rdi) ## encoding: [0x62,0xf1,0xff,0x08,0x11,0x07]			; CHECK-NEXT: vmovsd %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x11,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	store double %x, double* %y, align 8			store double %x, double* %y, align 8
	ret void			ret void
	}			}
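
The compressed `vmovss` and `vmovsd` stores in `test5` and `test6` differ only in the second prefix byte (`0xfa` versus `0xfb`). As a reading aid, here is a small Python sketch that unpacks the second byte of a two-byte VEX prefix, assuming the standard layout of inverted R, inverted vvvv, L, and pp. The function name `decode_vex2` and the returned dictionary keys are illustrative, not part of the patch.

```python
# pp field -> implied legacy prefix byte (None means no prefix).
PP_TO_LEGACY_PREFIX = {0b00: None, 0b01: 0x66, 0b10: 0xF3, 0b11: 0xF2}


def decode_vex2(encoding):
    """Decode the fields of a two-byte (0xC5) VEX prefix."""
    if encoding[0] != 0xC5:
        raise ValueError("expected a two-byte VEX prefix (0xC5)")
    b = encoding[1]
    return {
        "R": ((b >> 7) & 1) ^ 1,        # stored inverted
        "vvvv": (~(b >> 3)) & 0xF,      # stored inverted; 1111 also marks "unused"
        "L": (b >> 2) & 1,              # 0 = 128-bit / scalar, 1 = 256-bit
        "implied_prefix": PP_TO_LEGACY_PREFIX[b & 0b11],
        "opcode": encoding[2],
    }


vmovss_store = decode_vex2([0xC5, 0xFA, 0x11, 0x07])  # test5
vmovsd_store = decode_vex2([0xC5, 0xFB, 0x11, 0x07])  # test6
```

Both stores share opcode `0x11`; only the implied prefix differs (`0xF3` for the single-precision form, `0xF2` for the double-precision form).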

	define float @test7(i32* %x) {			define float @test7(i32* %x) {
	; CHECK-LABEL: test7:			; CHECK-LABEL: test7:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovss (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7e,0x08,0x10,0x07]			; CHECK-NEXT: vmovss (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x10,0x07]
	; CHECK-NEXT: ## xmm0 = mem[0],zero,zero,zero			; CHECK-NEXT: ## xmm0 = mem[0],zero,zero,zero
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%y = load i32, i32* %x			%y = load i32, i32* %x
	%res = bitcast i32 %y to float			%res = bitcast i32 %y to float
	ret float %res			ret float %res
	}			}

	define i32 @test8(<4 x i32> %x) {			define i32 @test8(<4 x i32> %x) {
	; CHECK-LABEL: test8:			; CHECK-LABEL: test8:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovd %xmm0, %eax ## encoding: [0x62,0xf1,0x7d,0x08,0x7e,0xc0]			; CHECK-NEXT: vmovd %xmm0, %eax ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x7e,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = extractelement <4 x i32> %x, i32 0			%res = extractelement <4 x i32> %x, i32 0
	ret i32 %res			ret i32 %res
	}			}

	define i64 @test9(<2 x i64> %x) {			define i64 @test9(<2 x i64> %x) {
	; CHECK-LABEL: test9:			; CHECK-LABEL: test9:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovq %xmm0, %rax ## encoding: [0x62,0xf1,0xfd,0x08,0x7e,0xc0]			; CHECK-NEXT: vmovq %xmm0, %rax ## EVEX TO VEX Compression encoding: [0xc4,0xe1,0xf9,0x7e,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = extractelement <2 x i64> %x, i32 0			%res = extractelement <2 x i64> %x, i32 0
	ret i64 %res			ret i64 %res
	}			}

	define <4 x i32> @test10(i32* %x) {			define <4 x i32> @test10(i32* %x) {
	; CHECK-LABEL: test10:			; CHECK-LABEL: test10:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovss (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7e,0x08,0x10,0x07]			; CHECK-NEXT: vmovss (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x10,0x07]
	; CHECK-NEXT: ## xmm0 = mem[0],zero,zero,zero			; CHECK-NEXT: ## xmm0 = mem[0],zero,zero,zero
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%y = load i32, i32* %x, align 4			%y = load i32, i32* %x, align 4
	%res = insertelement <4 x i32>zeroinitializer, i32 %y, i32 0			%res = insertelement <4 x i32>zeroinitializer, i32 %y, i32 0
	ret <4 x i32>%res			ret <4 x i32>%res
	}			}

	define <4 x float> @test11(float* %x) {			define <4 x float> @test11(float* %x) {
	; CHECK-LABEL: test11:			; CHECK-LABEL: test11:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovss (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7e,0x08,0x10,0x07]			; CHECK-NEXT: vmovss (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x10,0x07]
	; CHECK-NEXT: ## xmm0 = mem[0],zero,zero,zero			; CHECK-NEXT: ## xmm0 = mem[0],zero,zero,zero
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%y = load float, float* %x, align 4			%y = load float, float* %x, align 4
	%res = insertelement <4 x float>zeroinitializer, float %y, i32 0			%res = insertelement <4 x float>zeroinitializer, float %y, i32 0
	ret <4 x float>%res			ret <4 x float>%res
	}			}

	define <2 x double> @test12(double* %x) {			define <2 x double> @test12(double* %x) {
	; CHECK-LABEL: test12:			; CHECK-LABEL: test12:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovsd (%rdi), %xmm0 ## encoding: [0x62,0xf1,0xff,0x08,0x10,0x07]			; CHECK-NEXT: vmovsd (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x10,0x07]
	; CHECK-NEXT: ## xmm0 = mem[0],zero			; CHECK-NEXT: ## xmm0 = mem[0],zero
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%y = load double, double* %x, align 8			%y = load double, double* %x, align 8
	%res = insertelement <2 x double>zeroinitializer, double %y, i32 0			%res = insertelement <2 x double>zeroinitializer, double %y, i32 0
	ret <2 x double>%res			ret <2 x double>%res
	}			}

	define <2 x i64> @test13(i64 %x) {			define <2 x i64> @test13(i64 %x) {
	; CHECK-LABEL: test13:			; CHECK-LABEL: test13:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovq %rdi, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6e,0xc7]			; CHECK-NEXT: vmovq %rdi, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe1,0xf9,0x6e,0xc7]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = insertelement <2 x i64>zeroinitializer, i64 %x, i32 0			%res = insertelement <2 x i64>zeroinitializer, i64 %x, i32 0
	ret <2 x i64>%res			ret <2 x i64>%res
	}			}

	define <4 x i32> @test14(i32 %x) {			define <4 x i32> @test14(i32 %x) {
	; CHECK-LABEL: test14:			; CHECK-LABEL: test14:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovd %edi, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc7]			; CHECK-NEXT: vmovd %edi, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc7]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = insertelement <4 x i32>zeroinitializer, i32 %x, i32 0			%res = insertelement <4 x i32>zeroinitializer, i32 %x, i32 0
	ret <4 x i32>%res			ret <4 x i32>%res
	}			}

	define <4 x i32> @test15(i32* %x) {			define <4 x i32> @test15(i32* %x) {
	; CHECK-LABEL: test15:			; CHECK-LABEL: test15:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovss (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7e,0x08,0x10,0x07]			; CHECK-NEXT: vmovss (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x10,0x07]
	; CHECK-NEXT: ## xmm0 = mem[0],zero,zero,zero			; CHECK-NEXT: ## xmm0 = mem[0],zero,zero,zero
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%y = load i32, i32* %x, align 4			%y = load i32, i32* %x, align 4
	%res = insertelement <4 x i32>zeroinitializer, i32 %y, i32 0			%res = insertelement <4 x i32>zeroinitializer, i32 %y, i32 0
	ret <4 x i32>%res			ret <4 x i32>%res
	}			}

	define <16 x i32> @test16(i8 * %addr) {			define <16 x i32> @test16(i8 * %addr) {

llvm/trunk/test/CodeGen/X86/avx512-scalar.ll

	; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl --show-mc-encoding | FileCheck %s --check-prefix AVX512 --check-prefix AVX512-KNL			; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl --show-mc-encoding | FileCheck %s --check-prefix AVX512 --check-prefix AVX512-KNL
	; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=skx --show-mc-encoding | FileCheck %s --check-prefix AVX512 --check-prefix AVX512-SKX			; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=skx --show-mc-encoding | FileCheck %s --check-prefix AVX512 --check-prefix AVX512-SKX
	; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=corei7-avx --show-mc-encoding | FileCheck %s --check-prefix AVX			; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=corei7-avx --show-mc-encoding | FileCheck %s --check-prefix AVX

	; AVX512-LABEL: @test_fdiv			; AVX512-LABEL: @test_fdiv
	; AVX512: vdivss %xmm{{.*}} ## encoding: [0x62			; AVX512: vdivss %xmm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	; AVX-LABEL: @test_fdiv			; AVX-LABEL: @test_fdiv
	; AVX: vdivss %xmm{{.*}} ## encoding: [0xc5			; AVX: vdivss %xmm{{.*}} ## encoding: [0xc5

	define float @test_fdiv(float %a, float %b) {			define float @test_fdiv(float %a, float %b) {
	%c = fdiv float %a, %b			%c = fdiv float %a, %b
	ret float %c			ret float %c
	}			}

	; AVX512-LABEL: @test_fsub			; AVX512-LABEL: @test_fsub
	; AVX512: vsubss %xmm{{.*}} ## encoding: [0x62			; AVX512: vsubss %xmm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	; AVX-LABEL: @test_fsub			; AVX-LABEL: @test_fsub
	; AVX: vsubss %xmm{{.*}} ## encoding: [0xc5			; AVX: vsubss %xmm{{.*}} ## encoding: [0xc5

	define float @test_fsub(float %a, float %b) {			define float @test_fsub(float %a, float %b) {
	%c = fsub float %a, %b			%c = fsub float %a, %b
	ret float %c			ret float %c
	}			}

	; AVX512-LABEL: @test_fadd			; AVX512-LABEL: @test_fadd
	; AVX512: vaddsd %xmm{{.*}} ## encoding: [0x62			; AVX512: vaddsd %xmm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	; AVX-LABEL: @test_fadd			; AVX-LABEL: @test_fadd
	; AVX: vaddsd %xmm{{.*}} ## encoding: [0xc5			; AVX: vaddsd %xmm{{.*}} ## encoding: [0xc5

	define double @test_fadd(double %a, double %b) {			define double @test_fadd(double %a, double %b) {
	%c = fadd double %a, %b			%c = fadd double %a, %b
	ret double %c			ret double %c
	}			}

	; AVX: vroundss			; AVX: vroundss

	define float @test_trunc(float %a) {			define float @test_trunc(float %a) {
	%c = call float @llvm.trunc.f32(float %a)			%c = call float @llvm.trunc.f32(float %a)
	ret float %c			ret float %c
	}			}

	; AVX512-LABEL: @test_sqrt			; AVX512-LABEL: @test_sqrt
	; AVX512: vsqrtsd %xmm{{.*}} ## encoding: [0x62			; AVX512: vsqrtsd %xmm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	; AVX-LABEL: @test_sqrt			; AVX-LABEL: @test_sqrt
	; AVX: vsqrtsd %xmm{{.*}} ## encoding: [0xc5			; AVX: vsqrtsd %xmm{{.*}} ## encoding: [0xc5

	define double @test_sqrt(double %a) {			define double @test_sqrt(double %a) {
	%c = call double @llvm.sqrt.f64(double %a)			%c = call double @llvm.sqrt.f64(double %a)
	ret double %c			ret double %c
	}			}

	; AVX512-LABEL: @test_rint			; AVX512-LABEL: @test_rint
	; AVX512: vrndscaless			; AVX512: vrndscaless
	; AVX-LABEL: @test_rint			; AVX-LABEL: @test_rint
	; AVX: vroundss			; AVX: vroundss

	define float @test_rint(float %a) {			define float @test_rint(float %a) {
	%c = call float @llvm.rint.f32(float %a)			%c = call float @llvm.rint.f32(float %a)
	ret float %c			ret float %c
	}			}

	; AVX512-LABEL: @test_vmax			; AVX512-LABEL: @test_vmax
	; AVX512: vmaxss %xmm{{.*}} ## encoding: [0x62			; AVX512: vmaxss %xmm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	; AVX-LABEL: @test_vmax			; AVX-LABEL: @test_vmax
	; AVX: vmaxss %xmm{{.*}} ## encoding: [0xc5			; AVX: vmaxss %xmm{{.*}} ## encoding: [0xc5

	define float @test_vmax(float %i, float %j) {			define float @test_vmax(float %i, float %j) {
	%cmp_res = fcmp ogt float %i, %j			%cmp_res = fcmp ogt float %i, %j
	%max = select i1 %cmp_res, float %i, float %j			%max = select i1 %cmp_res, float %i, float %j
	ret float %max			ret float %max
	}			}

	; AVX512-LABEL: @test_mov			; AVX512-LABEL: @test_mov
	; AVX512: vcmpltss %xmm{{.*}} ## encoding: [0x62			; AVX512: vcmpltss %xmm{{.*}} ## encoding: [0x62
	; AVX-LABEL: @test_mov			; AVX-LABEL: @test_mov
	; AVX: vcmpltss %xmm{{.*}} ## encoding: [0xc5			; AVX: vcmpltss %xmm{{.*}} ## encoding: [0xc5

	define float @test_mov(float %a, float %b, float %i, float %j) {			define float @test_mov(float %a, float %b, float %i, float %j) {
	%cmp_res = fcmp ogt float %i, %j			%cmp_res = fcmp ogt float %i, %j
	%max = select i1 %cmp_res, float %b, float %a			%max = select i1 %cmp_res, float %b, float %a
	ret float %max			ret float %max
	}			}

	; AVX512-SKX-LABEL: @zero_float			; AVX512-SKX-LABEL: @zero_float
	; AVX512-SKX: vxorps %xmm{{.}}, %xmm{{.}}, %xmm{{.*}} ## encoding: [0x62,			; AVX512-SKX: vxorps %xmm{{.}}, %xmm{{.}}, %xmm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	; AVX512-KNL-LABEL: @zero_float			; AVX512-KNL-LABEL: @zero_float
	; AVX512-KNL: vxorps %xmm{{.}}, %xmm{{.}}, %xmm{{.*}} ## encoding: [0xc5,			; AVX512-KNL: vxorps %xmm{{.}}, %xmm{{.}}, %xmm{{.*}} ## encoding: [0xc5,
	; AVX-LABEL: @zero_float			; AVX-LABEL: @zero_float
	; AVX: vxorps %xmm{{.}}, %xmm{{.}}, %xmm{{.*}} ## encoding: [0xc5,			; AVX: vxorps %xmm{{.}}, %xmm{{.}}, %xmm{{.*}} ## encoding: [0xc5,

	define float @zero_float(float %a) {			define float @zero_float(float %a) {
	%b = fadd float %a, 0.0			%b = fadd float %a, 0.0
	ret float %b			ret float %b
	}			}

	; AVX512-SKX-LABEL: @zero_double			; AVX512-SKX-LABEL: @zero_double
	; AVX512-SKX: vxorpd %xmm{{.}}, %xmm{{.}}, %xmm{{.*}} ## encoding: [0x62,			; AVX512-SKX: vxorpd %xmm{{.}}, %xmm{{.}}, %xmm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	; AVX512-KNL-LABEL: @zero_double			; AVX512-KNL-LABEL: @zero_double
	; AVX512-KNL: vxorpd %xmm{{.}}, %xmm{{.}}, %xmm{{.*}} ## encoding: [0xc5,			; AVX512-KNL: vxorpd %xmm{{.}}, %xmm{{.}}, %xmm{{.*}} ## encoding: [0xc5,
	; AVX-LABEL: @zero_double			; AVX-LABEL: @zero_double
	; AVX: vxorpd %xmm{{.}}, %xmm{{.}}, %xmm{{.*}} ## encoding: [0xc5,			; AVX: vxorpd %xmm{{.}}, %xmm{{.}}, %xmm{{.*}} ## encoding: [0xc5,

	define double @zero_double(double %a) {			define double @zero_double(double %a) {
	%b = fadd double %a, 0.0			%b = fadd double %a, 0.0
	ret double %b			ret double %b
	}			}

llvm/trunk/test/CodeGen/X86/avx512-vbroadcasti128.ll

	%2 = shufflevector <16 x i8> %1, <16 x i8> undef, <64 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>			%2 = shufflevector <16 x i8> %1, <16 x i8> undef, <64 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
	%3 = add <64 x i8> %2, <i8 1, i8 2, i8 3, i8 4, i8 5, i8 6, i8 7, i8 8, i8 9, i8 10, i8 11, i8 12, i8 13, i8 14, i8 15, i8 16, i8 17, i8 18, i8 19, i8 20, i8 21, i8 22, i8 23, i8 24, i8 25, i8 26, i8 27, i8 28, i8 29, i8 30, i8 31, i8 32, i8 33, i8 34, i8 35, i8 36, i8 37, i8 38, i8 39, i8 40, i8 41, i8 42, i8 43, i8 44, i8 45, i8 46, i8 47, i8 48, i8 49, i8 50, i8 51, i8 52, i8 53, i8 54, i8 55, i8 56, i8 57, i8 58, i8 59, i8 60, i8 61, i8 62, i8 63, i8 64>			%3 = add <64 x i8> %2, <i8 1, i8 2, i8 3, i8 4, i8 5, i8 6, i8 7, i8 8, i8 9, i8 10, i8 11, i8 12, i8 13, i8 14, i8 15, i8 16, i8 17, i8 18, i8 19, i8 20, i8 21, i8 22, i8 23, i8 24, i8 25, i8 26, i8 27, i8 28, i8 29, i8 30, i8 31, i8 32, i8 33, i8 34, i8 35, i8 36, i8 37, i8 38, i8 39, i8 40, i8 41, i8 42, i8 43, i8 44, i8 45, i8 46, i8 47, i8 48, i8 49, i8 50, i8 51, i8 52, i8 53, i8 54, i8 55, i8 56, i8 57, i8 58, i8 59, i8 60, i8 61, i8 62, i8 63, i8 64>
	ret <64 x i8> %3			ret <64 x i8> %3
	}			}

	define <8 x i32> @PR29088(<4 x i32>* %p0, <8 x float>* %p1) {			define <8 x i32> @PR29088(<4 x i32>* %p0, <8 x float>* %p1) {
	; X64-AVX512VL-LABEL: PR29088:			; X64-AVX512VL-LABEL: PR29088:
	; X64-AVX512VL: ## BB#0:			; X64-AVX512VL: ## BB#0:
	; X64-AVX512VL-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512VL-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512VL-NEXT: vpxord %ymm1, %ymm1, %ymm1			; X64-AVX512VL-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; X64-AVX512VL-NEXT: vmovdqa32 %ymm1, (%rsi)			; X64-AVX512VL-NEXT: vmovdqa %ymm1, (%rsi)
	; X64-AVX512VL-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512VL-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512VL-NEXT: retq			; X64-AVX512VL-NEXT: retq
	;			;
	; X64-AVX512BWVL-LABEL: PR29088:			; X64-AVX512BWVL-LABEL: PR29088:
	; X64-AVX512BWVL: ## BB#0:			; X64-AVX512BWVL: ## BB#0:
	; X64-AVX512BWVL-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512BWVL-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512BWVL-NEXT: vpxord %ymm1, %ymm1, %ymm1			; X64-AVX512BWVL-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; X64-AVX512BWVL-NEXT: vmovdqa32 %ymm1, (%rsi)			; X64-AVX512BWVL-NEXT: vmovdqa %ymm1, (%rsi)
	; X64-AVX512BWVL-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512BWVL-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512BWVL-NEXT: retq			; X64-AVX512BWVL-NEXT: retq
	;			;
	; X64-AVX512DQVL-LABEL: PR29088:			; X64-AVX512DQVL-LABEL: PR29088:
	; X64-AVX512DQVL: ## BB#0:			; X64-AVX512DQVL: ## BB#0:
	; X64-AVX512DQVL-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512DQVL-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512DQVL-NEXT: vxorps %ymm1, %ymm1, %ymm1			; X64-AVX512DQVL-NEXT: vxorps %ymm1, %ymm1, %ymm1
	; X64-AVX512DQVL-NEXT: vmovaps %ymm1, (%rsi)			; X64-AVX512DQVL-NEXT: vmovaps %ymm1, (%rsi)
	; X64-AVX512DQVL-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512DQVL-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512DQVL-NEXT: retq			; X64-AVX512DQVL-NEXT: retq
	%ld = load <4 x i32>, <4 x i32>* %p0			%ld = load <4 x i32>, <4 x i32>* %p0
	store <8 x float> zeroinitializer, <8 x float>* %p1			store <8 x float> zeroinitializer, <8 x float>* %p1
	%shuf = shufflevector <4 x i32> %ld, <4 x i32> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>			%shuf = shufflevector <4 x i32> %ld, <4 x i32> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
	ret <8 x i32> %shuf			ret <8 x i32> %shuf
	}			}

llvm/trunk/test/CodeGen/X86/avx512-vbroadcasti256.ll

	%2 = shufflevector <8 x i32> %1, <8 x i32> undef, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>			%2 = shufflevector <8 x i32> %1, <8 x i32> undef, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
	%3 = add <16 x i32> %2, <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16>			%3 = add <16 x i32> %2, <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16>
	ret <16 x i32> %3			ret <16 x i32> %3
	}			}

	define <32 x i16> @test_broadcast_16i16_32i16(<16 x i16> *%p) nounwind {			define <32 x i16> @test_broadcast_16i16_32i16(<16 x i16> *%p) nounwind {
	; X64-AVX512VL-LABEL: test_broadcast_16i16_32i16:			; X64-AVX512VL-LABEL: test_broadcast_16i16_32i16:
	; X64-AVX512VL: ## BB#0:			; X64-AVX512VL: ## BB#0:
	; X64-AVX512VL-NEXT: vmovdqa64 (%rdi), %ymm1			; X64-AVX512VL-NEXT: vmovdqa (%rdi), %ymm1
	; X64-AVX512VL-NEXT: vpaddw {{.*}}(%rip), %ymm1, %ymm0			; X64-AVX512VL-NEXT: vpaddw {{.*}}(%rip), %ymm1, %ymm0
	; X64-AVX512VL-NEXT: vpaddw {{.*}}(%rip), %ymm1, %ymm1			; X64-AVX512VL-NEXT: vpaddw {{.*}}(%rip), %ymm1, %ymm1
	; X64-AVX512VL-NEXT: retq			; X64-AVX512VL-NEXT: retq
	;			;
	; X64-AVX512BWVL-LABEL: test_broadcast_16i16_32i16:			; X64-AVX512BWVL-LABEL: test_broadcast_16i16_32i16:
	; X64-AVX512BWVL: ## BB#0:			; X64-AVX512BWVL: ## BB#0:
	; X64-AVX512BWVL-NEXT: vbroadcasti64x4 {{.*#+}} zmm0 = mem[0,1,2,3,0,1,2,3]			; X64-AVX512BWVL-NEXT: vbroadcasti64x4 {{.*#+}} zmm0 = mem[0,1,2,3,0,1,2,3]
	; X64-AVX512BWVL-NEXT: vpaddw {{.*}}(%rip), %zmm0, %zmm0			; X64-AVX512BWVL-NEXT: vpaddw {{.*}}(%rip), %zmm0, %zmm0
	; X64-AVX512BWVL-NEXT: retq			; X64-AVX512BWVL-NEXT: retq
	;			;
	; X64-AVX512DQVL-LABEL: test_broadcast_16i16_32i16:			; X64-AVX512DQVL-LABEL: test_broadcast_16i16_32i16:
	; X64-AVX512DQVL: ## BB#0:			; X64-AVX512DQVL: ## BB#0:
	; X64-AVX512DQVL-NEXT: vmovdqa64 (%rdi), %ymm1			; X64-AVX512DQVL-NEXT: vmovdqa (%rdi), %ymm1
	; X64-AVX512DQVL-NEXT: vpaddw {{.*}}(%rip), %ymm1, %ymm0			; X64-AVX512DQVL-NEXT: vpaddw {{.*}}(%rip), %ymm1, %ymm0
	; X64-AVX512DQVL-NEXT: vpaddw {{.*}}(%rip), %ymm1, %ymm1			; X64-AVX512DQVL-NEXT: vpaddw {{.*}}(%rip), %ymm1, %ymm1
	; X64-AVX512DQVL-NEXT: retq			; X64-AVX512DQVL-NEXT: retq
	%1 = load <16 x i16>, <16 x i16> *%p			%1 = load <16 x i16>, <16 x i16> *%p
	%2 = shufflevector <16 x i16> %1, <16 x i16> undef, <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>			%2 = shufflevector <16 x i16> %1, <16 x i16> undef, <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
	%3 = add <32 x i16> %2, <i16 1, i16 2, i16 3, i16 4, i16 5, i16 6, i16 7, i16 8, i16 9, i16 10, i16 11, i16 12, i16 13, i16 14, i16 15, i16 16, i16 17, i16 18, i16 19, i16 20, i16 21, i16 22, i16 23, i16 24, i16 25, i16 26, i16 27, i16 28, i16 29, i16 30, i16 31, i16 32>			%3 = add <32 x i16> %2, <i16 1, i16 2, i16 3, i16 4, i16 5, i16 6, i16 7, i16 8, i16 9, i16 10, i16 11, i16 12, i16 13, i16 14, i16 15, i16 16, i16 17, i16 18, i16 19, i16 20, i16 21, i16 22, i16 23, i16 24, i16 25, i16 26, i16 27, i16 28, i16 29, i16 30, i16 31, i16 32>
	ret <32 x i16> %3			ret <32 x i16> %3
	}			}

	define <64 x i8> @test_broadcast_32i8_64i8(<32 x i8> *%p) nounwind {			define <64 x i8> @test_broadcast_32i8_64i8(<32 x i8> *%p) nounwind {
	; X64-AVX512VL-LABEL: test_broadcast_32i8_64i8:			; X64-AVX512VL-LABEL: test_broadcast_32i8_64i8:
	; X64-AVX512VL: ## BB#0:			; X64-AVX512VL: ## BB#0:
	; X64-AVX512VL-NEXT: vmovdqa64 (%rdi), %ymm1			; X64-AVX512VL-NEXT: vmovdqa (%rdi), %ymm1
	; X64-AVX512VL-NEXT: vpaddb {{.*}}(%rip), %ymm1, %ymm0			; X64-AVX512VL-NEXT: vpaddb {{.*}}(%rip), %ymm1, %ymm0
	; X64-AVX512VL-NEXT: vpaddb {{.*}}(%rip), %ymm1, %ymm1			; X64-AVX512VL-NEXT: vpaddb {{.*}}(%rip), %ymm1, %ymm1
	; X64-AVX512VL-NEXT: retq			; X64-AVX512VL-NEXT: retq
	;			;
	; X64-AVX512BWVL-LABEL: test_broadcast_32i8_64i8:			; X64-AVX512BWVL-LABEL: test_broadcast_32i8_64i8:
	; X64-AVX512BWVL: ## BB#0:			; X64-AVX512BWVL: ## BB#0:
	; X64-AVX512BWVL-NEXT: vbroadcasti64x4 {{.*#+}} zmm0 = mem[0,1,2,3,0,1,2,3]			; X64-AVX512BWVL-NEXT: vbroadcasti64x4 {{.*#+}} zmm0 = mem[0,1,2,3,0,1,2,3]
	; X64-AVX512BWVL-NEXT: vpaddb {{.*}}(%rip), %zmm0, %zmm0			; X64-AVX512BWVL-NEXT: vpaddb {{.*}}(%rip), %zmm0, %zmm0
	; X64-AVX512BWVL-NEXT: retq			; X64-AVX512BWVL-NEXT: retq
	;			;
	; X64-AVX512DQVL-LABEL: test_broadcast_32i8_64i8:			; X64-AVX512DQVL-LABEL: test_broadcast_32i8_64i8:
	; X64-AVX512DQVL: ## BB#0:			; X64-AVX512DQVL: ## BB#0:
	; X64-AVX512DQVL-NEXT: vmovdqa64 (%rdi), %ymm1			; X64-AVX512DQVL-NEXT: vmovdqa (%rdi), %ymm1
	; X64-AVX512DQVL-NEXT: vpaddb {{.*}}(%rip), %ymm1, %ymm0			; X64-AVX512DQVL-NEXT: vpaddb {{.*}}(%rip), %ymm1, %ymm0
	; X64-AVX512DQVL-NEXT: vpaddb {{.*}}(%rip), %ymm1, %ymm1			; X64-AVX512DQVL-NEXT: vpaddb {{.*}}(%rip), %ymm1, %ymm1
	; X64-AVX512DQVL-NEXT: retq			; X64-AVX512DQVL-NEXT: retq
	%1 = load <32 x i8>, <32 x i8> *%p			%1 = load <32 x i8>, <32 x i8> *%p
	%2 = shufflevector <32 x i8> %1, <32 x i8> undef, <64 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>			%2 = shufflevector <32 x i8> %1, <32 x i8> undef, <64 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
	%3 = add <64 x i8> %2, <i8 1, i8 2, i8 3, i8 4, i8 5, i8 6, i8 7, i8 8, i8 9, i8 10, i8 11, i8 12, i8 13, i8 14, i8 15, i8 16, i8 17, i8 18, i8 19, i8 20, i8 21, i8 22, i8 23, i8 24, i8 25, i8 26, i8 27, i8 28, i8 29, i8 30, i8 31, i8 32, i8 33, i8 34, i8 35, i8 36, i8 37, i8 38, i8 39, i8 40, i8 41, i8 42, i8 43, i8 44, i8 45, i8 46, i8 47, i8 48, i8 49, i8 50, i8 51, i8 52, i8 53, i8 54, i8 55, i8 56, i8 57, i8 58, i8 59, i8 60, i8 61, i8 62, i8 63, i8 64>			%3 = add <64 x i8> %2, <i8 1, i8 2, i8 3, i8 4, i8 5, i8 6, i8 7, i8 8, i8 9, i8 10, i8 11, i8 12, i8 13, i8 14, i8 15, i8 16, i8 17, i8 18, i8 19, i8 20, i8 21, i8 22, i8 23, i8 24, i8 25, i8 26, i8 27, i8 28, i8 29, i8 30, i8 31, i8 32, i8 33, i8 34, i8 35, i8 36, i8 37, i8 38, i8 39, i8 40, i8 41, i8 42, i8 43, i8 44, i8 45, i8 46, i8 47, i8 48, i8 49, i8 50, i8 51, i8 52, i8 53, i8 54, i8 55, i8 56, i8 57, i8 58, i8 59, i8 60, i8 61, i8 62, i8 63, i8 64>
	ret <64 x i8> %3			ret <64 x i8> %3
	}			}

llvm/trunk/test/CodeGen/X86/avx512-vec-cmp.ll

	; KNL-NEXT: vpxor %xmm2, %xmm2, %xmm2			; KNL-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; KNL-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1],xmm1[2],xmm2[3],xmm1[4],xmm2[5],xmm1[6],xmm2[7]			; KNL-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1],xmm1[2],xmm2[3],xmm1[4],xmm2[5],xmm1[6],xmm2[7]
	; KNL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3],xmm0[4],xmm2[5],xmm0[6],xmm2[7]			; KNL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3],xmm0[4],xmm2[5],xmm0[6],xmm2[7]
	; KNL-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0			; KNL-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	; SKX-LABEL: test44:			; SKX-LABEL: test44:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2			; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; SKX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1],xmm1[2],xmm2[3],xmm1[4],xmm2[5],xmm1[6],xmm2[7]			; SKX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1],xmm1[2],xmm2[3],xmm1[4],xmm2[5],xmm1[6],xmm2[7]
	; SKX-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3],xmm0[4],xmm2[5],xmm0[6],xmm2[7]			; SKX-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3],xmm0[4],xmm2[5],xmm0[6],xmm2[7]
	; SKX-NEXT: vpcmpeqd %xmm1, %xmm0, %k0			; SKX-NEXT: vpcmpeqd %xmm1, %xmm0, %k0
	; SKX-NEXT: vpmovm2d %k0, %xmm0			; SKX-NEXT: vpmovm2d %k0, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <4 x i16> %x, %y			%mask = icmp eq <4 x i16> %x, %y
	%1 = sext <4 x i1> %mask to <4 x i32>			%1 = sext <4 x i1> %mask to <4 x i32>
	ret <4 x i32> %1			ret <4 x i32> %1
	}			}

	define <2 x i64> @test45(<2 x i16> %x, <2 x i16> %y) #0 {			define <2 x i64> @test45(<2 x i16> %x, <2 x i16> %y) #0 {
	; KNL-LABEL: test45:			; KNL-LABEL: test45:
	; KNL: ## BB#0:			; KNL: ## BB#0:
	; KNL-NEXT: vpxor %xmm2, %xmm2, %xmm2			; KNL-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; KNL-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3],xmm1[4],xmm2[5,6,7]			; KNL-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3],xmm1[4],xmm2[5,6,7]
	; KNL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm2[1,2,3],xmm0[4],xmm2[5,6,7]			; KNL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm2[1,2,3],xmm0[4],xmm2[5,6,7]
	; KNL-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0			; KNL-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
	; KNL-NEXT: vpsrlq $63, %xmm0, %xmm0			; KNL-NEXT: vpsrlq $63, %xmm0, %xmm0
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	; SKX-LABEL: test45:			; SKX-LABEL: test45:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2			; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; SKX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3],xmm1[4],xmm2[5,6,7]			; SKX-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0],xmm2[1,2,3],xmm1[4],xmm2[5,6,7]
	; SKX-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm2[1,2,3],xmm0[4],xmm2[5,6,7]			; SKX-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm2[1,2,3],xmm0[4],xmm2[5,6,7]
	; SKX-NEXT: vpcmpeqq %xmm1, %xmm0, %k1			; SKX-NEXT: vpcmpeqq %xmm1, %xmm0, %k1
	; SKX-NEXT: vmovdqa64 {{.*}}(%rip), %xmm0 {%k1} {z}			; SKX-NEXT: vmovdqa64 {{.*}}(%rip), %xmm0 {%k1} {z}
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <2 x i16> %x, %y			%mask = icmp eq <2 x i16> %x, %y
	%1 = zext <2 x i1> %mask to <2 x i64>			%1 = zext <2 x i1> %mask to <2 x i64>
	ret <2 x i64> %1			ret <2 x i64> %1

llvm/trunk/test/CodeGen/X86/avx512bwvl-intrinsics-upgrade.ll

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512bw -mattr=+avx512vl --show-mc-encoding| FileCheck %s		; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512bw -mattr=+avx512vl --show-mc-encoding| FileCheck %s

declare <32 x i8> @llvm.x86.avx512.pbroadcastb.256(<16 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.pbroadcastb.256(<16 x i8>, <32 x i8>, i32)

define <32 x i8>@test_int_x86_avx512_pbroadcastb_256(<16 x i8> %x0, <32 x i8> %x1, i32 %mask) {		define <32 x i8>@test_int_x86_avx512_pbroadcastb_256(<16 x i8> %x0, <32 x i8> %x1, i32 %mask) {
; CHECK-LABEL: test_int_x86_avx512_pbroadcastb_256:		; CHECK-LABEL: test_int_x86_avx512_pbroadcastb_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpbroadcastb %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x78,0xd0]		; CHECK-NEXT: vpbroadcastb %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x78,0xd0]
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpbroadcastb %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x78,0xc8]		; CHECK-NEXT: vpbroadcastb %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x78,0xc8]
; CHECK-NEXT: vpbroadcastb %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x78,0xc0]		; CHECK-NEXT: vpbroadcastb %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x78,0xc0]
; CHECK-NEXT: vpaddb %ymm1, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc9]		; CHECK-NEXT: vpaddb %ymm1, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc9]
; CHECK-NEXT: vpaddb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfc,0xc1]		; CHECK-NEXT: vpaddb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfc,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.pbroadcastb.256(<16 x i8> %x0, <32 x i8> %x1, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.pbroadcastb.256(<16 x i8> %x0, <32 x i8> %x1, i32 -1)
%res1 = call <32 x i8> @llvm.x86.avx512.pbroadcastb.256(<16 x i8> %x0, <32 x i8> %x1, i32 %mask)		%res1 = call <32 x i8> @llvm.x86.avx512.pbroadcastb.256(<16 x i8> %x0, <32 x i8> %x1, i32 %mask)
%res2 = call <32 x i8> @llvm.x86.avx512.pbroadcastb.256(<16 x i8> %x0, <32 x i8> zeroinitializer, i32 %mask)		%res2 = call <32 x i8> @llvm.x86.avx512.pbroadcastb.256(<16 x i8> %x0, <32 x i8> zeroinitializer, i32 %mask)
%res3 = add <32 x i8> %res, %res1		%res3 = add <32 x i8> %res, %res1
%res4 = add <32 x i8> %res2, %res3		%res4 = add <32 x i8> %res2, %res3
ret <32 x i8> %res4		ret <32 x i8> %res4
}		}

declare <16 x i8> @llvm.x86.avx512.pbroadcastb.128(<16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.pbroadcastb.128(<16 x i8>, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_pbroadcastb_128(<16 x i8> %x0, <16 x i8> %x1, i16 %mask) {		define <16 x i8>@test_int_x86_avx512_pbroadcastb_128(<16 x i8> %x0, <16 x i8> %x1, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_pbroadcastb_128:		; CHECK-LABEL: test_int_x86_avx512_pbroadcastb_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpbroadcastb %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x78,0xd0]		; CHECK-NEXT: vpbroadcastb %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x78,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpbroadcastb %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x78,0xc8]		; CHECK-NEXT: vpbroadcastb %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x78,0xc8]
; CHECK-NEXT: vpbroadcastb %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x78,0xc0]		; CHECK-NEXT: vpbroadcastb %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x78,0xc0]
; CHECK-NEXT: vpaddb %xmm1, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc9]		; CHECK-NEXT: vpaddb %xmm1, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc9]
; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc1]		; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.pbroadcastb.128(<16 x i8> %x0, <16 x i8> %x1, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.pbroadcastb.128(<16 x i8> %x0, <16 x i8> %x1, i16 -1)
%res1 = call <16 x i8> @llvm.x86.avx512.pbroadcastb.128(<16 x i8> %x0, <16 x i8> %x1, i16 %mask)		%res1 = call <16 x i8> @llvm.x86.avx512.pbroadcastb.128(<16 x i8> %x0, <16 x i8> %x1, i16 %mask)
%res2 = call <16 x i8> @llvm.x86.avx512.pbroadcastb.128(<16 x i8> %x0, <16 x i8> zeroinitializer, i16 %mask)		%res2 = call <16 x i8> @llvm.x86.avx512.pbroadcastb.128(<16 x i8> %x0, <16 x i8> zeroinitializer, i16 %mask)
%res3 = add <16 x i8> %res, %res1		%res3 = add <16 x i8> %res, %res1
%res4 = add <16 x i8> %res2, %res3		%res4 = add <16 x i8> %res2, %res3
ret <16 x i8> %res4		ret <16 x i8> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.pbroadcastw.256(<8 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.pbroadcastw.256(<8 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_pbroadcastw_256(<8 x i16> %x0, <16 x i16> %x1, i16 %mask) {		define <16 x i16>@test_int_x86_avx512_pbroadcastw_256(<8 x i16> %x0, <16 x i16> %x1, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_pbroadcastw_256:		; CHECK-LABEL: test_int_x86_avx512_pbroadcastw_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpbroadcastw %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x79,0xd0]		; CHECK-NEXT: vpbroadcastw %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x79,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpbroadcastw %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x79,0xc8]		; CHECK-NEXT: vpbroadcastw %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x79,0xc8]
; CHECK-NEXT: vpbroadcastw %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x79,0xc0]		; CHECK-NEXT: vpbroadcastw %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x79,0xc0]
; CHECK-NEXT: vpaddw %ymm1, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc9]		; CHECK-NEXT: vpaddw %ymm1, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc9]
; CHECK-NEXT: vpaddw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc1]		; CHECK-NEXT: vpaddw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.pbroadcastw.256(<8 x i16> %x0, <16 x i16> %x1, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.pbroadcastw.256(<8 x i16> %x0, <16 x i16> %x1, i16 -1)
%res1 = call <16 x i16> @llvm.x86.avx512.pbroadcastw.256(<8 x i16> %x0, <16 x i16> %x1, i16 %mask)		%res1 = call <16 x i16> @llvm.x86.avx512.pbroadcastw.256(<8 x i16> %x0, <16 x i16> %x1, i16 %mask)
%res2 = call <16 x i16> @llvm.x86.avx512.pbroadcastw.256(<8 x i16> %x0, <16 x i16> zeroinitializer, i16 %mask)		%res2 = call <16 x i16> @llvm.x86.avx512.pbroadcastw.256(<8 x i16> %x0, <16 x i16> zeroinitializer, i16 %mask)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res2, %res3		%res4 = add <16 x i16> %res2, %res3
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.pbroadcastw.128(<8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.pbroadcastw.128(<8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_pbroadcastw_128(<8 x i16> %x0, <8 x i16> %x1, i8 %mask) {		define <8 x i16>@test_int_x86_avx512_pbroadcastw_128(<8 x i16> %x0, <8 x i16> %x1, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_pbroadcastw_128:		; CHECK-LABEL: test_int_x86_avx512_pbroadcastw_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpbroadcastw %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x79,0xd0]		; CHECK-NEXT: vpbroadcastw %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x79,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpbroadcastw %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x79,0xc8]		; CHECK-NEXT: vpbroadcastw %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x79,0xc8]
; CHECK-NEXT: vpbroadcastw %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x79,0xc0]		; CHECK-NEXT: vpbroadcastw %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x79,0xc0]
; CHECK-NEXT: vpaddw %xmm1, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc9]		; CHECK-NEXT: vpaddw %xmm1, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc9]
; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc1]		; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.pbroadcastw.128(<8 x i16> %x0, <8 x i16> %x1, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.pbroadcastw.128(<8 x i16> %x0, <8 x i16> %x1, i8 -1)
%res1 = call <8 x i16> @llvm.x86.avx512.pbroadcastw.128(<8 x i16> %x0, <8 x i16> %x1, i8 %mask)		%res1 = call <8 x i16> @llvm.x86.avx512.pbroadcastw.128(<8 x i16> %x0, <8 x i16> %x1, i8 %mask)
%res2 = call <8 x i16> @llvm.x86.avx512.pbroadcastw.128(<8 x i16> %x0, <8 x i16> zeroinitializer, i8 %mask)		%res2 = call <8 x i16> @llvm.x86.avx512.pbroadcastw.128(<8 x i16> %x0, <8 x i16> zeroinitializer, i8 %mask)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res2, %res3		%res4 = add <8 x i16> %res2, %res3
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare void @llvm.x86.avx512.mask.storeu.b.128(i8*, <16 x i8>, i16)		declare void @llvm.x86.avx512.mask.storeu.b.128(i8*, <16 x i8>, i16)

define void@test_int_x86_avx512_mask_storeu_b_128(i8* %ptr1, i8* %ptr2, <16 x i8> %x1, i16 %x2) {		define void@test_int_x86_avx512_mask_storeu_b_128(i8* %ptr1, i8* %ptr2, <16 x i8> %x1, i16 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_storeu_b_128:		; CHECK-LABEL: test_int_x86_avx512_mask_storeu_b_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu8 %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7f,0x09,0x7f,0x07]		; CHECK-NEXT: vmovdqu8 %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7f,0x09,0x7f,0x07]
; CHECK-NEXT: vmovdqu8 %xmm0, (%rsi) ## encoding: [0x62,0xf1,0x7f,0x08,0x7f,0x06]		; CHECK-NEXT: vmovdqu %xmm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7f,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.storeu.b.128(i8* %ptr1, <16 x i8> %x1, i16 %x2)		call void @llvm.x86.avx512.mask.storeu.b.128(i8* %ptr1, <16 x i8> %x1, i16 %x2)
call void @llvm.x86.avx512.mask.storeu.b.128(i8* %ptr2, <16 x i8> %x1, i16 -1)		call void @llvm.x86.avx512.mask.storeu.b.128(i8* %ptr2, <16 x i8> %x1, i16 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.storeu.b.256(i8*, <32 x i8>, i32)		declare void @llvm.x86.avx512.mask.storeu.b.256(i8*, <32 x i8>, i32)

define void@test_int_x86_avx512_mask_storeu_b_256(i8* %ptr1, i8* %ptr2, <32 x i8> %x1, i32 %x2) {		define void@test_int_x86_avx512_mask_storeu_b_256(i8* %ptr1, i8* %ptr2, <32 x i8> %x1, i32 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_storeu_b_256:		; CHECK-LABEL: test_int_x86_avx512_mask_storeu_b_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edx, %k1 ## encoding: [0xc5,0xfb,0x92,0xca]		; CHECK-NEXT: kmovd %edx, %k1 ## encoding: [0xc5,0xfb,0x92,0xca]
; CHECK-NEXT: vmovdqu8 %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7f,0x29,0x7f,0x07]		; CHECK-NEXT: vmovdqu8 %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7f,0x29,0x7f,0x07]
; CHECK-NEXT: vmovdqu8 %ymm0, (%rsi) ## encoding: [0x62,0xf1,0x7f,0x28,0x7f,0x06]		; CHECK-NEXT: vmovdqu %ymm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x7f,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.storeu.b.256(i8* %ptr1, <32 x i8> %x1, i32 %x2)		call void @llvm.x86.avx512.mask.storeu.b.256(i8* %ptr1, <32 x i8> %x1, i32 %x2)
call void @llvm.x86.avx512.mask.storeu.b.256(i8* %ptr2, <32 x i8> %x1, i32 -1)		call void @llvm.x86.avx512.mask.storeu.b.256(i8* %ptr2, <32 x i8> %x1, i32 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.storeu.w.128(i8*, <8 x i16>, i8)		declare void @llvm.x86.avx512.mask.storeu.w.128(i8*, <8 x i16>, i8)

define void@test_int_x86_avx512_mask_storeu_w_128(i8* %ptr1, i8* %ptr2, <8 x i16> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_storeu_w_128(i8* %ptr1, i8* %ptr2, <8 x i16> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_storeu_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_storeu_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu16 %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x7f,0x07]		; CHECK-NEXT: vmovdqu16 %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x7f,0x07]
; CHECK-NEXT: vmovdqu16 %xmm0, (%rsi) ## encoding: [0x62,0xf1,0xff,0x08,0x7f,0x06]		; CHECK-NEXT: vmovdqu %xmm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7f,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.storeu.w.128(i8* %ptr1, <8 x i16> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.storeu.w.128(i8* %ptr1, <8 x i16> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.storeu.w.128(i8* %ptr2, <8 x i16> %x1, i8 -1)		call void @llvm.x86.avx512.mask.storeu.w.128(i8* %ptr2, <8 x i16> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.storeu.w.256(i8*, <16 x i16>, i16)		declare void @llvm.x86.avx512.mask.storeu.w.256(i8*, <16 x i16>, i16)

define void@test_int_x86_avx512_mask_storeu_w_256(i8* %ptr1, i8* %ptr2, <16 x i16> %x1, i16 %x2) {		define void@test_int_x86_avx512_mask_storeu_w_256(i8* %ptr1, i8* %ptr2, <16 x i16> %x1, i16 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_storeu_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_storeu_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu16 %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xff,0x29,0x7f,0x07]		; CHECK-NEXT: vmovdqu16 %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xff,0x29,0x7f,0x07]
; CHECK-NEXT: vmovdqu16 %ymm0, (%rsi) ## encoding: [0x62,0xf1,0xff,0x28,0x7f,0x06]		; CHECK-NEXT: vmovdqu %ymm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x7f,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.storeu.w.256(i8* %ptr1, <16 x i16> %x1, i16 %x2)		call void @llvm.x86.avx512.mask.storeu.w.256(i8* %ptr1, <16 x i16> %x1, i16 %x2)
call void @llvm.x86.avx512.mask.storeu.w.256(i8* %ptr2, <16 x i16> %x1, i16 -1)		call void @llvm.x86.avx512.mask.storeu.w.256(i8* %ptr2, <16 x i16> %x1, i16 -1)
ret void		ret void
}		}

declare <8 x i16> @llvm.x86.avx512.mask.loadu.w.128(i8*, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.loadu.w.128(i8*, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_loadu_w_128(i8* %ptr, i8* %ptr2, <8 x i16> %x1, i8 %mask) {		define <8 x i16>@test_int_x86_avx512_mask_loadu_w_128(i8* %ptr, i8* %ptr2, <8 x i16> %x1, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_loadu_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_loadu_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqu16 (%rdi), %xmm0 ## encoding: [0x62,0xf1,0xff,0x08,0x6f,0x07]		; CHECK-NEXT: vmovdqu (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x6f,0x07]
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu16 (%rsi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x6f,0x06]		; CHECK-NEXT: vmovdqu16 (%rsi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x6f,0x06]
; CHECK-NEXT: vmovdqu16 (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0xff,0x89,0x6f,0x0f]		; CHECK-NEXT: vmovdqu16 (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0xff,0x89,0x6f,0x0f]
; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc1]		; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <8 x i16> @llvm.x86.avx512.mask.loadu.w.128(i8* %ptr, <8 x i16> %x1, i8 -1)		%res0 = call <8 x i16> @llvm.x86.avx512.mask.loadu.w.128(i8* %ptr, <8 x i16> %x1, i8 -1)
%res = call <8 x i16> @llvm.x86.avx512.mask.loadu.w.128(i8* %ptr2, <8 x i16> %res0, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.loadu.w.128(i8* %ptr2, <8 x i16> %res0, i8 %mask)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.loadu.w.128(i8* %ptr, <8 x i16> zeroinitializer, i8 %mask)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.loadu.w.128(i8* %ptr, <8 x i16> zeroinitializer, i8 %mask)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.loadu.w.256(i8*, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.loadu.w.256(i8*, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_loadu_w_256(i8* %ptr, i8* %ptr2, <16 x i16> %x1, i16 %mask) {		define <16 x i16>@test_int_x86_avx512_mask_loadu_w_256(i8* %ptr, i8* %ptr2, <16 x i16> %x1, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_loadu_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_loadu_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqu16 (%rdi), %ymm0 ## encoding: [0x62,0xf1,0xff,0x28,0x6f,0x07]		; CHECK-NEXT: vmovdqu (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x6f,0x07]
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu16 (%rsi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0xff,0x29,0x6f,0x06]		; CHECK-NEXT: vmovdqu16 (%rsi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0xff,0x29,0x6f,0x06]
; CHECK-NEXT: vmovdqu16 (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0xff,0xa9,0x6f,0x0f]		; CHECK-NEXT: vmovdqu16 (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0xff,0xa9,0x6f,0x0f]
; CHECK-NEXT: vpaddw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc1]		; CHECK-NEXT: vpaddw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <16 x i16> @llvm.x86.avx512.mask.loadu.w.256(i8* %ptr, <16 x i16> %x1, i16 -1)		%res0 = call <16 x i16> @llvm.x86.avx512.mask.loadu.w.256(i8* %ptr, <16 x i16> %x1, i16 -1)
%res = call <16 x i16> @llvm.x86.avx512.mask.loadu.w.256(i8* %ptr2, <16 x i16> %res0, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.loadu.w.256(i8* %ptr2, <16 x i16> %res0, i16 %mask)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.loadu.w.256(i8* %ptr, <16 x i16> zeroinitializer, i16 %mask)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.loadu.w.256(i8* %ptr, <16 x i16> zeroinitializer, i16 %mask)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <16 x i8> @llvm.x86.avx512.mask.loadu.b.128(i8*, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.loadu.b.128(i8*, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_mask_loadu_b_128(i8* %ptr, i8* %ptr2, <16 x i8> %x1, i16 %mask) {		define <16 x i8>@test_int_x86_avx512_mask_loadu_b_128(i8* %ptr, i8* %ptr2, <16 x i8> %x1, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_loadu_b_128:		; CHECK-LABEL: test_int_x86_avx512_mask_loadu_b_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqu8 (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7f,0x08,0x6f,0x07]		; CHECK-NEXT: vmovdqu (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x6f,0x07]
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu8 (%rsi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0x7f,0x09,0x6f,0x06]		; CHECK-NEXT: vmovdqu8 (%rsi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0x7f,0x09,0x6f,0x06]
; CHECK-NEXT: vmovdqu8 (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7f,0x89,0x6f,0x0f]		; CHECK-NEXT: vmovdqu8 (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7f,0x89,0x6f,0x0f]
; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc1]		; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <16 x i8> @llvm.x86.avx512.mask.loadu.b.128(i8* %ptr, <16 x i8> %x1, i16 -1)		%res0 = call <16 x i8> @llvm.x86.avx512.mask.loadu.b.128(i8* %ptr, <16 x i8> %x1, i16 -1)
%res = call <16 x i8> @llvm.x86.avx512.mask.loadu.b.128(i8* %ptr2, <16 x i8> %res0, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.loadu.b.128(i8* %ptr2, <16 x i8> %res0, i16 %mask)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.loadu.b.128(i8* %ptr, <16 x i8> zeroinitializer, i16 %mask)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.loadu.b.128(i8* %ptr, <16 x i8> zeroinitializer, i16 %mask)
%res2 = add <16 x i8> %res, %res1		%res2 = add <16 x i8> %res, %res1
ret <16 x i8> %res2		ret <16 x i8> %res2
}		}

declare <32 x i8> @llvm.x86.avx512.mask.loadu.b.256(i8*, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.loadu.b.256(i8*, <32 x i8>, i32)

define <32 x i8>@test_int_x86_avx512_mask_loadu_b_256(i8* %ptr, i8* %ptr2, <32 x i8> %x1, i32 %mask) {		define <32 x i8>@test_int_x86_avx512_mask_loadu_b_256(i8* %ptr, i8* %ptr2, <32 x i8> %x1, i32 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_loadu_b_256:		; CHECK-LABEL: test_int_x86_avx512_mask_loadu_b_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqu8 (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7f,0x28,0x6f,0x07]		; CHECK-NEXT: vmovdqu (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x6f,0x07]
; CHECK-NEXT: kmovd %edx, %k1 ## encoding: [0xc5,0xfb,0x92,0xca]		; CHECK-NEXT: kmovd %edx, %k1 ## encoding: [0xc5,0xfb,0x92,0xca]
; CHECK-NEXT: vmovdqu8 (%rsi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0x7f,0x29,0x6f,0x06]		; CHECK-NEXT: vmovdqu8 (%rsi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0x7f,0x29,0x6f,0x06]
; CHECK-NEXT: vmovdqu8 (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7f,0xa9,0x6f,0x0f]		; CHECK-NEXT: vmovdqu8 (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7f,0xa9,0x6f,0x0f]
; CHECK-NEXT: vpaddb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfc,0xc1]		; CHECK-NEXT: vpaddb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfc,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <32 x i8> @llvm.x86.avx512.mask.loadu.b.256(i8* %ptr, <32 x i8> %x1, i32 -1)		%res0 = call <32 x i8> @llvm.x86.avx512.mask.loadu.b.256(i8* %ptr, <32 x i8> %x1, i32 -1)
%res = call <32 x i8> @llvm.x86.avx512.mask.loadu.b.256(i8* %ptr2, <32 x i8> %res0, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.loadu.b.256(i8* %ptr2, <32 x i8> %res0, i32 %mask)
%res1 = call <32 x i8> @llvm.x86.avx512.mask.loadu.b.256(i8* %ptr, <32 x i8> zeroinitializer, i32 %mask)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.loadu.b.256(i8* %ptr, <32 x i8> zeroinitializer, i32 %mask)
%res2 = add <32 x i8> %res, %res1		%res2 = add <32 x i8> %res, %res1
ret <32 x i8> %res2		ret <32 x i8> %res2
}		}

declare <16 x i8> @llvm.x86.avx512.mask.palignr.128(<16 x i8>, <16 x i8>, i32, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.palignr.128(<16 x i8>, <16 x i8>, i32, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_mask_palignr_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x3, i16 %x4) {		define <16 x i8>@test_int_x86_avx512_mask_palignr_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x3, i16 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_palignr_128:		; CHECK-LABEL: test_int_x86_avx512_mask_palignr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpalignr $2, %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf3,0x7d,0x08,0x0f,0xd9,0x02]		; CHECK-NEXT: vpalignr $2, %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x0f,0xd9,0x02]
; CHECK-NEXT: ## xmm3 = xmm1[2,3,4,5,6,7,8,9,10,11,12,13,14,15],xmm0[0,1]		; CHECK-NEXT: ## xmm3 = xmm1[2,3,4,5,6,7,8,9,10,11,12,13,14,15],xmm0[0,1]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpalignr $2, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x0f,0xd1,0x02]		; CHECK-NEXT: vpalignr $2, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x0f,0xd1,0x02]
; CHECK-NEXT: ## xmm2 {%k1} = xmm1[2,3,4,5,6,7,8,9,10,11,12,13,14,15],xmm0[0,1]		; CHECK-NEXT: ## xmm2 {%k1} = xmm1[2,3,4,5,6,7,8,9,10,11,12,13,14,15],xmm0[0,1]
; CHECK-NEXT: vpalignr $2, %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0x89,0x0f,0xc1,0x02]		; CHECK-NEXT: vpalignr $2, %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0x89,0x0f,0xc1,0x02]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm1[2,3,4,5,6,7,8,9,10,11,12,13,14,15],xmm0[0,1]		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm1[2,3,4,5,6,7,8,9,10,11,12,13,14,15],xmm0[0,1]
; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc0]		; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc0]
; CHECK-NEXT: vpaddb %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc3]		; CHECK-NEXT: vpaddb %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.palignr.128(<16 x i8> %x0, <16 x i8> %x1, i32 2, <16 x i8> %x3, i16 %x4)		%res = call <16 x i8> @llvm.x86.avx512.mask.palignr.128(<16 x i8> %x0, <16 x i8> %x1, i32 2, <16 x i8> %x3, i16 %x4)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.palignr.128(<16 x i8> %x0, <16 x i8> %x1, i32 2, <16 x i8> zeroinitializer, i16 %x4)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.palignr.128(<16 x i8> %x0, <16 x i8> %x1, i32 2, <16 x i8> zeroinitializer, i16 %x4)
%res2 = call <16 x i8> @llvm.x86.avx512.mask.palignr.128(<16 x i8> %x0, <16 x i8> %x1, i32 2, <16 x i8> %x3, i16 -1)		%res2 = call <16 x i8> @llvm.x86.avx512.mask.palignr.128(<16 x i8> %x0, <16 x i8> %x1, i32 2, <16 x i8> %x3, i16 -1)
%res3 = add <16 x i8> %res, %res1		%res3 = add <16 x i8> %res, %res1
%res4 = add <16 x i8> %res3, %res2		%res4 = add <16 x i8> %res3, %res2
ret <16 x i8> %res4		ret <16 x i8> %res4
}		}

declare <32 x i8> @llvm.x86.avx512.mask.palignr.256(<32 x i8>, <32 x i8>, i32, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.palignr.256(<32 x i8>, <32 x i8>, i32, <32 x i8>, i32)

define <32 x i8>@test_int_x86_avx512_mask_palignr_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x3, i32 %x4) {		define <32 x i8>@test_int_x86_avx512_mask_palignr_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x3, i32 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_palignr_256:		; CHECK-LABEL: test_int_x86_avx512_mask_palignr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpalignr $2, %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf3,0x7d,0x28,0x0f,0xd9,0x02]		; CHECK-NEXT: vpalignr $2, %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x7d,0x0f,0xd9,0x02]
; CHECK-NEXT: ## ymm3 = ymm1[2,3,4,5,6,7,8,9,10,11,12,13,14,15],ymm0[0,1],ymm1[18,19,20,21,22,23,24,25,26,27,28,29,30,31],ymm0[16,17]		; CHECK-NEXT: ## ymm3 = ymm1[2,3,4,5,6,7,8,9,10,11,12,13,14,15],ymm0[0,1],ymm1[18,19,20,21,22,23,24,25,26,27,28,29,30,31],ymm0[16,17]
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpalignr $2, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x0f,0xd1,0x02]		; CHECK-NEXT: vpalignr $2, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x0f,0xd1,0x02]
; CHECK-NEXT: ## ymm2 {%k1} = ymm1[2,3,4,5,6,7,8,9,10,11,12,13,14,15],ymm0[0,1],ymm1[18,19,20,21,22,23,24,25,26,27,28,29,30,31],ymm0[16,17]		; CHECK-NEXT: ## ymm2 {%k1} = ymm1[2,3,4,5,6,7,8,9,10,11,12,13,14,15],ymm0[0,1],ymm1[18,19,20,21,22,23,24,25,26,27,28,29,30,31],ymm0[16,17]
; CHECK-NEXT: vpalignr $2, %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x0f,0xc1,0x02]		; CHECK-NEXT: vpalignr $2, %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x0f,0xc1,0x02]
; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm1[2,3,4,5,6,7,8,9,10,11,12,13,14,15],ymm0[0,1],ymm1[18,19,20,21,22,23,24,25,26,27,28,29,30,31],ymm0[16,17]		; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm1[2,3,4,5,6,7,8,9,10,11,12,13,14,15],ymm0[0,1],ymm1[18,19,20,21,22,23,24,25,26,27,28,29,30,31],ymm0[16,17]
; CHECK-NEXT: vpaddb %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc0]		; CHECK-NEXT: vpaddb %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc0]
; CHECK-NEXT: vpaddb %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfc,0xc3]		; CHECK-NEXT: vpaddb %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfc,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.palignr.256(<32 x i8> %x0, <32 x i8> %x1, i32 2, <32 x i8> %x3, i32 %x4)		%res = call <32 x i8> @llvm.x86.avx512.mask.palignr.256(<32 x i8> %x0, <32 x i8> %x1, i32 2, <32 x i8> %x3, i32 %x4)
%res1 = call <32 x i8> @llvm.x86.avx512.mask.palignr.256(<32 x i8> %x0, <32 x i8> %x1, i32 2, <32 x i8> zeroinitializer, i32 %x4)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.palignr.256(<32 x i8> %x0, <32 x i8> %x1, i32 2, <32 x i8> zeroinitializer, i32 %x4)
%res2 = call <32 x i8> @llvm.x86.avx512.mask.palignr.256(<32 x i8> %x0, <32 x i8> %x1, i32 2, <32 x i8> %x3, i32 -1)		%res2 = call <32 x i8> @llvm.x86.avx512.mask.palignr.256(<32 x i8> %x0, <32 x i8> %x1, i32 2, <32 x i8> %x3, i32 -1)
%res3 = add <32 x i8> %res, %res1		%res3 = add <32 x i8> %res, %res1
%res4 = add <32 x i8> %res3, %res2		%res4 = add <32 x i8> %res3, %res2
ret <32 x i8> %res4		ret <32 x i8> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.mask.pshufh.w.128(<8 x i16>, i32, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pshufh.w.128(<8 x i16>, i32, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pshufh_w_128(<8 x i16> %x0, i32 %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_pshufh_w_128(<8 x i16> %x0, i32 %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pshufh_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pshufh_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpshufhw $3, %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x7e,0x08,0x70,0xd0,0x03]		; CHECK-NEXT: vpshufhw $3, %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x70,0xd0,0x03]
; CHECK-NEXT: ## xmm2 = xmm0[0,1,2,3,7,4,4,4]		; CHECK-NEXT: ## xmm2 = xmm0[0,1,2,3,7,4,4,4]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpshufhw $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x70,0xc8,0x03]		; CHECK-NEXT: vpshufhw $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x70,0xc8,0x03]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0,1,2,3,7,4,4,4]		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0,1,2,3,7,4,4,4]
; CHECK-NEXT: vpshufhw $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0x89,0x70,0xc0,0x03]		; CHECK-NEXT: vpshufhw $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0x89,0x70,0xc0,0x03]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0,1,2,3,7,4,4,4]		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0,1,2,3,7,4,4,4]
; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfd,0xc0]
; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc2]		; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pshufh.w.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.pshufh.w.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pshufh.w.128(<8 x i16> %x0, i32 3, <8 x i16> zeroinitializer, i8 %x3)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pshufh.w.128(<8 x i16> %x0, i32 3, <8 x i16> zeroinitializer, i8 %x3)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.pshufh.w.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.pshufh.w.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pshufh.w.256(<16 x i16>, i32, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pshufh.w.256(<16 x i16>, i32, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pshufh_w_256(<16 x i16> %x0, i32 %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_pshufh_w_256(<16 x i16> %x0, i32 %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pshufh_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pshufh_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpshufhw $3, %ymm0, %ymm2 ## encoding: [0x62,0xf1,0x7e,0x28,0x70,0xd0,0x03]		; CHECK-NEXT: vpshufhw $3, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x70,0xd0,0x03]
; CHECK-NEXT: ## ymm2 = ymm0[0,1,2,3,7,4,4,4,8,9,10,11,15,12,12,12]		; CHECK-NEXT: ## ymm2 = ymm0[0,1,2,3,7,4,4,4,8,9,10,11,15,12,12,12]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpshufhw $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x70,0xc8,0x03]		; CHECK-NEXT: vpshufhw $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x70,0xc8,0x03]
; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,1,2,3,7,4,4,4,8,9,10,11,15,12,12,12]		; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,1,2,3,7,4,4,4,8,9,10,11,15,12,12,12]
; CHECK-NEXT: vpshufhw $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0xa9,0x70,0xc0,0x03]		; CHECK-NEXT: vpshufhw $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0xa9,0x70,0xc0,0x03]
; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[0,1,2,3,7,4,4,4,8,9,10,11,15,12,12,12]		; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[0,1,2,3,7,4,4,4,8,9,10,11,15,12,12,12]
; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xc0]
; CHECK-NEXT: vpaddw %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc2]		; CHECK-NEXT: vpaddw %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pshufh.w.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.pshufh.w.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pshufh.w.256(<16 x i16> %x0, i32 3, <16 x i16> zeroinitializer, i16 %x3)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pshufh.w.256(<16 x i16> %x0, i32 3, <16 x i16> zeroinitializer, i16 %x3)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.pshufh.w.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.pshufh.w.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.mask.pshufl.w.128(<8 x i16>, i32, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pshufl.w.128(<8 x i16>, i32, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pshufl_w_128(<8 x i16> %x0, i32 %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_pshufl_w_128(<8 x i16> %x0, i32 %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pshufl_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pshufl_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpshuflw $3, %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x7f,0x08,0x70,0xd0,0x03]		; CHECK-NEXT: vpshuflw $3, %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x70,0xd0,0x03]
; CHECK-NEXT: ## xmm2 = xmm0[3,0,0,0,4,5,6,7]		; CHECK-NEXT: ## xmm2 = xmm0[3,0,0,0,4,5,6,7]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpshuflw $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7f,0x09,0x70,0xc8,0x03]		; CHECK-NEXT: vpshuflw $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7f,0x09,0x70,0xc8,0x03]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[3,0,0,0,4,5,6,7]		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[3,0,0,0,4,5,6,7]
; CHECK-NEXT: vpshuflw $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7f,0x89,0x70,0xc0,0x03]		; CHECK-NEXT: vpshuflw $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7f,0x89,0x70,0xc0,0x03]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[3,0,0,0,4,5,6,7]		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[3,0,0,0,4,5,6,7]
; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfd,0xc0]
; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc2]		; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pshufl.w.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.pshufl.w.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pshufl.w.128(<8 x i16> %x0, i32 3, <8 x i16> zeroinitializer, i8 %x3)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pshufl.w.128(<8 x i16> %x0, i32 3, <8 x i16> zeroinitializer, i8 %x3)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.pshufl.w.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.pshufl.w.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pshufl.w.256(<16 x i16>, i32, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pshufl.w.256(<16 x i16>, i32, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pshufl_w_256(<16 x i16> %x0, i32 %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_pshufl_w_256(<16 x i16> %x0, i32 %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pshufl_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pshufl_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpshuflw $3, %ymm0, %ymm2 ## encoding: [0x62,0xf1,0x7f,0x28,0x70,0xd0,0x03]		; CHECK-NEXT: vpshuflw $3, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xff,0x70,0xd0,0x03]
; CHECK-NEXT: ## ymm2 = ymm0[3,0,0,0,4,5,6,7,11,8,8,8,12,13,14,15]		; CHECK-NEXT: ## ymm2 = ymm0[3,0,0,0,4,5,6,7,11,8,8,8,12,13,14,15]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpshuflw $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7f,0x29,0x70,0xc8,0x03]		; CHECK-NEXT: vpshuflw $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7f,0x29,0x70,0xc8,0x03]
; CHECK-NEXT: ## ymm1 {%k1} = ymm0[3,0,0,0,4,5,6,7,11,8,8,8,12,13,14,15]		; CHECK-NEXT: ## ymm1 {%k1} = ymm0[3,0,0,0,4,5,6,7,11,8,8,8,12,13,14,15]
; CHECK-NEXT: vpshuflw $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7f,0xa9,0x70,0xc0,0x03]		; CHECK-NEXT: vpshuflw $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7f,0xa9,0x70,0xc0,0x03]
; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[3,0,0,0,4,5,6,7,11,8,8,8,12,13,14,15]		; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[3,0,0,0,4,5,6,7,11,8,8,8,12,13,14,15]
; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xc0]
; CHECK-NEXT: vpaddw %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc2]		; CHECK-NEXT: vpaddw %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pshufl.w.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.pshufl.w.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pshufl.w.256(<16 x i16> %x0, i32 3, <16 x i16> zeroinitializer, i16 %x3)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pshufl.w.256(<16 x i16> %x0, i32 3, <16 x i16> zeroinitializer, i16 %x3)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.pshufl.w.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.pshufl.w.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare i16 @llvm.x86.avx512.mask.pcmpgt.w.256(<16 x i16>, <16 x i16>, i16)		declare i16 @llvm.x86.avx512.mask.pcmpgt.w.256(<16 x i16>, <16 x i16>, i16)

declare <16 x i8> @llvm.x86.avx512.mask.punpckhb.w.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.punpckhb.w.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_mask_punpckhb_w_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {		define <16 x i8>@test_int_x86_avx512_mask_punpckhb_w_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpckhb_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_punpckhb_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpckhbw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0x68,0xd9]		; CHECK-NEXT: vpunpckhbw %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x68,0xd9]
; CHECK-NEXT: ## xmm3 = xmm0[8],xmm1[8],xmm0[9],xmm1[9],xmm0[10],xmm1[10],xmm0[11],xmm1[11],xmm0[12],xmm1[12],xmm0[13],xmm1[13],xmm0[14],xmm1[14],xmm0[15],xmm1[15]		; CHECK-NEXT: ## xmm3 = xmm0[8],xmm1[8],xmm0[9],xmm1[9],xmm0[10],xmm1[10],xmm0[11],xmm1[11],xmm0[12],xmm1[12],xmm0[13],xmm1[13],xmm0[14],xmm1[14],xmm0[15],xmm1[15]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpckhbw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x68,0xd1]		; CHECK-NEXT: vpunpckhbw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x68,0xd1]
; CHECK-NEXT: ## xmm2 {%k1} = xmm0[8],xmm1[8],xmm0[9],xmm1[9],xmm0[10],xmm1[10],xmm0[11],xmm1[11],xmm0[12],xmm1[12],xmm0[13],xmm1[13],xmm0[14],xmm1[14],xmm0[15],xmm1[15]		; CHECK-NEXT: ## xmm2 {%k1} = xmm0[8],xmm1[8],xmm0[9],xmm1[9],xmm0[10],xmm1[10],xmm0[11],xmm1[11],xmm0[12],xmm1[12],xmm0[13],xmm1[13],xmm0[14],xmm1[14],xmm0[15],xmm1[15]
; CHECK-NEXT: vpaddb %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc3]		; CHECK-NEXT: vpaddb %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.punpckhb.w.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)		%res = call <16 x i8> @llvm.x86.avx512.mask.punpckhb.w.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.punpckhb.w.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.punpckhb.w.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)
%res2 = add <16 x i8> %res, %res1		%res2 = add <16 x i8> %res, %res1
ret <16 x i8> %res2		ret <16 x i8> %res2
}		}

declare <16 x i8> @llvm.x86.avx512.mask.punpcklb.w.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.punpcklb.w.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_mask_punpcklb_w_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {		define <16 x i8>@test_int_x86_avx512_mask_punpcklb_w_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpcklb_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_punpcklb_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpcklbw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0x60,0xd9]		; CHECK-NEXT: vpunpcklbw %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x60,0xd9]
; CHECK-NEXT: ## xmm3 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3],xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]		; CHECK-NEXT: ## xmm3 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3],xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpcklbw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x60,0xd1]		; CHECK-NEXT: vpunpcklbw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x60,0xd1]
; CHECK-NEXT: ## xmm2 {%k1} = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3],xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]		; CHECK-NEXT: ## xmm2 {%k1} = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3],xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]
; CHECK-NEXT: vpaddb %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc3]		; CHECK-NEXT: vpaddb %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.punpcklb.w.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)		%res = call <16 x i8> @llvm.x86.avx512.mask.punpcklb.w.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.punpcklb.w.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.punpcklb.w.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)
%res2 = add <16 x i8> %res, %res1		%res2 = add <16 x i8> %res, %res1
ret <16 x i8> %res2		ret <16 x i8> %res2
}		}

declare <32 x i8> @llvm.x86.avx512.mask.punpckhb.w.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.punpckhb.w.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

define <32 x i8>@test_int_x86_avx512_mask_punpckhb_w_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {		define <32 x i8>@test_int_x86_avx512_mask_punpckhb_w_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpckhb_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_punpckhb_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpckhbw %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0x68,0xd9]		; CHECK-NEXT: vpunpckhbw %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x68,0xd9]
; CHECK-NEXT: ## ymm3 = ymm0[8],ymm1[8],ymm0[9],ymm1[9],ymm0[10],ymm1[10],ymm0[11],ymm1[11],ymm0[12],ymm1[12],ymm0[13],ymm1[13],ymm0[14],ymm1[14],ymm0[15],ymm1[15],ymm0[24],ymm1[24],ymm0[25],ymm1[25],ymm0[26],ymm1[26],ymm0[27],ymm1[27],ymm0[28],ymm1[28],ymm0[29],ymm1[29],ymm0[30],ymm1[30],ymm0[31],ymm1[31]		; CHECK-NEXT: ## ymm3 = ymm0[8],ymm1[8],ymm0[9],ymm1[9],ymm0[10],ymm1[10],ymm0[11],ymm1[11],ymm0[12],ymm1[12],ymm0[13],ymm1[13],ymm0[14],ymm1[14],ymm0[15],ymm1[15],ymm0[24],ymm1[24],ymm0[25],ymm1[25],ymm0[26],ymm1[26],ymm0[27],ymm1[27],ymm0[28],ymm1[28],ymm0[29],ymm1[29],ymm0[30],ymm1[30],ymm0[31],ymm1[31]
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpunpckhbw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x68,0xd1]		; CHECK-NEXT: vpunpckhbw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x68,0xd1]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[8],ymm1[8],ymm0[9],ymm1[9],ymm0[10],ymm1[10],ymm0[11],ymm1[11],ymm0[12],ymm1[12],ymm0[13],ymm1[13],ymm0[14],ymm1[14],ymm0[15],ymm1[15],ymm0[24],ymm1[24],ymm0[25],ymm1[25],ymm0[26],ymm1[26],ymm0[27],ymm1[27],ymm0[28],ymm1[28],ymm0[29],ymm1[29],ymm0[30],ymm1[30],ymm0[31],ymm1[31]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[8],ymm1[8],ymm0[9],ymm1[9],ymm0[10],ymm1[10],ymm0[11],ymm1[11],ymm0[12],ymm1[12],ymm0[13],ymm1[13],ymm0[14],ymm1[14],ymm0[15],ymm1[15],ymm0[24],ymm1[24],ymm0[25],ymm1[25],ymm0[26],ymm1[26],ymm0[27],ymm1[27],ymm0[28],ymm1[28],ymm0[29],ymm1[29],ymm0[30],ymm1[30],ymm0[31],ymm1[31]
; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc3]		; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.punpckhb.w.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)		%res = call <32 x i8> @llvm.x86.avx512.mask.punpckhb.w.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)
%res1 = call <32 x i8> @llvm.x86.avx512.mask.punpckhb.w.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.punpckhb.w.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)
%res2 = add <32 x i8> %res, %res1		%res2 = add <32 x i8> %res, %res1
ret <32 x i8> %res2		ret <32 x i8> %res2
}		}

declare <32 x i8> @llvm.x86.avx512.mask.punpcklb.w.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.punpcklb.w.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

define <32 x i8>@test_int_x86_avx512_mask_punpcklb_w_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {		define <32 x i8>@test_int_x86_avx512_mask_punpcklb_w_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpcklb_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_punpcklb_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpcklbw %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0x60,0xd9]		; CHECK-NEXT: vpunpcklbw %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x60,0xd9]
; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[4],ymm1[4],ymm0[5],ymm1[5],ymm0[6],ymm1[6],ymm0[7],ymm1[7],ymm0[16],ymm1[16],ymm0[17],ymm1[17],ymm0[18],ymm1[18],ymm0[19],ymm1[19],ymm0[20],ymm1[20],ymm0[21],ymm1[21],ymm0[22],ymm1[22],ymm0[23],ymm1[23]		; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[4],ymm1[4],ymm0[5],ymm1[5],ymm0[6],ymm1[6],ymm0[7],ymm1[7],ymm0[16],ymm1[16],ymm0[17],ymm1[17],ymm0[18],ymm1[18],ymm0[19],ymm1[19],ymm0[20],ymm1[20],ymm0[21],ymm1[21],ymm0[22],ymm1[22],ymm0[23],ymm1[23]
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpunpcklbw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x60,0xd1]		; CHECK-NEXT: vpunpcklbw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x60,0xd1]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[4],ymm1[4],ymm0[5],ymm1[5],ymm0[6],ymm1[6],ymm0[7],ymm1[7],ymm0[16],ymm1[16],ymm0[17],ymm1[17],ymm0[18],ymm1[18],ymm0[19],ymm1[19],ymm0[20],ymm1[20],ymm0[21],ymm1[21],ymm0[22],ymm1[22],ymm0[23],ymm1[23]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[4],ymm1[4],ymm0[5],ymm1[5],ymm0[6],ymm1[6],ymm0[7],ymm1[7],ymm0[16],ymm1[16],ymm0[17],ymm1[17],ymm0[18],ymm1[18],ymm0[19],ymm1[19],ymm0[20],ymm1[20],ymm0[21],ymm1[21],ymm0[22],ymm1[22],ymm0[23],ymm1[23]
; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc3]		; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.punpcklb.w.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)		%res = call <32 x i8> @llvm.x86.avx512.mask.punpcklb.w.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)
%res1 = call <32 x i8> @llvm.x86.avx512.mask.punpcklb.w.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.punpcklb.w.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)
%res2 = add <32 x i8> %res, %res1		%res2 = add <32 x i8> %res, %res1
ret <32 x i8> %res2		ret <32 x i8> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.punpcklw.d.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.punpcklw.d.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_punpcklw_d_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_punpcklw_d_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpcklw_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_punpcklw_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpcklwd %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0x61,0xd9]		; CHECK-NEXT: vpunpcklwd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x61,0xd9]
; CHECK-NEXT: ## xmm3 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3]		; CHECK-NEXT: ## xmm3 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpcklwd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x61,0xd1]		; CHECK-NEXT: vpunpcklwd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x61,0xd1]
; CHECK-NEXT: ## xmm2 {%k1} = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3]		; CHECK-NEXT: ## xmm2 {%k1} = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3]
; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc3]		; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.punpcklw.d.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.punpcklw.d.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.punpcklw.d.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.punpcklw.d.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.punpckhw.d.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.punpckhw.d.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_punpckhw_d_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_punpckhw_d_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpckhw_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_punpckhw_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpckhwd %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0x69,0xd9]		; CHECK-NEXT: vpunpckhwd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x69,0xd9]
; CHECK-NEXT: ## xmm3 = xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]		; CHECK-NEXT: ## xmm3 = xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpckhwd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x69,0xd1]		; CHECK-NEXT: vpunpckhwd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x69,0xd1]
; CHECK-NEXT: ## xmm2 {%k1} = xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]		; CHECK-NEXT: ## xmm2 {%k1} = xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]
; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc3]		; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.punpckhw.d.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.punpckhw.d.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.punpckhw.d.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.punpckhw.d.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.punpcklw.d.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.punpcklw.d.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_punpcklw_d_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_punpcklw_d_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpcklw_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_punpcklw_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpcklwd %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0x61,0xd9]		; CHECK-NEXT: vpunpcklwd %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x61,0xd9]
; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[8],ymm1[8],ymm0[9],ymm1[9],ymm0[10],ymm1[10],ymm0[11],ymm1[11]		; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[8],ymm1[8],ymm0[9],ymm1[9],ymm0[10],ymm1[10],ymm0[11],ymm1[11]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpcklwd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x61,0xd1]		; CHECK-NEXT: vpunpcklwd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x61,0xd1]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[8],ymm1[8],ymm0[9],ymm1[9],ymm0[10],ymm1[10],ymm0[11],ymm1[11]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[8],ymm1[8],ymm0[9],ymm1[9],ymm0[10],ymm1[10],ymm0[11],ymm1[11]
; CHECK-NEXT: vpaddw %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc3]		; CHECK-NEXT: vpaddw %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.punpcklw.d.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.punpcklw.d.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.punpcklw.d.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.punpcklw.d.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.punpckhw.d.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.punpckhw.d.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_punpckhw_d_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_punpckhw_d_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpckhw_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_punpckhw_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpckhwd %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0x69,0xd9]		; CHECK-NEXT: vpunpckhwd %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x69,0xd9]
; CHECK-NEXT: ## ymm3 = ymm0[4],ymm1[4],ymm0[5],ymm1[5],ymm0[6],ymm1[6],ymm0[7],ymm1[7],ymm0[12],ymm1[12],ymm0[13],ymm1[13],ymm0[14],ymm1[14],ymm0[15],ymm1[15]		; CHECK-NEXT: ## ymm3 = ymm0[4],ymm1[4],ymm0[5],ymm1[5],ymm0[6],ymm1[6],ymm0[7],ymm1[7],ymm0[12],ymm1[12],ymm0[13],ymm1[13],ymm0[14],ymm1[14],ymm0[15],ymm1[15]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpckhwd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x69,0xd1]		; CHECK-NEXT: vpunpckhwd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x69,0xd1]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[4],ymm1[4],ymm0[5],ymm1[5],ymm0[6],ymm1[6],ymm0[7],ymm1[7],ymm0[12],ymm1[12],ymm0[13],ymm1[13],ymm0[14],ymm1[14],ymm0[15],ymm1[15]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[4],ymm1[4],ymm0[5],ymm1[5],ymm0[6],ymm1[6],ymm0[7],ymm1[7],ymm0[12],ymm1[12],ymm0[13],ymm1[13],ymm0[14],ymm1[14],ymm0[15],ymm1[15]
; CHECK-NEXT: vpaddw %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc3]		; CHECK-NEXT: vpaddw %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.punpckhw.d.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.punpckhw.d.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.punpckhw.d.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.punpckhw.d.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

define <8 x i16> @test_mask_add_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {		define <8 x i16> @test_mask_add_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {
; CHECK-LABEL: test_mask_add_epi16_rr_128:		; CHECK-LABEL: test_mask_add_epi16_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc1]		; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_add_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_add_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi16_rrk_128:		; CHECK-LABEL: test_mask_add_epi16_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xfd,0xd1]		; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xfd,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_add_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {		define <8 x i16> @test_mask_add_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi16_rrkz_128:		; CHECK-LABEL: test_mask_add_epi16_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xfd,0xc1]		; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_add_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {		define <8 x i16> @test_mask_add_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_add_epi16_rm_128:		; CHECK-LABEL: test_mask_add_epi16_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddw (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0x07]		; CHECK-NEXT: vpaddw (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_add_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_add_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi16_rmk_128:		; CHECK-LABEL: test_mask_add_epi16_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xfd,0x0f]		; CHECK-NEXT: vpaddw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xfd,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_add_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {		define <8 x i16> @test_mask_add_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi16_rmkz_128:		; CHECK-LABEL: test_mask_add_epi16_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xfd,0x07]		; CHECK-NEXT: vpaddw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xfd,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

declare <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.padd.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <16 x i16> @test_mask_add_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {		define <16 x i16> @test_mask_add_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {
; CHECK-LABEL: test_mask_add_epi16_rr_256:		; CHECK-LABEL: test_mask_add_epi16_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc1]		; CHECK-NEXT: vpaddw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_add_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_add_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_add_epi16_rrk_256:		; CHECK-LABEL: test_mask_add_epi16_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xfd,0xd1]		; CHECK-NEXT: vpaddw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xfd,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_add_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {		define <16 x i16> @test_mask_add_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {
; CHECK-LABEL: test_mask_add_epi16_rrkz_256:		; CHECK-LABEL: test_mask_add_epi16_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xfd,0xc1]		; CHECK-NEXT: vpaddw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_add_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {		define <16 x i16> @test_mask_add_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_add_epi16_rm_256:		; CHECK-LABEL: test_mask_add_epi16_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddw (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0x07]		; CHECK-NEXT: vpaddw (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_add_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_add_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_add_epi16_rmk_256:		; CHECK-LABEL: test_mask_add_epi16_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xfd,0x0f]		; CHECK-NEXT: vpaddw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xfd,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_add_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {		define <16 x i16> @test_mask_add_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_add_epi16_rmkz_256:		; CHECK-LABEL: test_mask_add_epi16_rmkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddw (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xfd,0x07]		; CHECK-NEXT: vpaddw (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xfd,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

declare <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.padd.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <8 x i16> @test_mask_sub_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {		define <8 x i16> @test_mask_sub_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {
; CHECK-LABEL: test_mask_sub_epi16_rr_128:		; CHECK-LABEL: test_mask_sub_epi16_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xf9,0xc1]		; CHECK-NEXT: vpsubw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf9,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_sub_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_sub_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi16_rrk_128:		; CHECK-LABEL: test_mask_sub_epi16_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xf9,0xd1]		; CHECK-NEXT: vpsubw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xf9,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_sub_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {		define <8 x i16> @test_mask_sub_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi16_rrkz_128:		; CHECK-LABEL: test_mask_sub_epi16_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xf9,0xc1]		; CHECK-NEXT: vpsubw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xf9,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_sub_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {		define <8 x i16> @test_mask_sub_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_sub_epi16_rm_128:		; CHECK-LABEL: test_mask_sub_epi16_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubw (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xf9,0x07]		; CHECK-NEXT: vpsubw (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf9,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_sub_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_sub_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi16_rmk_128:		; CHECK-LABEL: test_mask_sub_epi16_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xf9,0x0f]		; CHECK-NEXT: vpsubw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xf9,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_sub_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {		define <8 x i16> @test_mask_sub_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi16_rmkz_128:		; CHECK-LABEL: test_mask_sub_epi16_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xf9,0x07]		; CHECK-NEXT: vpsubw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xf9,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

declare <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.psub.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <16 x i16> @test_mask_sub_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {		define <16 x i16> @test_mask_sub_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {
; CHECK-LABEL: test_mask_sub_epi16_rr_256:		; CHECK-LABEL: test_mask_sub_epi16_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xf9,0xc1]		; CHECK-NEXT: vpsubw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf9,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psub.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.psub.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_sub_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_sub_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_sub_epi16_rrk_256:		; CHECK-LABEL: test_mask_sub_epi16_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xf9,0xd1]		; CHECK-NEXT: vpsubw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xf9,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psub.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.psub.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_sub_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {		define <16 x i16> @test_mask_sub_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {
; CHECK-LABEL: test_mask_sub_epi16_rrkz_256:		; CHECK-LABEL: test_mask_sub_epi16_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xf9,0xc1]		; CHECK-NEXT: vpsubw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xf9,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psub.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.psub.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_sub_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {		define <16 x i16> @test_mask_sub_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_sub_epi16_rm_256:		; CHECK-LABEL: test_mask_sub_epi16_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubw (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xf9,0x07]		; CHECK-NEXT: vpsubw (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf9,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.psub.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.psub.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_sub_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_sub_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_sub_epi16_rmk_256:		; CHECK-LABEL: test_mask_sub_epi16_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xf9,0x0f]		; CHECK-NEXT: vpsubw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xf9,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.psub.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.psub.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_sub_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {		define <16 x i16> @test_mask_sub_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_sub_epi16_rmkz_256:		; CHECK-LABEL: test_mask_sub_epi16_rmkz_256:
[...]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <32 x i16> %res		ret <32 x i16> %res
}		}

declare <32 x i16> @llvm.x86.avx512.mask.pmull.w.512(<32 x i16>, <32 x i16>, <32 x i16>, i32)		declare <32 x i16> @llvm.x86.avx512.mask.pmull.w.512(<32 x i16>, <32 x i16>, <32 x i16>, i32)

define <8 x i16> @test_mask_mullo_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {		define <8 x i16> @test_mask_mullo_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {
; CHECK-LABEL: test_mask_mullo_epi16_rr_128:		; CHECK-LABEL: test_mask_mullo_epi16_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmullw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd5,0xc1]		; CHECK-NEXT: vpmullw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd5,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_mullo_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_mullo_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi16_rrk_128:		; CHECK-LABEL: test_mask_mullo_epi16_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmullw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd5,0xd1]		; CHECK-NEXT: vpmullw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd5,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_mullo_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {		define <8 x i16> @test_mask_mullo_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi16_rrkz_128:		; CHECK-LABEL: test_mask_mullo_epi16_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmullw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd5,0xc1]		; CHECK-NEXT: vpmullw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd5,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_mullo_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {		define <8 x i16> @test_mask_mullo_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_mullo_epi16_rm_128:		; CHECK-LABEL: test_mask_mullo_epi16_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmullw (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd5,0x07]		; CHECK-NEXT: vpmullw (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd5,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_mullo_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_mullo_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi16_rmk_128:		; CHECK-LABEL: test_mask_mullo_epi16_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpmullw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd5,0x0f]		; CHECK-NEXT: vpmullw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd5,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_mullo_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {		define <8 x i16> @test_mask_mullo_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi16_rmkz_128:		; CHECK-LABEL: test_mask_mullo_epi16_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpmullw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd5,0x07]		; CHECK-NEXT: vpmullw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd5,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

declare <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pmull.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <16 x i16> @test_mask_mullo_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {		define <16 x i16> @test_mask_mullo_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {
; CHECK-LABEL: test_mask_mullo_epi16_rr_256:		; CHECK-LABEL: test_mask_mullo_epi16_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmullw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xd5,0xc1]		; CHECK-NEXT: vpmullw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd5,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pmull.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmull.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_mullo_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_mullo_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_mullo_epi16_rrk_256:		; CHECK-LABEL: test_mask_mullo_epi16_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmullw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd5,0xd1]		; CHECK-NEXT: vpmullw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd5,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pmull.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmull.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_mullo_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {		define <16 x i16> @test_mask_mullo_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {
; CHECK-LABEL: test_mask_mullo_epi16_rrkz_256:		; CHECK-LABEL: test_mask_mullo_epi16_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmullw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xd5,0xc1]		; CHECK-NEXT: vpmullw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xd5,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pmull.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmull.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_mullo_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {		define <16 x i16> @test_mask_mullo_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_mullo_epi16_rm_256:		; CHECK-LABEL: test_mask_mullo_epi16_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmullw (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xd5,0x07]		; CHECK-NEXT: vpmullw (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd5,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.pmull.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmull.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_mullo_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_mullo_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_mullo_epi16_rmk_256:		; CHECK-LABEL: test_mask_mullo_epi16_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpmullw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd5,0x0f]		; CHECK-NEXT: vpmullw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd5,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.pmull.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmull.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_mullo_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {		define <16 x i16> @test_mask_mullo_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_mullo_epi16_rmkz_256:		; CHECK-LABEL: test_mask_mullo_epi16_rmkz_256:
[...]
declare <16 x i8> @llvm.x86.avx512.mask.pmaxs.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.pmaxs.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_mask_pmaxs_b_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %mask) {		define <16 x i8>@test_int_x86_avx512_mask_pmaxs_b_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_b_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_b_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxsb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x3c,0xd1]		; CHECK-NEXT: vpmaxsb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x3c,0xd1]
; CHECK-NEXT: vpmaxsb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x3c,0xc1]		; CHECK-NEXT: vpmaxsb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x3c,0xc1]
; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc0]		; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.pmaxs.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2 ,i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.pmaxs.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2 ,i16 %mask)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmaxs.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> zeroinitializer, i16 %mask)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmaxs.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> zeroinitializer, i16 %mask)
%res2 = add <16 x i8> %res, %res1		%res2 = add <16 x i8> %res, %res1
ret <16 x i8> %res2		ret <16 x i8> %res2
}		}

declare <32 x i8> @llvm.x86.avx512.mask.pmaxs.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.pmaxs.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

define <32 x i8>@test_int_x86_avx512_mask_pmaxs_b_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {		define <32 x i8>@test_int_x86_avx512_mask_pmaxs_b_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_b_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_b_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmaxsb %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0x7d,0x28,0x3c,0xd9]		; CHECK-NEXT: vpmaxsb %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x3c,0xd9]
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpmaxsb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x3c,0xd1]		; CHECK-NEXT: vpmaxsb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x3c,0xd1]
; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc3]		; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.pmaxs.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)		%res = call <32 x i8> @llvm.x86.avx512.mask.pmaxs.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)
%res1 = call <32 x i8> @llvm.x86.avx512.mask.pmaxs.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.pmaxs.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)
%res2 = add <32 x i8> %res, %res1		%res2 = add <32 x i8> %res, %res1
ret <32 x i8> %res2		ret <32 x i8> %res2
}		}
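
Note on the test above: `vpmaxsb` shrinks from 6 to 5 bytes (VEX prefix starting with `0xc4`), while `vpaddb` shrinks from 6 to 4 bytes (VEX prefix starting with `0xc5`). The difference follows from the x86 encoding rules: the 2-byte `0xc5` form can only express the `0F` opcode map with no X/B register extension and W=0, and `vpmaxsb` lives in the `0F38` map. The sketch below illustrates that rule for register-to-register forms only. It is an assumption-laden illustration of the ISA encoding, not the logic of the pass in this patch, and `vex_prefix_len` is a hypothetical name:

```python
EVEX_PREFIX_LEN = 4  # 0x62 followed by the P0, P1 and P2 payload bytes


def vex_prefix_len(evex):
    """Length of the VEX prefix that could replace this EVEX prefix (2 or 3)."""
    p0, p1 = evex[1], evex[2]
    opcode_map = p0 & 0b11   # 1 = 0F, 2 = 0F38, 3 = 0F3A
    x_bar = (p0 >> 6) & 1    # stored inverted: 1 means "no extension"
    b_bar = (p0 >> 5) & 1    # stored inverted: 1 means "no extension"
    w = (p1 >> 7) & 1
    if opcode_map == 1 and x_bar == 1 and b_bar == 1 and w == 0:
        return 2             # 0xc5 form
    return 3                 # 0xc4 form


def compressed_len(evex):
    """Total length after swapping the prefix (register-register forms only)."""
    return len(evex) - EVEX_PREFIX_LEN + vex_prefix_len(evex)


vpmaxsb_ymm = [0x62, 0xF2, 0x7D, 0x28, 0x3C, 0xD9]  # old check, from the diff
vpaddb_ymm = [0x62, 0xF1, 0x6D, 0x28, 0xFC, 0xC3]   # old check, from the diff

print(compressed_len(vpmaxsb_ymm), compressed_len(vpaddb_ymm))
```

Memory operands are deliberately left out: EVEX uses a scaled 8-bit displacement, so the displacement byte itself can change, as in the `vmovss` example in the summary.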

declare <8 x i16> @llvm.x86.avx512.mask.pmaxs.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pmaxs.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pmaxs_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_pmaxs_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmaxsw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0xee,0xd9]		; CHECK-NEXT: vpmaxsw %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xee,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxsw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xee,0xd1]		; CHECK-NEXT: vpmaxsw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xee,0xd1]
; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc3]		; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pmaxs.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmaxs.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmaxs.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmaxs.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pmaxs.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pmaxs.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pmaxs_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask) {		define <16 x i16>@test_int_x86_avx512_mask_pmaxs_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxsw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xee,0xd1]		; CHECK-NEXT: vpmaxsw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xee,0xd1]
; CHECK-NEXT: vpmaxsw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xee,0xc1]		; CHECK-NEXT: vpmaxsw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xee,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pmaxs.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmaxs.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmaxs.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %mask)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmaxs.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %mask)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}
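
Note on the test above: the two masked `vpmaxsw` lines are identical on both sides of the diff, and only the unmasked `vpaddw` is compressed. That is consistent with the summary, which limits the optimization to instructions that do not use the mask `k` registers; VEX has no field for a write mask or for zeroing. The sketch below pulls those fields out of the fourth EVEX byte. The field layout comes from the x86 EVEX format, and `evex_p2_fields` is a hypothetical helper, not code from the patch:

```python
def evex_p2_fields(encoding):
    """Decode the mask-related fields of the fourth EVEX byte (P2)."""
    p2 = encoding[3]
    return {
        "z": (p2 >> 7) & 1,                 # 1 = zeroing-masking ({z})
        "vector_length": (p2 >> 5) & 0b11,  # 0 = 128-bit, 1 = 256-bit, 2 = 512-bit
        "mask_reg": p2 & 0b111,             # 0 = no write mask, 1 = %k1, ...
    }


# Encodings copied from the CHECK lines of the test above.
masked_zeroing = [0x62, 0xF1, 0x7D, 0xA9, 0xEE, 0xC1]  # vpmaxsw ... {%k1} {z}
unmasked = [0x62, 0xF1, 0x6D, 0x28, 0xFD, 0xC0]        # vpaddw, old EVEX form

print(evex_p2_fields(masked_zeroing))
print(evex_p2_fields(unmasked))
```

A nonzero `mask_reg` or a set `z` bit rules out a VEX equivalent, which is why those check lines keep the `0x62` prefix.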

declare <16 x i8> @llvm.x86.avx512.mask.pmaxu.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.pmaxu.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_mask_pmaxu_b_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2,i16 %mask) {		define <16 x i8>@test_int_x86_avx512_mask_pmaxu_b_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2,i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_b_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_b_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxub %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xde,0xd1]		; CHECK-NEXT: vpmaxub %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xde,0xd1]
; CHECK-NEXT: vpmaxub %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xde,0xc1]		; CHECK-NEXT: vpmaxub %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xde,0xc1]
; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc0]		; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.pmaxu.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.pmaxu.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %mask)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmaxu.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> zeroinitializer, i16 %mask)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmaxu.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> zeroinitializer, i16 %mask)
%res2 = add <16 x i8> %res, %res1		%res2 = add <16 x i8> %res, %res1
ret <16 x i8> %res2		ret <16 x i8> %res2
}		}

declare <32 x i8> @llvm.x86.avx512.mask.pmaxu.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.pmaxu.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

define <32 x i8>@test_int_x86_avx512_mask_pmaxu_b_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {		define <32 x i8>@test_int_x86_avx512_mask_pmaxu_b_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_b_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_b_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmaxub %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0xde,0xd9]		; CHECK-NEXT: vpmaxub %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xde,0xd9]
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpmaxub %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xde,0xd1]		; CHECK-NEXT: vpmaxub %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xde,0xd1]
; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc3]		; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.pmaxu.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)		%res = call <32 x i8> @llvm.x86.avx512.mask.pmaxu.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)
%res1 = call <32 x i8> @llvm.x86.avx512.mask.pmaxu.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.pmaxu.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)
%res2 = add <32 x i8> %res, %res1		%res2 = add <32 x i8> %res, %res1
ret <32 x i8> %res2		ret <32 x i8> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.pmaxu.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pmaxu.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pmaxu_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_pmaxu_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmaxuw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0x7d,0x08,0x3e,0xd9]		; CHECK-NEXT: vpmaxuw %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3e,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxuw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x3e,0xd1]		; CHECK-NEXT: vpmaxuw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x3e,0xd1]
; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc3]		; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pmaxu.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmaxu.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmaxu.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmaxu.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pmaxu.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pmaxu.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pmaxu_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask) {		define <16 x i16>@test_int_x86_avx512_mask_pmaxu_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxuw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x3e,0xd1]		; CHECK-NEXT: vpmaxuw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x3e,0xd1]
; CHECK-NEXT: vpmaxuw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x3e,0xc1]		; CHECK-NEXT: vpmaxuw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x3e,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pmaxu.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmaxu.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmaxu.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %mask)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmaxu.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %mask)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <16 x i8> @llvm.x86.avx512.mask.pmins.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.pmins.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_mask_pmins_b_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %mask) {		define <16 x i8>@test_int_x86_avx512_mask_pmins_b_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmins_b_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmins_b_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminsb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x38,0xd1]		; CHECK-NEXT: vpminsb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x38,0xd1]
; CHECK-NEXT: vpminsb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x38,0xc1]		; CHECK-NEXT: vpminsb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x38,0xc1]
; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc0]		; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.pmins.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.pmins.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %mask)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmins.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> zeroinitializer, i16 %mask)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmins.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> zeroinitializer, i16 %mask)
%res2 = add <16 x i8> %res, %res1		%res2 = add <16 x i8> %res, %res1
ret <16 x i8> %res2		ret <16 x i8> %res2
}		}

declare <32 x i8> @llvm.x86.avx512.mask.pmins.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.pmins.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

define <32 x i8>@test_int_x86_avx512_mask_pmins_b_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {		define <32 x i8>@test_int_x86_avx512_mask_pmins_b_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmins_b_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmins_b_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpminsb %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0x7d,0x28,0x38,0xd9]		; CHECK-NEXT: vpminsb %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x38,0xd9]
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpminsb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x38,0xd1]		; CHECK-NEXT: vpminsb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x38,0xd1]
; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc3]		; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.pmins.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)		%res = call <32 x i8> @llvm.x86.avx512.mask.pmins.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)
%res1 = call <32 x i8> @llvm.x86.avx512.mask.pmins.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.pmins.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)
%res2 = add <32 x i8> %res, %res1		%res2 = add <32 x i8> %res, %res1
ret <32 x i8> %res2		ret <32 x i8> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.pmins.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pmins.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pmins_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_pmins_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmins_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmins_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpminsw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0xea,0xd9]		; CHECK-NEXT: vpminsw %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xea,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminsw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xea,0xd1]		; CHECK-NEXT: vpminsw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xea,0xd1]
; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc3]		; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pmins.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmins.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmins.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmins.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pmins.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pmins.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pmins_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask) {		define <16 x i16>@test_int_x86_avx512_mask_pmins_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmins_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmins_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminsw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xea,0xd1]		; CHECK-NEXT: vpminsw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xea,0xd1]
; CHECK-NEXT: vpminsw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xea,0xc1]		; CHECK-NEXT: vpminsw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xea,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pmins.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmins.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmins.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %mask)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmins.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %mask)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <16 x i8> @llvm.x86.avx512.mask.pminu.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.pminu.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_mask_pminu_b_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %mask) {		define <16 x i8>@test_int_x86_avx512_mask_pminu_b_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pminu_b_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pminu_b_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminub %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xda,0xd1]		; CHECK-NEXT: vpminub %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xda,0xd1]
; CHECK-NEXT: vpminub %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xda,0xc1]		; CHECK-NEXT: vpminub %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xda,0xc1]
; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc0]		; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.pminu.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.pminu.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %mask)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pminu.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> zeroinitializer, i16 %mask)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pminu.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> zeroinitializer, i16 %mask)
%res2 = add <16 x i8> %res, %res1		%res2 = add <16 x i8> %res, %res1
ret <16 x i8> %res2		ret <16 x i8> %res2
}		}

declare <32 x i8> @llvm.x86.avx512.mask.pminu.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.pminu.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

define <32 x i8>@test_int_x86_avx512_mask_pminu_b_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {		define <32 x i8>@test_int_x86_avx512_mask_pminu_b_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pminu_b_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pminu_b_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpminub %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0xda,0xd9]		; CHECK-NEXT: vpminub %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xda,0xd9]
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpminub %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xda,0xd1]		; CHECK-NEXT: vpminub %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xda,0xd1]
; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc3]		; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.pminu.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)		%res = call <32 x i8> @llvm.x86.avx512.mask.pminu.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)
%res1 = call <32 x i8> @llvm.x86.avx512.mask.pminu.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.pminu.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)
%res2 = add <32 x i8> %res, %res1		%res2 = add <32 x i8> %res, %res1
ret <32 x i8> %res2		ret <32 x i8> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.pminu.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pminu.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pminu_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_pminu_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pminu_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pminu_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpminuw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0x7d,0x08,0x3a,0xd9]		; CHECK-NEXT: vpminuw %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3a,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminuw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x3a,0xd1]		; CHECK-NEXT: vpminuw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x3a,0xd1]
; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc3]		; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pminu.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.pminu.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pminu.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pminu.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pminu.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pminu.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pminu_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask) {		define <16 x i16>@test_int_x86_avx512_mask_pminu_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pminu_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pminu_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminuw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x3a,0xd1]		; CHECK-NEXT: vpminuw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x3a,0xd1]
; CHECK-NEXT: vpminuw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x3a,0xc1]		; CHECK-NEXT: vpminuw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x3a,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pminu.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.pminu.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %mask)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pminu.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %mask)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pminu.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %mask)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.psrl.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.psrl.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_psrl_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_psrl_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrl_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psrl_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0xd1,0xd9]		; CHECK-NEXT: vpsrlw %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd1,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrlw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd1,0xd1]		; CHECK-NEXT: vpsrlw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd1,0xd1]
; CHECK-NEXT: vpsrlw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd1,0xc1]		; CHECK-NEXT: vpsrlw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd1,0xc1]
; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xcb]		; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xcb]
; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc1]		; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psrl.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.psrl.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.psrl.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.psrl.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.psrl.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.psrl.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res2, %res3		%res4 = add <8 x i16> %res2, %res3
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.psrl.w.256(<16 x i16>, <8 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.psrl.w.256(<16 x i16>, <8 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_psrl_w_256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_psrl_w_256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrl_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psrl_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlw %xmm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0xd1,0xd9]		; CHECK-NEXT: vpsrlw %xmm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd1,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrlw %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd1,0xd1]		; CHECK-NEXT: vpsrlw %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd1,0xd1]
; CHECK-NEXT: vpsrlw %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xd1,0xc1]		; CHECK-NEXT: vpsrlw %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xd1,0xc1]
; CHECK-NEXT: vpaddw %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xcb]		; CHECK-NEXT: vpaddw %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xcb]
; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psrl.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.psrl.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.psrl.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.psrl.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 -1)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.psrl.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.psrl.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.mask.psra.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.psra.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_psra_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_psra_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psra_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psra_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsraw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0xe1,0xd9]		; CHECK-NEXT: vpsraw %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe1,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsraw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe1,0xd1]		; CHECK-NEXT: vpsraw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe1,0xd1]
; CHECK-NEXT: vpsraw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xe1,0xc1]		; CHECK-NEXT: vpsraw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xe1,0xc1]
; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc0]
; CHECK-NEXT: vpaddw %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc3]		; CHECK-NEXT: vpaddw %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psra.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.psra.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.psra.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.psra.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.psra.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.psra.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.psra.w.256(<16 x i16>, <8 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.psra.w.256(<16 x i16>, <8 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_psra_w_256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_psra_w_256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psra_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psra_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsraw %xmm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0xe1,0xd9]		; CHECK-NEXT: vpsraw %xmm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe1,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsraw %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe1,0xd1]		; CHECK-NEXT: vpsraw %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe1,0xd1]
; CHECK-NEXT: vpsraw %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xe1,0xc1]		; CHECK-NEXT: vpsraw %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xe1,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: vpaddw %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc3]		; CHECK-NEXT: vpaddw %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psra.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.psra.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.psra.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.psra.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.psra.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.psra.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.mask.psll.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.psll.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_psll_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_psll_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psll_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psll_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsllw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0xf1,0xd9]		; CHECK-NEXT: vpsllw %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf1,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsllw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xf1,0xd1]		; CHECK-NEXT: vpsllw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xf1,0xd1]
; CHECK-NEXT: vpsllw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xf1,0xc1]		; CHECK-NEXT: vpsllw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xf1,0xc1]
; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc0]
; CHECK-NEXT: vpaddw %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc3]		; CHECK-NEXT: vpaddw %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psll.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.psll.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.psll.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.psll.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.psll.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.psll.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.psll.w.256(<16 x i16>, <8 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.psll.w.256(<16 x i16>, <8 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_psll_w_256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_psll_w_256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psll_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psll_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsllw %xmm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0xf1,0xd9]		; CHECK-NEXT: vpsllw %xmm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf1,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsllw %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xf1,0xd1]		; CHECK-NEXT: vpsllw %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xf1,0xd1]
; CHECK-NEXT: vpsllw %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xf1,0xc1]		; CHECK-NEXT: vpsllw %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xf1,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: vpaddw %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc3]		; CHECK-NEXT: vpaddw %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psll.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.psll.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.psll.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.psll.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.psll.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.psll.w.256(<16 x i16> %x0, <8 x i16> %x1, <16 x i16> %x2, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.mask.psrl.wi.128(<8 x i16>, i32, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.psrl.wi.128(<8 x i16>, i32, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_psrl_wi_128(<8 x i16> %x0, i32 %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_psrl_wi_128(<8 x i16> %x0, i32 %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrl_wi_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psrl_wi_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlw $3, %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0x71,0xd0,0x03]		; CHECK-NEXT: vpsrlw $3, %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x71,0xd0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsrlw $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x71,0xd0,0x03]		; CHECK-NEXT: vpsrlw $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x71,0xd0,0x03]
; CHECK-NEXT: vpsrlw $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x71,0xd0,0x03]		; CHECK-NEXT: vpsrlw $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x71,0xd0,0x03]
; CHECK-NEXT: vpaddw %xmm2, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xfd,0xca]		; CHECK-NEXT: vpaddw %xmm2, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfd,0xca]
; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc1]		; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psrl.wi.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.psrl.wi.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.psrl.wi.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.psrl.wi.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 -1)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.psrl.wi.128(<8 x i16> %x0, i32 3, <8 x i16> zeroinitializer, i8 %x3)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.psrl.wi.128(<8 x i16> %x0, i32 3, <8 x i16> zeroinitializer, i8 %x3)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res2, %res3		%res4 = add <8 x i16> %res2, %res3
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.psrl.wi.256(<16 x i16>, i32, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.psrl.wi.256(<16 x i16>, i32, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_psrl_wi_256(<16 x i16> %x0, i32 %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_psrl_wi_256(<16 x i16> %x0, i32 %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrl_wi_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psrl_wi_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlw $3, %ymm0, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0x71,0xd0,0x03]		; CHECK-NEXT: vpsrlw $3, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x71,0xd0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsrlw $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x71,0xd0,0x03]		; CHECK-NEXT: vpsrlw $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x71,0xd0,0x03]
; CHECK-NEXT: vpsrlw $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x71,0xd0,0x03]		; CHECK-NEXT: vpsrlw $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x71,0xd0,0x03]
; CHECK-NEXT: vpaddw %ymm2, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xca]		; CHECK-NEXT: vpaddw %ymm2, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xca]
; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psrl.wi.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.psrl.wi.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.psrl.wi.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.psrl.wi.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 -1)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.psrl.wi.256(<16 x i16> %x0, i32 3, <16 x i16> zeroinitializer, i16 %x3)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.psrl.wi.256(<16 x i16> %x0, i32 3, <16 x i16> zeroinitializer, i16 %x3)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.mask.psra.wi.128(<8 x i16>, i32, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.psra.wi.128(<8 x i16>, i32, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_psra_wi_128(<8 x i16> %x0, i32 %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_psra_wi_128(<8 x i16> %x0, i32 %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psra_wi_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psra_wi_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsraw $3, %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0x71,0xe0,0x03]		; CHECK-NEXT: vpsraw $3, %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x71,0xe0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsraw $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x71,0xe0,0x03]		; CHECK-NEXT: vpsraw $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x71,0xe0,0x03]
; CHECK-NEXT: vpsraw $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x71,0xe0,0x03]		; CHECK-NEXT: vpsraw $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x71,0xe0,0x03]
; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfd,0xc0]
; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc2]		; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psra.wi.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.psra.wi.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.psra.wi.128(<8 x i16> %x0, i32 3, <8 x i16> zeroinitializer, i8 %x3)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.psra.wi.128(<8 x i16> %x0, i32 3, <8 x i16> zeroinitializer, i8 %x3)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.psra.wi.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.psra.wi.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.psra.wi.256(<16 x i16>, i32, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.psra.wi.256(<16 x i16>, i32, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_psra_wi_256(<16 x i16> %x0, i32 %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_psra_wi_256(<16 x i16> %x0, i32 %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psra_wi_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psra_wi_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsraw $3, %ymm0, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0x71,0xe0,0x03]		; CHECK-NEXT: vpsraw $3, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x71,0xe0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsraw $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x71,0xe0,0x03]		; CHECK-NEXT: vpsraw $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x71,0xe0,0x03]
; CHECK-NEXT: vpsraw $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x71,0xe0,0x03]		; CHECK-NEXT: vpsraw $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x71,0xe0,0x03]
; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xc0]
; CHECK-NEXT: vpaddw %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc2]		; CHECK-NEXT: vpaddw %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psra.wi.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.psra.wi.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.psra.wi.256(<16 x i16> %x0, i32 3, <16 x i16> zeroinitializer, i16 %x3)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.psra.wi.256(<16 x i16> %x0, i32 3, <16 x i16> zeroinitializer, i16 %x3)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.psra.wi.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.psra.wi.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.mask.psll.wi.128(<8 x i16>, i32, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.psll.wi.128(<8 x i16>, i32, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_psll_wi_128(<8 x i16> %x0, i32 %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_psll_wi_128(<8 x i16> %x0, i32 %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psll_wi_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psll_wi_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsllw $3, %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0x71,0xf0,0x03]		; CHECK-NEXT: vpsllw $3, %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x71,0xf0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsllw $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x71,0xf0,0x03]		; CHECK-NEXT: vpsllw $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x71,0xf0,0x03]
; CHECK-NEXT: vpsllw $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x71,0xf0,0x03]		; CHECK-NEXT: vpsllw $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x71,0xf0,0x03]
; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfd,0xc0]
; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc2]		; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psll.wi.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.psll.wi.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.psll.wi.128(<8 x i16> %x0, i32 3, <8 x i16> zeroinitializer, i8 %x3)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.psll.wi.128(<8 x i16> %x0, i32 3, <8 x i16> zeroinitializer, i8 %x3)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.psll.wi.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.psll.wi.128(<8 x i16> %x0, i32 3, <8 x i16> %x2, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.psll.wi.256(<16 x i16>, i32, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.psll.wi.256(<16 x i16>, i32, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_psll_wi_256(<16 x i16> %x0, i32 %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_psll_wi_256(<16 x i16> %x0, i32 %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psll_wi_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psll_wi_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsllw $3, %ymm0, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0x71,0xf0,0x03]		; CHECK-NEXT: vpsllw $3, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x71,0xf0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsllw $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x71,0xf0,0x03]		; CHECK-NEXT: vpsllw $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x71,0xf0,0x03]
; CHECK-NEXT: vpsllw $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x71,0xf0,0x03]		; CHECK-NEXT: vpsllw $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x71,0xf0,0x03]
; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xc0]
; CHECK-NEXT: vpaddw %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc2]		; CHECK-NEXT: vpaddw %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psll.wi.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.psll.wi.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.psll.wi.256(<16 x i16> %x0, i32 3, <16 x i16> zeroinitializer, i16 %x3)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.psll.wi.256(<16 x i16> %x0, i32 3, <16 x i16> zeroinitializer, i16 %x3)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.psll.wi.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.psll.wi.256(<16 x i16> %x0, i32 3, <16 x i16> %x2, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <16 x i8> @llvm.x86.avx512.mask.pshuf.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.pshuf.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_mask_pshuf_b_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {		define <16 x i8>@test_int_x86_avx512_mask_pshuf_b_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pshuf_b_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pshuf_b_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpshufb %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0x7d,0x08,0x00,0xd9]		; CHECK-NEXT: vpshufb %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x00,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpshufb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x00,0xd1]		; CHECK-NEXT: vpshufb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x00,0xd1]
; CHECK-NEXT: vpaddb %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc3]		; CHECK-NEXT: vpaddb %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.pshuf.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)		%res = call <16 x i8> @llvm.x86.avx512.mask.pshuf.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pshuf.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pshuf.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)
%res2 = add <16 x i8> %res, %res1		%res2 = add <16 x i8> %res, %res1
ret <16 x i8> %res2		ret <16 x i8> %res2
}		}

declare <32 x i8> @llvm.x86.avx512.mask.pshuf.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.pshuf.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

define <32 x i8>@test_int_x86_avx512_mask_pshuf_b_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {		define <32 x i8>@test_int_x86_avx512_mask_pshuf_b_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pshuf_b_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pshuf_b_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpshufb %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0x7d,0x28,0x00,0xd9]		; CHECK-NEXT: vpshufb %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x00,0xd9]
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpshufb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x00,0xd1]		; CHECK-NEXT: vpshufb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x00,0xd1]
; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc3]		; CHECK-NEXT: vpaddb %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.pshuf.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)		%res = call <32 x i8> @llvm.x86.avx512.mask.pshuf.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)
%res1 = call <32 x i8> @llvm.x86.avx512.mask.pshuf.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.pshuf.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)
%res2 = add <32 x i8> %res, %res1		%res2 = add <32 x i8> %res, %res1
ret <32 x i8> %res2		ret <32 x i8> %res2
}		}
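Note on the encodings in the checks above: the compressed bytes on the right-hand side can be derived from the EVEX bytes on the left by repacking the prefix fields. The following is a minimal Python sketch of that repacking, not the pass added by this patch (which, per the summary, works from a table of EVEX opcodes rather than on raw bytes). The function name `evex_to_vex` is hypothetical, and the sketch assumes register-to-register forms only (ModRM mod == 3), since memory operands with a compressed disp8 need their displacement rewritten. It also assumes the opcode has a VEX counterpart with the same W bit, which a real implementation would have to look up per instruction.

```python
def evex_to_vex(evex: bytes) -> bytes:
    """Repack a register-register EVEX encoding as VEX, or raise ValueError.

    Illustrative only: assumes the opcode has a VEX form with identical
    semantics, which a real implementation must check per instruction.
    """
    if len(evex) < 6 or evex[0] != 0x62:
        raise ValueError("not an EVEX-prefixed instruction")
    p0, p1, p2 = evex[1], evex[2], evex[3]
    tail = evex[4:]                      # opcode, ModRM, optional immediate

    # P0: R X B R' 0 0 m m   (R, X, B, R' are stored inverted)
    r, x, b, r_hi = (p0 >> 7) & 1, (p0 >> 6) & 1, (p0 >> 5) & 1, (p0 >> 4) & 1
    mm = p0 & 0x03
    # P1: W v v v v 1 p p    (vvvv is stored inverted)
    w, vvvv, pp = (p1 >> 7) & 1, (p1 >> 3) & 0x0F, p1 & 0x03
    # P2: z L' L b V' a a a  (V' is stored inverted)
    z, ll, bcast = (p2 >> 7) & 1, (p2 >> 5) & 0x03, (p2 >> 4) & 1
    v_hi, aaa = (p2 >> 3) & 1, p2 & 0x07

    if (tail[1] >> 6) != 3:
        raise ValueError("memory operand: disp8 scaling differs, not handled")
    if not (r_hi and v_hi and x):
        raise ValueError("uses a register in the 16-31 range")
    if z or bcast or aaa:
        raise ValueError("uses masking, zeroing, or broadcast/rounding")
    if ll > 1:
        raise ValueError("512-bit vector length has no VEX form")

    low = (vvvv << 3) | (ll << 2) | pp   # shared low 7 bits: vvvv L pp
    if b == 1 and w == 0 and mm == 1:
        # Two-byte VEX: C5 [R vvvv L pp]; only for the 0F opcode map.
        return bytes([0xC5, (r << 7) | low]) + tail
    # Three-byte VEX: C4 [R X B m-mmmm] [W vvvv L pp].
    return bytes([0xC4, (r << 7) | (x << 6) | (b << 5) | mm, (w << 7) | low]) + tail


if __name__ == "__main__":
    # vpsrlw %xmm1, %ymm0, %ymm3 from the psrl_w_256 test above.
    print(evex_to_vex(bytes([0x62, 0xF1, 0x7D, 0x28, 0xD1, 0xD9])).hex())
```

The masked forms in these tests (for example the `{%k1}` variants with a nonzero `aaa` field) are rejected by the sketch, which is consistent with their encodings being unchanged between the two sides of the diff.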

declare <8 x i16> @llvm.x86.avx512.mask.pmovzxb.w.128(<16 x i8>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pmovzxb.w.128(<16 x i8>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pmovzxb_w_128(<16 x i8> %x0, <8 x i16> %x1, i8 %x2) {		define <8 x i16>@test_int_x86_avx512_mask_pmovzxb_w_128(<16 x i8> %x0, <8 x i16> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxb_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxb_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovzxbw %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x30,0xd0]		; CHECK-NEXT: vpmovzxbw %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x30,0xd0]
; CHECK-NEXT: ## xmm2 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero		; CHECK-NEXT: ## xmm2 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovzxbw %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x30,0xc8]		; CHECK-NEXT: vpmovzxbw %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x30,0xc8]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero
; CHECK-NEXT: vpmovzxbw %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x30,0xc0]		; CHECK-NEXT: vpmovzxbw %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x30,0xc0]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero
; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfd,0xc0]
; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc2]		; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pmovzxb.w.128(<16 x i8> %x0, <8 x i16> %x1, i8 %x2)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmovzxb.w.128(<16 x i8> %x0, <8 x i16> %x1, i8 %x2)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmovzxb.w.128(<16 x i8> %x0, <8 x i16> zeroinitializer, i8 %x2)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmovzxb.w.128(<16 x i8> %x0, <8 x i16> zeroinitializer, i8 %x2)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.pmovzxb.w.128(<16 x i8> %x0, <8 x i16> %x1, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.pmovzxb.w.128(<16 x i8> %x0, <8 x i16> %x1, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}
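
The size effect of each rewritten CHECK line can be read off the two `encoding:` lists. The following is a minimal sketch, not part of the patch, that parses one old line and one new line from the test above and compares their lengths. The helper name `encoding_bytes` is hypothetical and only serves this illustration.

```python
import re


def encoding_bytes(check_line: str) -> list[int]:
    """Return the bytes listed after 'encoding:' in one CHECK line."""
    match = re.search(r"encoding: \[([^\]]*)\]", check_line)
    if match is None:
        raise ValueError("no encoding comment found")
    # Each token looks like '0x62'; int(..., 16) accepts the 0x prefix.
    return [int(tok, 16) for tok in match.group(1).split(",")]


# Old and new expectations for the same vpaddw, copied from the test above.
before = ("; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 "
          "## encoding: [0x62,0xf1,0x75,0x08,0xfd,0xc0]")
after = ("; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 "
         "## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfd,0xc0]")

saved = len(encoding_bytes(before)) - len(encoding_bytes(after))
print(saved)  # 2: this case fits the 2-byte VEX prefix
```

Instructions in the 0F38 opcode map, such as the `vpmovzxbw` above, need the 3-byte VEX prefix and therefore save only one byte (6 bytes down to 5).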

declare <16 x i16> @llvm.x86.avx512.mask.pmovzxb.w.256(<16 x i8>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pmovzxb.w.256(<16 x i8>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pmovzxb_w_256(<16 x i8> %x0, <16 x i16> %x1, i16 %x2) {		define <16 x i16>@test_int_x86_avx512_mask_pmovzxb_w_256(<16 x i8> %x0, <16 x i16> %x1, i16 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxb_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxb_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovzxbw %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x30,0xd0]		; CHECK-NEXT: vpmovzxbw %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x30,0xd0]
; CHECK-NEXT: ## ymm2 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero		; CHECK-NEXT: ## ymm2 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovzxbw %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x30,0xc8]		; CHECK-NEXT: vpmovzxbw %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x30,0xc8]
; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero		; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero
; CHECK-NEXT: vpmovzxbw %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x30,0xc0]		; CHECK-NEXT: vpmovzxbw %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x30,0xc0]
; CHECK-NEXT: ## ymm0 {%k1} {z} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero		; CHECK-NEXT: ## ymm0 {%k1} {z} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero,xmm0[8],zero,xmm0[9],zero,xmm0[10],zero,xmm0[11],zero,xmm0[12],zero,xmm0[13],zero,xmm0[14],zero,xmm0[15],zero
; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xc0]
; CHECK-NEXT: vpaddw %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc2]		; CHECK-NEXT: vpaddw %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pmovzxb.w.256(<16 x i8> %x0, <16 x i16> %x1, i16 %x2)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmovzxb.w.256(<16 x i8> %x0, <16 x i16> %x1, i16 %x2)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmovzxb.w.256(<16 x i8> %x0, <16 x i16> zeroinitializer, i16 %x2)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmovzxb.w.256(<16 x i8> %x0, <16 x i16> zeroinitializer, i16 %x2)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.pmovzxb.w.256(<16 x i8> %x0, <16 x i16> %x1, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.pmovzxb.w.256(<16 x i8> %x0, <16 x i16> %x1, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}


declare <8 x i16> @llvm.x86.avx512.mask.pmovsxb.w.128(<16 x i8>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pmovsxb.w.128(<16 x i8>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pmovsxb_w_128(<16 x i8> %x0, <8 x i16> %x1, i8 %x2) {		define <8 x i16>@test_int_x86_avx512_mask_pmovsxb_w_128(<16 x i8> %x0, <8 x i16> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxb_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxb_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovsxbw %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x20,0xd0]		; CHECK-NEXT: vpmovsxbw %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x20,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsxbw %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x20,0xc8]		; CHECK-NEXT: vpmovsxbw %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x20,0xc8]
; CHECK-NEXT: vpmovsxbw %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x20,0xc0]		; CHECK-NEXT: vpmovsxbw %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x20,0xc0]
; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfd,0xc0]
; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc2]		; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pmovsxb.w.128(<16 x i8> %x0, <8 x i16> %x1, i8 %x2)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmovsxb.w.128(<16 x i8> %x0, <8 x i16> %x1, i8 %x2)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmovsxb.w.128(<16 x i8> %x0, <8 x i16> zeroinitializer, i8 %x2)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmovsxb.w.128(<16 x i8> %x0, <8 x i16> zeroinitializer, i8 %x2)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.pmovsxb.w.128(<16 x i8> %x0, <8 x i16> %x1, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.pmovsxb.w.128(<16 x i8> %x0, <8 x i16> %x1, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pmovsxb.w.256(<16 x i8>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pmovsxb.w.256(<16 x i8>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pmovsxb_w_256(<16 x i8> %x0, <16 x i16> %x1, i16 %x2) {		define <16 x i16>@test_int_x86_avx512_mask_pmovsxb_w_256(<16 x i8> %x0, <16 x i16> %x1, i16 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxb_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxb_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovsxbw %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x20,0xd0]		; CHECK-NEXT: vpmovsxbw %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x20,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsxbw %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x20,0xc8]		; CHECK-NEXT: vpmovsxbw %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x20,0xc8]
; CHECK-NEXT: vpmovsxbw %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x20,0xc0]		; CHECK-NEXT: vpmovsxbw %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x20,0xc0]
; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xc0]
; CHECK-NEXT: vpaddw %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc2]		; CHECK-NEXT: vpaddw %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pmovsxb.w.256(<16 x i8> %x0, <16 x i16> %x1, i16 %x2)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmovsxb.w.256(<16 x i8> %x0, <16 x i16> %x1, i16 %x2)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmovsxb.w.256(<16 x i8> %x0, <16 x i16> zeroinitializer, i16 %x2)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmovsxb.w.256(<16 x i8> %x0, <16 x i16> zeroinitializer, i16 %x2)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.pmovsxb.w.256(<16 x i8> %x0, <16 x i16> %x1, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.pmovsxb.w.256(<16 x i8> %x0, <16 x i16> %x1, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pmovsxd.q.128(<4 x i32>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pmovsxd.q.128(<4 x i32>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pmovsxd_q_128(<4 x i32> %x0, <2 x i64> %x1, i8 %x2) {		define <2 x i64>@test_int_x86_avx512_mask_pmovsxd_q_128(<4 x i32> %x0, <2 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxd_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxd_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovsxdq %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x25,0xd0]		; CHECK-NEXT: vpmovsxdq %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x25,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsxdq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x25,0xc8]		; CHECK-NEXT: vpmovsxdq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x25,0xc8]
; CHECK-NEXT: vpmovsxdq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x25,0xc0]		; CHECK-NEXT: vpmovsxdq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x25,0xc0]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc2]		; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pmovsxd.q.128(<4 x i32> %x0, <2 x i64> %x1, i8 %x2)		%res = call <2 x i64> @llvm.x86.avx512.mask.pmovsxd.q.128(<4 x i32> %x0, <2 x i64> %x1, i8 %x2)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmovsxd.q.128(<4 x i32> %x0, <2 x i64> zeroinitializer, i8 %x2)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmovsxd.q.128(<4 x i32> %x0, <2 x i64> zeroinitializer, i8 %x2)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.pmovsxd.q.128(<4 x i32> %x0, <2 x i64> %x1, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.pmovsxd.q.128(<4 x i32> %x0, <2 x i64> %x1, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pmovsxd.q.256(<4 x i32>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pmovsxd.q.256(<4 x i32>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pmovsxd_q_256(<4 x i32> %x0, <4 x i64> %x1, i8 %x2) {		define <4 x i64>@test_int_x86_avx512_mask_pmovsxd_q_256(<4 x i32> %x0, <4 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxd_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxd_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovsxdq %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x25,0xd0]		; CHECK-NEXT: vpmovsxdq %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x25,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsxdq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x25,0xc8]		; CHECK-NEXT: vpmovsxdq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x25,0xc8]
; CHECK-NEXT: vpmovsxdq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x25,0xc0]		; CHECK-NEXT: vpmovsxdq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x25,0xc0]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc2]		; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pmovsxd.q.256(<4 x i32> %x0, <4 x i64> %x1, i8 %x2)		%res = call <4 x i64> @llvm.x86.avx512.mask.pmovsxd.q.256(<4 x i32> %x0, <4 x i64> %x1, i8 %x2)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmovsxd.q.256(<4 x i32> %x0, <4 x i64> zeroinitializer, i8 %x2)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmovsxd.q.256(<4 x i32> %x0, <4 x i64> zeroinitializer, i8 %x2)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.pmovsxd.q.256(<4 x i32> %x0, <4 x i64> %x1, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.pmovsxd.q.256(<4 x i32> %x0, <4 x i64> %x1, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}
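
The relation between the old and new byte sequences in the tests above can be reproduced at the bit level. The sketch below is an illustration only: the actual pass works on machine-instruction opcodes through a lookup table, not on raw bytes, and the function name `evex_to_vex` and its `vex_w_ignored` parameter are hypothetical. It assumes the original AVX-512 EVEX layout, handles only register-to-register forms (memory forms would also need the compressed disp8 rescaled), and does not check whether a VEX form of the opcode exists. That last check is why a table is needed: `vinserti32x4`, for example, passes the bit tests but keeps its EVEX encoding in this diff.

```python
def evex_to_vex(enc: bytes, vex_w_ignored: bool = False) -> bytes:
    """Rewrite the EVEX prefix of a register-only instruction as VEX.

    Returns the input unchanged when an EVEX-only feature is in use.
    Set vex_w_ignored for opcodes that are EVEX.W1 but VEX.WIG (e.g. vpaddq).
    """
    if len(enc) < 6 or enc[0] != 0x62:
        return enc
    p0, p1, p2, modrm = enc[1], enc[2], enc[3], enc[5]

    rxb = p0 >> 5              # inverted R, X, B bits
    r_hi = (p0 >> 4) & 1       # inverted R': 1 means reg index < 16
    opcode_map = p0 & 0x03     # 1 = 0F, 2 = 0F38, 3 = 0F3A
    w = p1 >> 7
    vvvv = (p1 >> 3) & 0x0F    # inverted second source register
    pp = p1 & 0x03
    z = p2 >> 7                # zeroing-masking
    vec_len = (p2 >> 5) & 0x03  # 0 = 128-bit, 1 = 256-bit, 2 = 512-bit
    bcast = (p2 >> 4) & 1      # broadcast / rounding control
    v_hi = (p2 >> 3) & 1       # inverted V': 1 means vvvv index < 16
    aaa = p2 & 0x07            # mask register, 0 = no masking

    register_form = (modrm >> 6) == 3
    x_clear = (rxb >> 1) & 1   # with mod == 3, X extends rm to xmm16-31
    if not (register_form and x_clear and r_hi and v_hi):
        return enc
    if z or bcast or aaa or vec_len > 1:
        return enc

    if vex_w_ignored:
        w = 0
    last = (vvvv << 3) | (vec_len << 2) | pp
    if opcode_map == 1 and w == 0 and (rxb & 0b011) == 0b011:
        prefix = bytes([0xC5, ((rxb >> 2) << 7) | last])       # 2-byte VEX
    else:
        prefix = bytes([0xC4, (rxb << 5) | opcode_map, (w << 7) | last])
    return prefix + enc[4:]


# vpaddw %xmm0, %xmm1, %xmm0 from the tests above
print(evex_to_vex(bytes.fromhex("62f17508fdc0")).hex())    # c5f1fdc0
# vpmovsxdq %xmm0, %ymm2 needs the 3-byte prefix (0F38 map)
print(evex_to_vex(bytes.fromhex("62f27d2825d0")).hex())    # c4e27d25d0
```

The masked forms such as `vpmovsxdq %xmm0, %ymm1 {%k1}` have a nonzero `aaa` field, so the sketch returns them unchanged, matching the lines in the diff that keep their `0x62` prefix.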

llvm/trunk/test/CodeGen/X86/avx512bwvl-intrinsics.ll

[16 lines of unchanged context not shown]
; CHECK-NEXT: vpcmpneqb %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3f,0xc1,0x04]		; CHECK-NEXT: vpcmpneqb %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3f,0xc1,0x04]
; CHECK-NEXT: kmovd %k0, %edi ## encoding: [0xc5,0xfb,0x93,0xf8]		; CHECK-NEXT: kmovd %k0, %edi ## encoding: [0xc5,0xfb,0x93,0xf8]
; CHECK-NEXT: vpcmpnltb %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3f,0xc1,0x05]		; CHECK-NEXT: vpcmpnltb %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3f,0xc1,0x05]
; CHECK-NEXT: kmovd %k0, %eax ## encoding: [0xc5,0xfb,0x93,0xc0]		; CHECK-NEXT: kmovd %k0, %eax ## encoding: [0xc5,0xfb,0x93,0xc0]
; CHECK-NEXT: vpcmpnleb %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3f,0xc1,0x06]		; CHECK-NEXT: vpcmpnleb %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3f,0xc1,0x06]
; CHECK-NEXT: kmovd %k0, %ecx ## encoding: [0xc5,0xfb,0x93,0xc8]		; CHECK-NEXT: kmovd %k0, %ecx ## encoding: [0xc5,0xfb,0x93,0xc8]
; CHECK-NEXT: vpcmpordb %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3f,0xc1,0x07]		; CHECK-NEXT: vpcmpordb %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3f,0xc1,0x07]
; CHECK-NEXT: kmovd %k0, %edx ## encoding: [0xc5,0xfb,0x93,0xd0]		; CHECK-NEXT: kmovd %k0, %edx ## encoding: [0xc5,0xfb,0x93,0xd0]
; CHECK-NEXT: vmovd %edi, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc7]		; CHECK-NEXT: vmovd %edi, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc7]
; CHECK-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc0,0x01]		; CHECK-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc0,0x01]
; CHECK-NEXT: vpinsrd $2, %ecx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc1,0x02]		; CHECK-NEXT: vpinsrd $2, %ecx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc1,0x02]
; CHECK-NEXT: vpinsrd $3, %edx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc2,0x03]		; CHECK-NEXT: vpinsrd $3, %edx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc2,0x03]
; CHECK-NEXT: vmovd %r8d, %xmm1 ## encoding: [0x62,0xd1,0x7d,0x08,0x6e,0xc8]		; CHECK-NEXT: vmovd %r8d, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xc1,0x79,0x6e,0xc8]
; CHECK-NEXT: vpinsrd $1, %r9d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xc9,0x01]		; CHECK-NEXT: vpinsrd $1, %r9d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xc9,0x01]
; CHECK-NEXT: vpinsrd $2, %r10d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xca,0x02]		; CHECK-NEXT: vpinsrd $2, %r10d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xca,0x02]
; CHECK-NEXT: vpinsrd $3, %esi, %xmm1, %xmm1 ## encoding: [0xc4,0xe3,0x71,0x22,0xce,0x03]		; CHECK-NEXT: vpinsrd $3, %esi, %xmm1, %xmm1 ## encoding: [0xc4,0xe3,0x71,0x22,0xce,0x03]
; CHECK-NEXT: vinserti32x4 $1, %xmm0, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0x75,0x28,0x38,0xc0,0x01]		; CHECK-NEXT: vinserti32x4 $1, %xmm0, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0x75,0x28,0x38,0xc0,0x01]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i32 @llvm.x86.avx512.mask.cmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 0, i32 -1)		%res0 = call i32 @llvm.x86.avx512.mask.cmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 0, i32 -1)
%vec0 = insertelement <8 x i32> undef, i32 %res0, i32 0		%vec0 = insertelement <8 x i32> undef, i32 %res0, i32 0
%res1 = call i32 @llvm.x86.avx512.mask.cmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 1, i32 -1)		%res1 = call i32 @llvm.x86.avx512.mask.cmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 1, i32 -1)
[28 lines of unchanged context not shown]
; CHECK-NEXT: vpcmpneqb %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3f,0xc1,0x04]		; CHECK-NEXT: vpcmpneqb %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3f,0xc1,0x04]
; CHECK-NEXT: kmovd %k0, %edi ## encoding: [0xc5,0xfb,0x93,0xf8]		; CHECK-NEXT: kmovd %k0, %edi ## encoding: [0xc5,0xfb,0x93,0xf8]
; CHECK-NEXT: vpcmpnltb %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3f,0xc1,0x05]		; CHECK-NEXT: vpcmpnltb %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3f,0xc1,0x05]
; CHECK-NEXT: kmovd %k0, %eax ## encoding: [0xc5,0xfb,0x93,0xc0]		; CHECK-NEXT: kmovd %k0, %eax ## encoding: [0xc5,0xfb,0x93,0xc0]
; CHECK-NEXT: vpcmpnleb %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3f,0xc1,0x06]		; CHECK-NEXT: vpcmpnleb %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3f,0xc1,0x06]
; CHECK-NEXT: kmovd %k0, %ecx ## encoding: [0xc5,0xfb,0x93,0xc8]		; CHECK-NEXT: kmovd %k0, %ecx ## encoding: [0xc5,0xfb,0x93,0xc8]
; CHECK-NEXT: vpcmpordb %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3f,0xc1,0x07]		; CHECK-NEXT: vpcmpordb %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3f,0xc1,0x07]
; CHECK-NEXT: kmovd %k0, %edx ## encoding: [0xc5,0xfb,0x93,0xd0]		; CHECK-NEXT: kmovd %k0, %edx ## encoding: [0xc5,0xfb,0x93,0xd0]
; CHECK-NEXT: vmovd %edi, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc7]		; CHECK-NEXT: vmovd %edi, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc7]
; CHECK-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc0,0x01]		; CHECK-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc0,0x01]
; CHECK-NEXT: vpinsrd $2, %ecx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc1,0x02]		; CHECK-NEXT: vpinsrd $2, %ecx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc1,0x02]
; CHECK-NEXT: vpinsrd $3, %edx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc2,0x03]		; CHECK-NEXT: vpinsrd $3, %edx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc2,0x03]
; CHECK-NEXT: vmovd %r8d, %xmm1 ## encoding: [0x62,0xd1,0x7d,0x08,0x6e,0xc8]		; CHECK-NEXT: vmovd %r8d, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xc1,0x79,0x6e,0xc8]
; CHECK-NEXT: vpinsrd $1, %r9d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xc9,0x01]		; CHECK-NEXT: vpinsrd $1, %r9d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xc9,0x01]
; CHECK-NEXT: vpinsrd $2, %r10d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xca,0x02]		; CHECK-NEXT: vpinsrd $2, %r10d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xca,0x02]
; CHECK-NEXT: vpinsrd $3, %esi, %xmm1, %xmm1 ## encoding: [0xc4,0xe3,0x71,0x22,0xce,0x03]		; CHECK-NEXT: vpinsrd $3, %esi, %xmm1, %xmm1 ## encoding: [0xc4,0xe3,0x71,0x22,0xce,0x03]
; CHECK-NEXT: vinserti32x4 $1, %xmm0, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0x75,0x28,0x38,0xc0,0x01]		; CHECK-NEXT: vinserti32x4 $1, %xmm0, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0x75,0x28,0x38,0xc0,0x01]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i32 @llvm.x86.avx512.mask.cmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 0, i32 %mask)		%res0 = call i32 @llvm.x86.avx512.mask.cmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 0, i32 %mask)
%vec0 = insertelement <8 x i32> undef, i32 %res0, i32 0		%vec0 = insertelement <8 x i32> undef, i32 %res0, i32 0
%res1 = call i32 @llvm.x86.avx512.mask.cmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 1, i32 %mask)		%res1 = call i32 @llvm.x86.avx512.mask.cmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 1, i32 %mask)
[29 lines of unchanged context not shown]
; CHECK-NEXT: vpcmpnequb %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3e,0xc1,0x04]		; CHECK-NEXT: vpcmpnequb %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3e,0xc1,0x04]
; CHECK-NEXT: kmovd %k0, %edi ## encoding: [0xc5,0xfb,0x93,0xf8]		; CHECK-NEXT: kmovd %k0, %edi ## encoding: [0xc5,0xfb,0x93,0xf8]
; CHECK-NEXT: vpcmpnltub %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3e,0xc1,0x05]		; CHECK-NEXT: vpcmpnltub %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3e,0xc1,0x05]
; CHECK-NEXT: kmovd %k0, %eax ## encoding: [0xc5,0xfb,0x93,0xc0]		; CHECK-NEXT: kmovd %k0, %eax ## encoding: [0xc5,0xfb,0x93,0xc0]
; CHECK-NEXT: vpcmpnleub %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3e,0xc1,0x06]		; CHECK-NEXT: vpcmpnleub %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3e,0xc1,0x06]
; CHECK-NEXT: kmovd %k0, %ecx ## encoding: [0xc5,0xfb,0x93,0xc8]		; CHECK-NEXT: kmovd %k0, %ecx ## encoding: [0xc5,0xfb,0x93,0xc8]
; CHECK-NEXT: vpcmpordub %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3e,0xc1,0x07]		; CHECK-NEXT: vpcmpordub %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x28,0x3e,0xc1,0x07]
; CHECK-NEXT: kmovd %k0, %edx ## encoding: [0xc5,0xfb,0x93,0xd0]		; CHECK-NEXT: kmovd %k0, %edx ## encoding: [0xc5,0xfb,0x93,0xd0]
; CHECK-NEXT: vmovd %edi, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc7]		; CHECK-NEXT: vmovd %edi, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc7]
; CHECK-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc0,0x01]		; CHECK-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc0,0x01]
; CHECK-NEXT: vpinsrd $2, %ecx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc1,0x02]		; CHECK-NEXT: vpinsrd $2, %ecx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc1,0x02]
; CHECK-NEXT: vpinsrd $3, %edx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc2,0x03]		; CHECK-NEXT: vpinsrd $3, %edx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc2,0x03]
; CHECK-NEXT: vmovd %r8d, %xmm1 ## encoding: [0x62,0xd1,0x7d,0x08,0x6e,0xc8]		; CHECK-NEXT: vmovd %r8d, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xc1,0x79,0x6e,0xc8]
; CHECK-NEXT: vpinsrd $1, %r9d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xc9,0x01]		; CHECK-NEXT: vpinsrd $1, %r9d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xc9,0x01]
; CHECK-NEXT: vpinsrd $2, %r10d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xca,0x02]		; CHECK-NEXT: vpinsrd $2, %r10d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xca,0x02]
; CHECK-NEXT: vpinsrd $3, %esi, %xmm1, %xmm1 ## encoding: [0xc4,0xe3,0x71,0x22,0xce,0x03]		; CHECK-NEXT: vpinsrd $3, %esi, %xmm1, %xmm1 ## encoding: [0xc4,0xe3,0x71,0x22,0xce,0x03]
; CHECK-NEXT: vinserti32x4 $1, %xmm0, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0x75,0x28,0x38,0xc0,0x01]		; CHECK-NEXT: vinserti32x4 $1, %xmm0, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0x75,0x28,0x38,0xc0,0x01]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i32 @llvm.x86.avx512.mask.ucmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 0, i32 -1)		%res0 = call i32 @llvm.x86.avx512.mask.ucmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 0, i32 -1)
%vec0 = insertelement <8 x i32> undef, i32 %res0, i32 0		%vec0 = insertelement <8 x i32> undef, i32 %res0, i32 0
%res1 = call i32 @llvm.x86.avx512.mask.ucmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 1, i32 -1)		%res1 = call i32 @llvm.x86.avx512.mask.ucmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 1, i32 -1)
[28 lines of unchanged context not shown]
; CHECK-NEXT: vpcmpnequb %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3e,0xc1,0x04]		; CHECK-NEXT: vpcmpnequb %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3e,0xc1,0x04]
; CHECK-NEXT: kmovd %k0, %edi ## encoding: [0xc5,0xfb,0x93,0xf8]		; CHECK-NEXT: kmovd %k0, %edi ## encoding: [0xc5,0xfb,0x93,0xf8]
; CHECK-NEXT: vpcmpnltub %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3e,0xc1,0x05]		; CHECK-NEXT: vpcmpnltub %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3e,0xc1,0x05]
; CHECK-NEXT: kmovd %k0, %eax ## encoding: [0xc5,0xfb,0x93,0xc0]		; CHECK-NEXT: kmovd %k0, %eax ## encoding: [0xc5,0xfb,0x93,0xc0]
; CHECK-NEXT: vpcmpnleub %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3e,0xc1,0x06]		; CHECK-NEXT: vpcmpnleub %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3e,0xc1,0x06]
; CHECK-NEXT: kmovd %k0, %ecx ## encoding: [0xc5,0xfb,0x93,0xc8]		; CHECK-NEXT: kmovd %k0, %ecx ## encoding: [0xc5,0xfb,0x93,0xc8]
; CHECK-NEXT: vpcmpordub %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3e,0xc1,0x07]		; CHECK-NEXT: vpcmpordub %ymm1, %ymm0, %k0 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x3e,0xc1,0x07]
; CHECK-NEXT: kmovd %k0, %edx ## encoding: [0xc5,0xfb,0x93,0xd0]		; CHECK-NEXT: kmovd %k0, %edx ## encoding: [0xc5,0xfb,0x93,0xd0]
; CHECK-NEXT: vmovd %edi, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc7]		; CHECK-NEXT: vmovd %edi, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc7]
; CHECK-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc0,0x01]		; CHECK-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc0,0x01]
; CHECK-NEXT: vpinsrd $2, %ecx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc1,0x02]		; CHECK-NEXT: vpinsrd $2, %ecx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc1,0x02]
; CHECK-NEXT: vpinsrd $3, %edx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc2,0x03]		; CHECK-NEXT: vpinsrd $3, %edx, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x22,0xc2,0x03]
; CHECK-NEXT: vmovd %r8d, %xmm1 ## encoding: [0x62,0xd1,0x7d,0x08,0x6e,0xc8]		; CHECK-NEXT: vmovd %r8d, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xc1,0x79,0x6e,0xc8]
; CHECK-NEXT: vpinsrd $1, %r9d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xc9,0x01]		; CHECK-NEXT: vpinsrd $1, %r9d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xc9,0x01]
; CHECK-NEXT: vpinsrd $2, %r10d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xca,0x02]		; CHECK-NEXT: vpinsrd $2, %r10d, %xmm1, %xmm1 ## encoding: [0xc4,0xc3,0x71,0x22,0xca,0x02]
; CHECK-NEXT: vpinsrd $3, %esi, %xmm1, %xmm1 ## encoding: [0xc4,0xe3,0x71,0x22,0xce,0x03]		; CHECK-NEXT: vpinsrd $3, %esi, %xmm1, %xmm1 ## encoding: [0xc4,0xe3,0x71,0x22,0xce,0x03]
; CHECK-NEXT: vinserti32x4 $1, %xmm0, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0x75,0x28,0x38,0xc0,0x01]		; CHECK-NEXT: vinserti32x4 $1, %xmm0, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0x75,0x28,0x38,0xc0,0x01]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i32 @llvm.x86.avx512.mask.ucmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 0, i32 %mask)		%res0 = call i32 @llvm.x86.avx512.mask.ucmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 0, i32 %mask)
%vec0 = insertelement <8 x i32> undef, i32 %res0, i32 0		%vec0 = insertelement <8 x i32> undef, i32 %res0, i32 0
%res1 = call i32 @llvm.x86.avx512.mask.ucmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 1, i32 %mask)		%res1 = call i32 @llvm.x86.avx512.mask.ucmp.b.256(<32 x i8> %a0, <32 x i8> %a1, i32 1, i32 %mask)
[23 lines of unchanged context not shown]
; CHECK-NEXT: vpcmplew %ymm1, %ymm0, %k5 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xe9,0x02]		; CHECK-NEXT: vpcmplew %ymm1, %ymm0, %k5 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xe9,0x02]
; CHECK-NEXT: vpcmpunordw %ymm1, %ymm0, %k6 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xf1,0x03]		; CHECK-NEXT: vpcmpunordw %ymm1, %ymm0, %k6 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xf1,0x03]
; CHECK-NEXT: vpcmpneqw %ymm1, %ymm0, %k7 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xf9,0x04]		; CHECK-NEXT: vpcmpneqw %ymm1, %ymm0, %k7 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xf9,0x04]
; CHECK-NEXT: vpcmpnltw %ymm1, %ymm0, %k2 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xd1,0x05]		; CHECK-NEXT: vpcmpnltw %ymm1, %ymm0, %k2 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xd1,0x05]
; CHECK-NEXT: vpcmpnlew %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xc9,0x06]		; CHECK-NEXT: vpcmpnlew %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xc9,0x06]
; CHECK-NEXT: vpcmpordw %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xc1,0x07]		; CHECK-NEXT: vpcmpordw %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xc1,0x07]
; CHECK-NEXT: kmovw %k4, %eax ## encoding: [0xc5,0xf8,0x93,0xc4]		; CHECK-NEXT: kmovw %k4, %eax ## encoding: [0xc5,0xf8,0x93,0xc4]
; CHECK-NEXT: kmovw %k3, %ecx ## encoding: [0xc5,0xf8,0x93,0xcb]		; CHECK-NEXT: kmovw %k3, %ecx ## encoding: [0xc5,0xf8,0x93,0xcb]
; CHECK-NEXT: vmovd %ecx, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc1]		; CHECK-NEXT: vmovd %ecx, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc1]
; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x01]		; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x01]
; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]		; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]
; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x02]		; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x02]
; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]		; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]
; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x03]		; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x03]
; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]		; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]
; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x04]		; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x04]
; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]		; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]
; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x05]		; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x05]
; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]		; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]
; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x06]		; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x06]
; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]		; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]
; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x07]		; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 0, i16 -1)		%res0 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 0, i16 -1)
%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0		%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0
%res1 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 1, i16 -1)		%res1 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 1, i16 -1)
%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1		%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1
%res2 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 2, i16 -1)		%res2 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 2, i16 -1)
%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2		%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2
%res3 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 3, i16 -1)		%res3 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 3, i16 -1)
[18 lines not shown]
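As a rough illustration of what the right-hand column of the hunk above changes, the following Python sketch parses the `encoding: [...]` byte lists from a before/after pair of CHECK lines and compares their lengths. It is not part of the patch, and the helper names `encoding_bytes` and `bytes_saved` are made up for this example. The two sample pairs are copied from the hunk above.

```python
import re
from typing import List

# Matches the byte list that llc --show-mc-encoding prints after "encoding:".
ENCODING_RE = re.compile(r"encoding: \[([^\]]*)\]")


def encoding_bytes(check_line: str) -> List[int]:
    """Return the bytes listed in the 'encoding: [...]' part of a CHECK line."""
    match = ENCODING_RE.search(check_line)
    if match is None:
        raise ValueError("no encoding list found in line")
    return [int(token, 16) for token in match.group(1).split(",")]


def bytes_saved(before: str, after: str) -> int:
    """Difference in encoded length between the old and new CHECK line."""
    return len(encoding_bytes(before)) - len(encoding_bytes(after))


# Left-hand (old) and right-hand (new) expectations, copied from the diff.
VMOVD_BEFORE = "; CHECK-NEXT: vmovd %ecx, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc1]"
VMOVD_AFTER = "; CHECK-NEXT: vmovd %ecx, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc1]"
VPINSRW_BEFORE = "; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x01]"
VPINSRW_AFTER = "; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x01]"

if __name__ == "__main__":
    print(bytes_saved(VMOVD_BEFORE, VMOVD_AFTER))      # 6 bytes -> 4 bytes
    print(bytes_saved(VPINSRW_BEFORE, VPINSRW_AFTER))  # 7 bytes -> 5 bytes
```

In the lines shown for the unmasked test above, one vmovd and seven vpinsrw instructions each shrink by 2 bytes, so that sequence is 16 bytes shorter.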
; CHECK-NEXT: vpcmplew %ymm1, %ymm0, %k6 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3f,0xf1,0x02]		; CHECK-NEXT: vpcmplew %ymm1, %ymm0, %k6 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3f,0xf1,0x02]
; CHECK-NEXT: vpcmpunordw %ymm1, %ymm0, %k7 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3f,0xf9,0x03]		; CHECK-NEXT: vpcmpunordw %ymm1, %ymm0, %k7 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3f,0xf9,0x03]
; CHECK-NEXT: vpcmpneqw %ymm1, %ymm0, %k0 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3f,0xc1,0x04]		; CHECK-NEXT: vpcmpneqw %ymm1, %ymm0, %k0 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3f,0xc1,0x04]
; CHECK-NEXT: vpcmpnltw %ymm1, %ymm0, %k2 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3f,0xd1,0x05]		; CHECK-NEXT: vpcmpnltw %ymm1, %ymm0, %k2 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3f,0xd1,0x05]
; CHECK-NEXT: vpcmpnlew %ymm1, %ymm0, %k1 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3f,0xc9,0x06]		; CHECK-NEXT: vpcmpnlew %ymm1, %ymm0, %k1 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3f,0xc9,0x06]
; CHECK-NEXT: vpcmpordw %ymm1, %ymm0, %k3 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3f,0xd9,0x07]		; CHECK-NEXT: vpcmpordw %ymm1, %ymm0, %k3 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3f,0xd9,0x07]
; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]		; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]
; CHECK-NEXT: kmovw %k4, %ecx ## encoding: [0xc5,0xf8,0x93,0xcc]		; CHECK-NEXT: kmovw %k4, %ecx ## encoding: [0xc5,0xf8,0x93,0xcc]
; CHECK-NEXT: vmovd %ecx, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc1]		; CHECK-NEXT: vmovd %ecx, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc1]
; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x01]		; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x01]
; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]		; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]
; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x02]		; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x02]
; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]		; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]
; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x03]		; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x03]
; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]		; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]
; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x04]		; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x04]
; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]		; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]
; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x05]		; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x05]
; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]		; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]
; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x06]		; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x06]
; CHECK-NEXT: kmovw %k3, %eax ## encoding: [0xc5,0xf8,0x93,0xc3]		; CHECK-NEXT: kmovw %k3, %eax ## encoding: [0xc5,0xf8,0x93,0xc3]
; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x07]		; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 0, i16 %mask)		%res0 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 0, i16 %mask)
%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0		%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0
%res1 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 1, i16 %mask)		%res1 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 1, i16 %mask)
%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1		%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1
%res2 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 2, i16 %mask)		%res2 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 2, i16 %mask)
%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2		%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2
%res3 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 3, i16 %mask)		%res3 = call i16 @llvm.x86.avx512.mask.cmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 3, i16 %mask)
[19 lines not shown]
; CHECK-NEXT: vpcmpleuw %ymm1, %ymm0, %k5 ## encoding: [0x62,0xf3,0xfd,0x28,0x3e,0xe9,0x02]		; CHECK-NEXT: vpcmpleuw %ymm1, %ymm0, %k5 ## encoding: [0x62,0xf3,0xfd,0x28,0x3e,0xe9,0x02]
; CHECK-NEXT: vpcmpunorduw %ymm1, %ymm0, %k6 ## encoding: [0x62,0xf3,0xfd,0x28,0x3e,0xf1,0x03]		; CHECK-NEXT: vpcmpunorduw %ymm1, %ymm0, %k6 ## encoding: [0x62,0xf3,0xfd,0x28,0x3e,0xf1,0x03]
; CHECK-NEXT: vpcmpnequw %ymm1, %ymm0, %k7 ## encoding: [0x62,0xf3,0xfd,0x28,0x3e,0xf9,0x04]		; CHECK-NEXT: vpcmpnequw %ymm1, %ymm0, %k7 ## encoding: [0x62,0xf3,0xfd,0x28,0x3e,0xf9,0x04]
; CHECK-NEXT: vpcmpnltuw %ymm1, %ymm0, %k2 ## encoding: [0x62,0xf3,0xfd,0x28,0x3e,0xd1,0x05]		; CHECK-NEXT: vpcmpnltuw %ymm1, %ymm0, %k2 ## encoding: [0x62,0xf3,0xfd,0x28,0x3e,0xd1,0x05]
; CHECK-NEXT: vpcmpnleuw %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x3e,0xc9,0x06]		; CHECK-NEXT: vpcmpnleuw %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x3e,0xc9,0x06]
; CHECK-NEXT: vpcmporduw %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0xfd,0x28,0x3e,0xc1,0x07]		; CHECK-NEXT: vpcmporduw %ymm1, %ymm0, %k0 ## encoding: [0x62,0xf3,0xfd,0x28,0x3e,0xc1,0x07]
; CHECK-NEXT: kmovw %k4, %eax ## encoding: [0xc5,0xf8,0x93,0xc4]		; CHECK-NEXT: kmovw %k4, %eax ## encoding: [0xc5,0xf8,0x93,0xc4]
; CHECK-NEXT: kmovw %k3, %ecx ## encoding: [0xc5,0xf8,0x93,0xcb]		; CHECK-NEXT: kmovw %k3, %ecx ## encoding: [0xc5,0xf8,0x93,0xcb]
; CHECK-NEXT: vmovd %ecx, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc1]		; CHECK-NEXT: vmovd %ecx, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc1]
; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x01]		; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x01]
; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]		; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]
; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x02]		; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x02]
; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]		; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]
; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x03]		; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x03]
; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]		; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]
; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x04]		; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x04]
; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]		; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]
; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x05]		; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x05]
; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]		; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]
; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x06]		; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x06]
; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]		; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]
; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x07]		; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 0, i16 -1)		%res0 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 0, i16 -1)
%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0		%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0
%res1 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 1, i16 -1)		%res1 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 1, i16 -1)
%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1		%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1
%res2 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 2, i16 -1)		%res2 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 2, i16 -1)
%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2		%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2
%res3 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 3, i16 -1)		%res3 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 3, i16 -1)
[18 lines not shown]
; CHECK-NEXT: vpcmpleuw %ymm1, %ymm0, %k6 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3e,0xf1,0x02]		; CHECK-NEXT: vpcmpleuw %ymm1, %ymm0, %k6 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3e,0xf1,0x02]
; CHECK-NEXT: vpcmpunorduw %ymm1, %ymm0, %k7 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3e,0xf9,0x03]		; CHECK-NEXT: vpcmpunorduw %ymm1, %ymm0, %k7 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3e,0xf9,0x03]
; CHECK-NEXT: vpcmpnequw %ymm1, %ymm0, %k0 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3e,0xc1,0x04]		; CHECK-NEXT: vpcmpnequw %ymm1, %ymm0, %k0 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3e,0xc1,0x04]
; CHECK-NEXT: vpcmpnltuw %ymm1, %ymm0, %k2 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3e,0xd1,0x05]		; CHECK-NEXT: vpcmpnltuw %ymm1, %ymm0, %k2 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3e,0xd1,0x05]
; CHECK-NEXT: vpcmpnleuw %ymm1, %ymm0, %k1 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3e,0xc9,0x06]		; CHECK-NEXT: vpcmpnleuw %ymm1, %ymm0, %k1 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3e,0xc9,0x06]
; CHECK-NEXT: vpcmporduw %ymm1, %ymm0, %k3 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3e,0xd9,0x07]		; CHECK-NEXT: vpcmporduw %ymm1, %ymm0, %k3 {%k3} ## encoding: [0x62,0xf3,0xfd,0x2b,0x3e,0xd9,0x07]
; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]		; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]
; CHECK-NEXT: kmovw %k4, %ecx ## encoding: [0xc5,0xf8,0x93,0xcc]		; CHECK-NEXT: kmovw %k4, %ecx ## encoding: [0xc5,0xf8,0x93,0xcc]
; CHECK-NEXT: vmovd %ecx, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc1]		; CHECK-NEXT: vmovd %ecx, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc1]
; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x01]		; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x01]
; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]		; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]
; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x02]		; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x02]
; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]		; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]
; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x03]		; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x03]
; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]		; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]
; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x04]		; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x04]
; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]		; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]
; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x05]		; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x05]
; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]		; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]
; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x06]		; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x06]
; CHECK-NEXT: kmovw %k3, %eax ## encoding: [0xc5,0xf8,0x93,0xc3]		; CHECK-NEXT: kmovw %k3, %eax ## encoding: [0xc5,0xf8,0x93,0xc3]
; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x07]		; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 0, i16 %mask)		%res0 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 0, i16 %mask)
%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0		%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0
%res1 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 1, i16 %mask)		%res1 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 1, i16 %mask)
%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1		%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1
%res2 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 2, i16 %mask)		%res2 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 2, i16 %mask)
%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2		%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2
%res3 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 3, i16 %mask)		%res3 = call i16 @llvm.x86.avx512.mask.ucmp.w.256(<16 x i16> %a0, <16 x i16> %a1, i32 3, i16 %mask)
[113 lines not shown]
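The lines that are identical on both sides of these hunks can be told apart from the compressed ones by their first encoding byte. The sketch below is a hypothetical helper, not code from the patch. It assumes 64-bit mode and that no legacy prefix precedes the instruction, in which case 0x62 starts an EVEX prefix and 0xC4 or 0xC5 starts a VEX prefix. The vpcmp* instructions write a mask k register and keep their EVEX form, while kmovw is already VEX-encoded, so neither group changes.

```python
def prefix_kind(first_byte: int) -> str:
    """Classify an instruction by its first encoding byte (64-bit mode assumed)."""
    if first_byte == 0x62:
        return "EVEX"
    if first_byte == 0xC5:
        return "VEX (2-byte prefix)"
    if first_byte == 0xC4:
        return "VEX (3-byte prefix)"
    return "other"


# First bytes taken from the encodings shown in this diff.
SAMPLES = {
    "vpcmpleb %xmm1, %xmm0, %k5": 0x62,         # unchanged: writes a mask register
    "kmovw %k4, %eax": 0xC5,                    # unchanged: already VEX
    "vmovd %ecx, %xmm0 (before)": 0x62,         # EVEX before the patch
    "vmovd %ecx, %xmm0 (after)": 0xC5,          # VEX after compression
    "retq": 0xC3,                               # plain opcode, no vector prefix
}

if __name__ == "__main__":
    for text, first in SAMPLES.items():
        print(f"{text}: {prefix_kind(first)}")
```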
; CHECK-NEXT: vpcmpleb %xmm1, %xmm0, %k5 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xe9,0x02]		; CHECK-NEXT: vpcmpleb %xmm1, %xmm0, %k5 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xe9,0x02]
; CHECK-NEXT: vpcmpunordb %xmm1, %xmm0, %k6 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xf1,0x03]		; CHECK-NEXT: vpcmpunordb %xmm1, %xmm0, %k6 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xf1,0x03]
; CHECK-NEXT: vpcmpneqb %xmm1, %xmm0, %k7 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xf9,0x04]		; CHECK-NEXT: vpcmpneqb %xmm1, %xmm0, %k7 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xf9,0x04]
; CHECK-NEXT: vpcmpnltb %xmm1, %xmm0, %k2 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xd1,0x05]		; CHECK-NEXT: vpcmpnltb %xmm1, %xmm0, %k2 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xd1,0x05]
; CHECK-NEXT: vpcmpnleb %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xc9,0x06]		; CHECK-NEXT: vpcmpnleb %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xc9,0x06]
; CHECK-NEXT: vpcmpordb %xmm1, %xmm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xc1,0x07]		; CHECK-NEXT: vpcmpordb %xmm1, %xmm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xc1,0x07]
; CHECK-NEXT: kmovw %k4, %eax ## encoding: [0xc5,0xf8,0x93,0xc4]		; CHECK-NEXT: kmovw %k4, %eax ## encoding: [0xc5,0xf8,0x93,0xc4]
; CHECK-NEXT: kmovw %k3, %ecx ## encoding: [0xc5,0xf8,0x93,0xcb]		; CHECK-NEXT: kmovw %k3, %ecx ## encoding: [0xc5,0xf8,0x93,0xcb]
; CHECK-NEXT: vmovd %ecx, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc1]		; CHECK-NEXT: vmovd %ecx, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc1]
; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x01]		; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x01]
; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]		; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]
; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x02]		; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x02]
; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]		; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]
; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x03]		; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x03]
; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]		; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]
; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x04]		; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x04]
; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]		; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]
; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x05]		; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x05]
; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]		; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]
; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x06]		; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x06]
; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]		; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]
; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x07]		; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 0, i16 -1)		%res0 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 0, i16 -1)
%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0		%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0
%res1 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 1, i16 -1)		%res1 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 1, i16 -1)
%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1		%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1
%res2 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 2, i16 -1)		%res2 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 2, i16 -1)
%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2		%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2
%res3 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 3, i16 -1)		%res3 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 3, i16 -1)
[18 lines not shown]
; CHECK-NEXT: vpcmpleb %xmm1, %xmm0, %k6 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3f,0xf1,0x02]		; CHECK-NEXT: vpcmpleb %xmm1, %xmm0, %k6 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3f,0xf1,0x02]
; CHECK-NEXT: vpcmpunordb %xmm1, %xmm0, %k7 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3f,0xf9,0x03]		; CHECK-NEXT: vpcmpunordb %xmm1, %xmm0, %k7 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3f,0xf9,0x03]
; CHECK-NEXT: vpcmpneqb %xmm1, %xmm0, %k0 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3f,0xc1,0x04]		; CHECK-NEXT: vpcmpneqb %xmm1, %xmm0, %k0 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3f,0xc1,0x04]
; CHECK-NEXT: vpcmpnltb %xmm1, %xmm0, %k2 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3f,0xd1,0x05]		; CHECK-NEXT: vpcmpnltb %xmm1, %xmm0, %k2 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3f,0xd1,0x05]
; CHECK-NEXT: vpcmpnleb %xmm1, %xmm0, %k1 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3f,0xc9,0x06]		; CHECK-NEXT: vpcmpnleb %xmm1, %xmm0, %k1 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3f,0xc9,0x06]
; CHECK-NEXT: vpcmpordb %xmm1, %xmm0, %k3 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3f,0xd9,0x07]		; CHECK-NEXT: vpcmpordb %xmm1, %xmm0, %k3 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3f,0xd9,0x07]
; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]		; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]
; CHECK-NEXT: kmovw %k4, %ecx ## encoding: [0xc5,0xf8,0x93,0xcc]		; CHECK-NEXT: kmovw %k4, %ecx ## encoding: [0xc5,0xf8,0x93,0xcc]
; CHECK-NEXT: vmovd %ecx, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc1]		; CHECK-NEXT: vmovd %ecx, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc1]
; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x01]		; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x01]
; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]		; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]
; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x02]		; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x02]
; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]		; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]
; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x03]		; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x03]
; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]		; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]
; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x04]		; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x04]
; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]		; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]
; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x05]		; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x05]
; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]		; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]
; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x06]		; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x06]
; CHECK-NEXT: kmovw %k3, %eax ## encoding: [0xc5,0xf8,0x93,0xc3]		; CHECK-NEXT: kmovw %k3, %eax ## encoding: [0xc5,0xf8,0x93,0xc3]
; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x07]		; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 0, i16 %mask)		%res0 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 0, i16 %mask)
%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0		%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0
%res1 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 1, i16 %mask)		%res1 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 1, i16 %mask)
%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1		%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1
%res2 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 2, i16 %mask)		%res2 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 2, i16 %mask)
%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2		%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2
%res3 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 3, i16 %mask)		%res3 = call i16 @llvm.x86.avx512.mask.cmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 3, i16 %mask)
[19 lines not shown]
; CHECK-NEXT: vpcmpleub %xmm1, %xmm0, %k5 ## encoding: [0x62,0xf3,0x7d,0x08,0x3e,0xe9,0x02]		; CHECK-NEXT: vpcmpleub %xmm1, %xmm0, %k5 ## encoding: [0x62,0xf3,0x7d,0x08,0x3e,0xe9,0x02]
; CHECK-NEXT: vpcmpunordub %xmm1, %xmm0, %k6 ## encoding: [0x62,0xf3,0x7d,0x08,0x3e,0xf1,0x03]		; CHECK-NEXT: vpcmpunordub %xmm1, %xmm0, %k6 ## encoding: [0x62,0xf3,0x7d,0x08,0x3e,0xf1,0x03]
; CHECK-NEXT: vpcmpnequb %xmm1, %xmm0, %k7 ## encoding: [0x62,0xf3,0x7d,0x08,0x3e,0xf9,0x04]		; CHECK-NEXT: vpcmpnequb %xmm1, %xmm0, %k7 ## encoding: [0x62,0xf3,0x7d,0x08,0x3e,0xf9,0x04]
; CHECK-NEXT: vpcmpnltub %xmm1, %xmm0, %k2 ## encoding: [0x62,0xf3,0x7d,0x08,0x3e,0xd1,0x05]		; CHECK-NEXT: vpcmpnltub %xmm1, %xmm0, %k2 ## encoding: [0x62,0xf3,0x7d,0x08,0x3e,0xd1,0x05]
; CHECK-NEXT: vpcmpnleub %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x3e,0xc9,0x06]		; CHECK-NEXT: vpcmpnleub %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x3e,0xc9,0x06]
; CHECK-NEXT: vpcmpordub %xmm1, %xmm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x08,0x3e,0xc1,0x07]		; CHECK-NEXT: vpcmpordub %xmm1, %xmm0, %k0 ## encoding: [0x62,0xf3,0x7d,0x08,0x3e,0xc1,0x07]
; CHECK-NEXT: kmovw %k4, %eax ## encoding: [0xc5,0xf8,0x93,0xc4]		; CHECK-NEXT: kmovw %k4, %eax ## encoding: [0xc5,0xf8,0x93,0xc4]
; CHECK-NEXT: kmovw %k3, %ecx ## encoding: [0xc5,0xf8,0x93,0xcb]		; CHECK-NEXT: kmovw %k3, %ecx ## encoding: [0xc5,0xf8,0x93,0xcb]
; CHECK-NEXT: vmovd %ecx, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc1]		; CHECK-NEXT: vmovd %ecx, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc1]
; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x01]		; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x01]
; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]		; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]
; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x02]		; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x02]
; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]		; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]
; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x03]		; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x03]
; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]		; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]
; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x04]		; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x04]
; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]		; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]
; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x05]		; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x05]
; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]		; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]
; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x06]		; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x06]
; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]		; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]
; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x07]		; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 0, i16 -1)		%res0 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 0, i16 -1)
%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0		%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0
%res1 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 1, i16 -1)		%res1 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 1, i16 -1)
%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1		%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1
%res2 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 2, i16 -1)		%res2 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 2, i16 -1)
%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2		%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2
%res3 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 3, i16 -1)		%res3 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 3, i16 -1)
; CHECK-NEXT: vpcmpleub %xmm1, %xmm0, %k6 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3e,0xf1,0x02]		; CHECK-NEXT: vpcmpleub %xmm1, %xmm0, %k6 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3e,0xf1,0x02]
; CHECK-NEXT: vpcmpunordub %xmm1, %xmm0, %k7 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3e,0xf9,0x03]		; CHECK-NEXT: vpcmpunordub %xmm1, %xmm0, %k7 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3e,0xf9,0x03]
; CHECK-NEXT: vpcmpnequb %xmm1, %xmm0, %k0 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3e,0xc1,0x04]		; CHECK-NEXT: vpcmpnequb %xmm1, %xmm0, %k0 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3e,0xc1,0x04]
; CHECK-NEXT: vpcmpnltub %xmm1, %xmm0, %k2 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3e,0xd1,0x05]		; CHECK-NEXT: vpcmpnltub %xmm1, %xmm0, %k2 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3e,0xd1,0x05]
; CHECK-NEXT: vpcmpnleub %xmm1, %xmm0, %k1 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3e,0xc9,0x06]		; CHECK-NEXT: vpcmpnleub %xmm1, %xmm0, %k1 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3e,0xc9,0x06]
; CHECK-NEXT: vpcmpordub %xmm1, %xmm0, %k3 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3e,0xd9,0x07]		; CHECK-NEXT: vpcmpordub %xmm1, %xmm0, %k3 {%k3} ## encoding: [0x62,0xf3,0x7d,0x0b,0x3e,0xd9,0x07]
; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]		; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]
; CHECK-NEXT: kmovw %k4, %ecx ## encoding: [0xc5,0xf8,0x93,0xcc]		; CHECK-NEXT: kmovw %k4, %ecx ## encoding: [0xc5,0xf8,0x93,0xcc]
; CHECK-NEXT: vmovd %ecx, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6e,0xc1]		; CHECK-NEXT: vmovd %ecx, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6e,0xc1]
; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x01]		; CHECK-NEXT: vpinsrw $1, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x01]
; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]		; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]
; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x02]		; CHECK-NEXT: vpinsrw $2, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x02]
; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]		; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]
; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x03]		; CHECK-NEXT: vpinsrw $3, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x03]
; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]		; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]
; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x04]		; CHECK-NEXT: vpinsrw $4, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x04]
; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]		; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]
; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x05]		; CHECK-NEXT: vpinsrw $5, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x05]
; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]		; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]
; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x06]		; CHECK-NEXT: vpinsrw $6, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x06]
; CHECK-NEXT: kmovw %k3, %eax ## encoding: [0xc5,0xf8,0x93,0xc3]		; CHECK-NEXT: kmovw %k3, %eax ## encoding: [0xc5,0xf8,0x93,0xc3]
; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xc4,0xc0,0x07]		; CHECK-NEXT: vpinsrw $7, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc4,0xc0,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 0, i16 %mask)		%res0 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 0, i16 %mask)
%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0		%vec0 = insertelement <8 x i16> undef, i16 %res0, i32 0
%res1 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 1, i16 %mask)		%res1 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 1, i16 %mask)
%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1		%vec1 = insertelement <8 x i16> %vec0, i16 %res1, i32 1
%res2 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 2, i16 %mask)		%res2 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 2, i16 %mask)
%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2		%vec2 = insertelement <8 x i16> %vec1, i16 %res2, i32 2
%res3 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 3, i16 %mask)		%res3 = call i16 @llvm.x86.avx512.mask.ucmp.b.128(<16 x i8> %a0, <16 x i8> %a1, i32 3, i16 %mask)
; CHECK-NEXT: vpcmplew %xmm1, %xmm0, %k5 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xe9,0x02]		; CHECK-NEXT: vpcmplew %xmm1, %xmm0, %k5 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xe9,0x02]
; CHECK-NEXT: vpcmpunordw %xmm1, %xmm0, %k6 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xf1,0x03]		; CHECK-NEXT: vpcmpunordw %xmm1, %xmm0, %k6 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xf1,0x03]
; CHECK-NEXT: vpcmpneqw %xmm1, %xmm0, %k7 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xf9,0x04]		; CHECK-NEXT: vpcmpneqw %xmm1, %xmm0, %k7 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xf9,0x04]
; CHECK-NEXT: vpcmpnltw %xmm1, %xmm0, %k2 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xd1,0x05]		; CHECK-NEXT: vpcmpnltw %xmm1, %xmm0, %k2 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xd1,0x05]
; CHECK-NEXT: vpcmpnlew %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xc9,0x06]		; CHECK-NEXT: vpcmpnlew %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xc9,0x06]
; CHECK-NEXT: vpcmpordw %xmm1, %xmm0, %k0 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xc1,0x07]		; CHECK-NEXT: vpcmpordw %xmm1, %xmm0, %k0 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xc1,0x07]
; CHECK-NEXT: kmovw %k4, %eax ## encoding: [0xc5,0xf8,0x93,0xc4]		; CHECK-NEXT: kmovw %k4, %eax ## encoding: [0xc5,0xf8,0x93,0xc4]
; CHECK-NEXT: kmovw %k3, %ecx ## encoding: [0xc5,0xf8,0x93,0xcb]		; CHECK-NEXT: kmovw %k3, %ecx ## encoding: [0xc5,0xf8,0x93,0xcb]
; CHECK-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc1,0x00]		; CHECK-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc1,0x00]
; CHECK-NEXT: vpinsrb $2, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x02]		; CHECK-NEXT: vpinsrb $2, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x02]
; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]		; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]
; CHECK-NEXT: vpinsrb $4, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x04]		; CHECK-NEXT: vpinsrb $4, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x04]
; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]		; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]
; CHECK-NEXT: vpinsrb $6, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x06]		; CHECK-NEXT: vpinsrb $6, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x06]
; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]		; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]
; CHECK-NEXT: vpinsrb $8, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x08]		; CHECK-NEXT: vpinsrb $8, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x08]
; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]		; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]
; CHECK-NEXT: vpinsrb $10, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x0a]		; CHECK-NEXT: vpinsrb $10, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x0a]
; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]		; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]
; CHECK-NEXT: vpinsrb $12, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x0c]		; CHECK-NEXT: vpinsrb $12, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x0c]
; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]		; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]
; CHECK-NEXT: vpinsrb $14, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x0e]		; CHECK-NEXT: vpinsrb $14, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x0e]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 0, i8 -1)		%res0 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 0, i8 -1)
%vec0 = insertelement <8 x i8> undef, i8 %res0, i32 0		%vec0 = insertelement <8 x i8> undef, i8 %res0, i32 0
%res1 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 1, i8 -1)		%res1 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 1, i8 -1)
%vec1 = insertelement <8 x i8> %vec0, i8 %res1, i32 1		%vec1 = insertelement <8 x i8> %vec0, i8 %res1, i32 1
%res2 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 2, i8 -1)		%res2 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 2, i8 -1)
%vec2 = insertelement <8 x i8> %vec1, i8 %res2, i32 2		%vec2 = insertelement <8 x i8> %vec1, i8 %res2, i32 2
%res3 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 3, i8 -1)		%res3 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 3, i8 -1)
; CHECK-NEXT: vpcmplew %xmm1, %xmm0, %k6 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3f,0xf1,0x02]		; CHECK-NEXT: vpcmplew %xmm1, %xmm0, %k6 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3f,0xf1,0x02]
; CHECK-NEXT: vpcmpunordw %xmm1, %xmm0, %k7 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3f,0xf9,0x03]		; CHECK-NEXT: vpcmpunordw %xmm1, %xmm0, %k7 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3f,0xf9,0x03]
; CHECK-NEXT: vpcmpneqw %xmm1, %xmm0, %k0 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3f,0xc1,0x04]		; CHECK-NEXT: vpcmpneqw %xmm1, %xmm0, %k0 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3f,0xc1,0x04]
; CHECK-NEXT: vpcmpnltw %xmm1, %xmm0, %k2 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3f,0xd1,0x05]		; CHECK-NEXT: vpcmpnltw %xmm1, %xmm0, %k2 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3f,0xd1,0x05]
; CHECK-NEXT: vpcmpnlew %xmm1, %xmm0, %k1 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3f,0xc9,0x06]		; CHECK-NEXT: vpcmpnlew %xmm1, %xmm0, %k1 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3f,0xc9,0x06]
; CHECK-NEXT: vpcmpordw %xmm1, %xmm0, %k3 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3f,0xd9,0x07]		; CHECK-NEXT: vpcmpordw %xmm1, %xmm0, %k3 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3f,0xd9,0x07]
; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]		; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]
; CHECK-NEXT: kmovw %k4, %ecx ## encoding: [0xc5,0xf8,0x93,0xcc]		; CHECK-NEXT: kmovw %k4, %ecx ## encoding: [0xc5,0xf8,0x93,0xcc]
; CHECK-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc1,0x00]		; CHECK-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc1,0x00]
; CHECK-NEXT: vpinsrb $2, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x02]		; CHECK-NEXT: vpinsrb $2, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x02]
; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]		; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]
; CHECK-NEXT: vpinsrb $4, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x04]		; CHECK-NEXT: vpinsrb $4, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x04]
; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]		; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]
; CHECK-NEXT: vpinsrb $6, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x06]		; CHECK-NEXT: vpinsrb $6, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x06]
; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]		; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]
; CHECK-NEXT: vpinsrb $8, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x08]		; CHECK-NEXT: vpinsrb $8, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x08]
; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]		; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]
; CHECK-NEXT: vpinsrb $10, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x0a]		; CHECK-NEXT: vpinsrb $10, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x0a]
; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]		; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]
; CHECK-NEXT: vpinsrb $12, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x0c]		; CHECK-NEXT: vpinsrb $12, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x0c]
; CHECK-NEXT: kmovw %k3, %eax ## encoding: [0xc5,0xf8,0x93,0xc3]		; CHECK-NEXT: kmovw %k3, %eax ## encoding: [0xc5,0xf8,0x93,0xc3]
; CHECK-NEXT: vpinsrb $14, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x0e]		; CHECK-NEXT: vpinsrb $14, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x0e]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 0, i8 %mask)		%res0 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 0, i8 %mask)
%vec0 = insertelement <8 x i8> undef, i8 %res0, i32 0		%vec0 = insertelement <8 x i8> undef, i8 %res0, i32 0
%res1 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 1, i8 %mask)		%res1 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 1, i8 %mask)
%vec1 = insertelement <8 x i8> %vec0, i8 %res1, i32 1		%vec1 = insertelement <8 x i8> %vec0, i8 %res1, i32 1
%res2 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 2, i8 %mask)		%res2 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 2, i8 %mask)
%vec2 = insertelement <8 x i8> %vec1, i8 %res2, i32 2		%vec2 = insertelement <8 x i8> %vec1, i8 %res2, i32 2
%res3 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 3, i8 %mask)		%res3 = call i8 @llvm.x86.avx512.mask.cmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 3, i8 %mask)
; CHECK-NEXT: vpcmpleuw %xmm1, %xmm0, %k5 ## encoding: [0x62,0xf3,0xfd,0x08,0x3e,0xe9,0x02]		; CHECK-NEXT: vpcmpleuw %xmm1, %xmm0, %k5 ## encoding: [0x62,0xf3,0xfd,0x08,0x3e,0xe9,0x02]
; CHECK-NEXT: vpcmpunorduw %xmm1, %xmm0, %k6 ## encoding: [0x62,0xf3,0xfd,0x08,0x3e,0xf1,0x03]		; CHECK-NEXT: vpcmpunorduw %xmm1, %xmm0, %k6 ## encoding: [0x62,0xf3,0xfd,0x08,0x3e,0xf1,0x03]
; CHECK-NEXT: vpcmpnequw %xmm1, %xmm0, %k7 ## encoding: [0x62,0xf3,0xfd,0x08,0x3e,0xf9,0x04]		; CHECK-NEXT: vpcmpnequw %xmm1, %xmm0, %k7 ## encoding: [0x62,0xf3,0xfd,0x08,0x3e,0xf9,0x04]
; CHECK-NEXT: vpcmpnltuw %xmm1, %xmm0, %k2 ## encoding: [0x62,0xf3,0xfd,0x08,0x3e,0xd1,0x05]		; CHECK-NEXT: vpcmpnltuw %xmm1, %xmm0, %k2 ## encoding: [0x62,0xf3,0xfd,0x08,0x3e,0xd1,0x05]
; CHECK-NEXT: vpcmpnleuw %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x3e,0xc9,0x06]		; CHECK-NEXT: vpcmpnleuw %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x3e,0xc9,0x06]
; CHECK-NEXT: vpcmporduw %xmm1, %xmm0, %k0 ## encoding: [0x62,0xf3,0xfd,0x08,0x3e,0xc1,0x07]		; CHECK-NEXT: vpcmporduw %xmm1, %xmm0, %k0 ## encoding: [0x62,0xf3,0xfd,0x08,0x3e,0xc1,0x07]
; CHECK-NEXT: kmovw %k4, %eax ## encoding: [0xc5,0xf8,0x93,0xc4]		; CHECK-NEXT: kmovw %k4, %eax ## encoding: [0xc5,0xf8,0x93,0xc4]
; CHECK-NEXT: kmovw %k3, %ecx ## encoding: [0xc5,0xf8,0x93,0xcb]		; CHECK-NEXT: kmovw %k3, %ecx ## encoding: [0xc5,0xf8,0x93,0xcb]
; CHECK-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc1,0x00]		; CHECK-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc1,0x00]
; CHECK-NEXT: vpinsrb $2, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x02]		; CHECK-NEXT: vpinsrb $2, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x02]
; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]		; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]
; CHECK-NEXT: vpinsrb $4, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x04]		; CHECK-NEXT: vpinsrb $4, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x04]
; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]		; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]
; CHECK-NEXT: vpinsrb $6, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x06]		; CHECK-NEXT: vpinsrb $6, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x06]
; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]		; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]
; CHECK-NEXT: vpinsrb $8, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x08]		; CHECK-NEXT: vpinsrb $8, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x08]
; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]		; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]
; CHECK-NEXT: vpinsrb $10, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x0a]		; CHECK-NEXT: vpinsrb $10, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x0a]
; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]		; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]
; CHECK-NEXT: vpinsrb $12, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x0c]		; CHECK-NEXT: vpinsrb $12, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x0c]
; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]		; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]
; CHECK-NEXT: vpinsrb $14, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x0e]		; CHECK-NEXT: vpinsrb $14, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x0e]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 0, i8 -1)		%res0 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 0, i8 -1)
%vec0 = insertelement <8 x i8> undef, i8 %res0, i32 0		%vec0 = insertelement <8 x i8> undef, i8 %res0, i32 0
%res1 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 1, i8 -1)		%res1 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 1, i8 -1)
%vec1 = insertelement <8 x i8> %vec0, i8 %res1, i32 1		%vec1 = insertelement <8 x i8> %vec0, i8 %res1, i32 1
%res2 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 2, i8 -1)		%res2 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 2, i8 -1)
%vec2 = insertelement <8 x i8> %vec1, i8 %res2, i32 2		%vec2 = insertelement <8 x i8> %vec1, i8 %res2, i32 2
%res3 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 3, i8 -1)		%res3 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 3, i8 -1)
; CHECK-NEXT: vpcmpleuw %xmm1, %xmm0, %k6 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3e,0xf1,0x02]		; CHECK-NEXT: vpcmpleuw %xmm1, %xmm0, %k6 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3e,0xf1,0x02]
; CHECK-NEXT: vpcmpunorduw %xmm1, %xmm0, %k7 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3e,0xf9,0x03]		; CHECK-NEXT: vpcmpunorduw %xmm1, %xmm0, %k7 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3e,0xf9,0x03]
; CHECK-NEXT: vpcmpnequw %xmm1, %xmm0, %k0 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3e,0xc1,0x04]		; CHECK-NEXT: vpcmpnequw %xmm1, %xmm0, %k0 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3e,0xc1,0x04]
; CHECK-NEXT: vpcmpnltuw %xmm1, %xmm0, %k2 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3e,0xd1,0x05]		; CHECK-NEXT: vpcmpnltuw %xmm1, %xmm0, %k2 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3e,0xd1,0x05]
; CHECK-NEXT: vpcmpnleuw %xmm1, %xmm0, %k1 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3e,0xc9,0x06]		; CHECK-NEXT: vpcmpnleuw %xmm1, %xmm0, %k1 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3e,0xc9,0x06]
; CHECK-NEXT: vpcmporduw %xmm1, %xmm0, %k3 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3e,0xd9,0x07]		; CHECK-NEXT: vpcmporduw %xmm1, %xmm0, %k3 {%k3} ## encoding: [0x62,0xf3,0xfd,0x0b,0x3e,0xd9,0x07]
; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]		; CHECK-NEXT: kmovw %k5, %eax ## encoding: [0xc5,0xf8,0x93,0xc5]
; CHECK-NEXT: kmovw %k4, %ecx ## encoding: [0xc5,0xf8,0x93,0xcc]		; CHECK-NEXT: kmovw %k4, %ecx ## encoding: [0xc5,0xf8,0x93,0xcc]
; CHECK-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc1,0x00]		; CHECK-NEXT: vpinsrb $0, %ecx, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc1,0x00]
; CHECK-NEXT: vpinsrb $2, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x02]		; CHECK-NEXT: vpinsrb $2, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x02]
; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]		; CHECK-NEXT: kmovw %k6, %eax ## encoding: [0xc5,0xf8,0x93,0xc6]
; CHECK-NEXT: vpinsrb $4, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x04]		; CHECK-NEXT: vpinsrb $4, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x04]
; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]		; CHECK-NEXT: kmovw %k7, %eax ## encoding: [0xc5,0xf8,0x93,0xc7]
; CHECK-NEXT: vpinsrb $6, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x06]		; CHECK-NEXT: vpinsrb $6, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x06]
; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]		; CHECK-NEXT: kmovw %k0, %eax ## encoding: [0xc5,0xf8,0x93,0xc0]
; CHECK-NEXT: vpinsrb $8, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x08]		; CHECK-NEXT: vpinsrb $8, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x08]
; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]		; CHECK-NEXT: kmovw %k2, %eax ## encoding: [0xc5,0xf8,0x93,0xc2]
; CHECK-NEXT: vpinsrb $10, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x0a]		; CHECK-NEXT: vpinsrb $10, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x0a]
; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]		; CHECK-NEXT: kmovw %k1, %eax ## encoding: [0xc5,0xf8,0x93,0xc1]
; CHECK-NEXT: vpinsrb $12, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x0c]		; CHECK-NEXT: vpinsrb $12, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x0c]
; CHECK-NEXT: kmovw %k3, %eax ## encoding: [0xc5,0xf8,0x93,0xc3]		; CHECK-NEXT: kmovw %k3, %eax ## encoding: [0xc5,0xf8,0x93,0xc3]
; CHECK-NEXT: vpinsrb $14, %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x20,0xc0,0x0e]		; CHECK-NEXT: vpinsrb $14, %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x20,0xc0,0x0e]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 0, i8 %mask)		%res0 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 0, i8 %mask)
%vec0 = insertelement <8 x i8> undef, i8 %res0, i32 0		%vec0 = insertelement <8 x i8> undef, i8 %res0, i32 0
%res1 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 1, i8 %mask)		%res1 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 1, i8 %mask)
%vec1 = insertelement <8 x i8> %vec0, i8 %res1, i32 1		%vec1 = insertelement <8 x i8> %vec0, i8 %res1, i32 1
%res2 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 2, i8 %mask)		%res2 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 2, i8 %mask)
%vec2 = insertelement <8 x i8> %vec1, i8 %res2, i32 2		%vec2 = insertelement <8 x i8> %vec1, i8 %res2, i32 2
%res3 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 3, i8 %mask)		%res3 = call i8 @llvm.x86.avx512.mask.ucmp.w.128(<8 x i16> %a0, <8 x i16> %a1, i32 3, i8 %mask)
; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %a, <2 x double> %b, <2 x double> %c, i8 %mask)		%res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %a, <2 x double> %b, <2 x double> %c, i8 %mask)
ret <2 x double> %res		ret <2 x double> %res
}		}

define <2 x double>@test_int_x86_avx512_mask_vfmadd_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask_vfmadd_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfmadd_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vfmadd_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xd8]		; CHECK-NEXT: vmovapd %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xd8]
; CHECK-NEXT: vfmadd132pd %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x09,0x98,0xd9]		; CHECK-NEXT: vfmadd132pd %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x09,0x98,0xd9]
; CHECK-NEXT: vfmadd213pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0xa8,0xca]		; CHECK-NEXT: vfmadd213pd %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xa8,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}
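
For readers skimming the CHECK-line changes, the saving per instruction can be read off the byte lists: an encoding starting with 0x62 uses the 4-byte EVEX prefix, while 0xc5 and 0xc4 start the 2-byte and 3-byte VEX prefixes. The Python sketch below is not part of the patch; the helper names (`encoding_bytes`, `prefix_name`, `bytes_saved`) are hypothetical. It classifies purely by the first byte, which is an assumption that holds for the 64-bit vector instructions in this listing.

```python
import re

# Matches the byte list that `llc --show-mc-encoding` appends to each line.
ENCODING_RE = re.compile(r"encoding: \[([^\]]*)\]")

# First-byte classification (assumption: 64-bit mode, as in this test file).
PREFIX_NAMES = {0x62: "EVEX", 0xC4: "VEX3", 0xC5: "VEX2"}


def encoding_bytes(check_line):
    """Return the encoding bytes of one CHECK line as a list of ints."""
    match = ENCODING_RE.search(check_line)
    if match is None:
        raise ValueError("no encoding list in line")
    return [int(tok.strip(), 16) for tok in match.group(1).split(",") if tok.strip()]


def prefix_name(check_line):
    """Name the prefix family from the first encoding byte, or 'other'."""
    return PREFIX_NAMES.get(encoding_bytes(check_line)[0], "other")


def bytes_saved(before, after):
    """Bytes saved when the `before` encoding is replaced by `after`."""
    return len(encoding_bytes(before)) - len(encoding_bytes(after))


before = "; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]"
after = "; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]"
print(prefix_name(before), prefix_name(after), bytes_saved(before, after))
# prints: EVEX VEX2 2
```

Applied to the vfmadd213pd line in the same function, the result is a 1-byte saving, because that instruction needs the 3-byte VEX form (0xc4).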

declare <2 x double> @llvm.x86.avx512.mask3.vfmadd.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask3.vfmadd.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask3_vfmadd_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask3_vfmadd_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmadd_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmadd_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm2, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xda]		; CHECK-NEXT: vmovapd %xmm2, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xda]
; CHECK-NEXT: vfmadd231pd %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xb8,0xd9]		; CHECK-NEXT: vfmadd231pd %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xb8,0xd9]
; CHECK-NEXT: vfmadd213pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0xa8,0xca]		; CHECK-NEXT: vfmadd213pd %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xa8,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask3.vfmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask3.vfmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask3.vfmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask3.vfmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <2 x double> @llvm.x86.avx512.maskz.vfmadd.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.maskz.vfmadd.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_maskz_vfmadd_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_maskz_vfmadd_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_maskz_vfmadd_pd_128:		; CHECK-LABEL: test_int_x86_avx512_maskz_vfmadd_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm1, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xd9]		; CHECK-NEXT: vmovapd %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xd9]
; CHECK-NEXT: vfmadd213pd %xmm2, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0xa8,0xda]		; CHECK-NEXT: vfmadd213pd %xmm2, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0xa8,0xda]
; CHECK-NEXT: vfmadd213pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0xa8,0xca]		; CHECK-NEXT: vfmadd213pd %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xa8,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.maskz.vfmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.maskz.vfmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.maskz.vfmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.maskz.vfmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

define <4 x double>@test_int_x86_avx512_mask_vfmadd_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_vfmadd_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfmadd_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vfmadd_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xd8]		; CHECK-NEXT: vmovapd %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xd8]
; CHECK-NEXT: vfmadd132pd %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x29,0x98,0xd9]		; CHECK-NEXT: vfmadd132pd %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x29,0x98,0xd9]
; CHECK-NEXT: vfmadd213pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0xa8,0xca]		; CHECK-NEXT: vfmadd213pd %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0xa8,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.vfmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.vfmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.vfmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.vfmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask3.vfmadd.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask3.vfmadd.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask3_vfmadd_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask3_vfmadd_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmadd_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmadd_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm2, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xda]		; CHECK-NEXT: vmovapd %ymm2, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xda]
; CHECK-NEXT: vfmadd231pd %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0xb8,0xd9]		; CHECK-NEXT: vfmadd231pd %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0xb8,0xd9]
; CHECK-NEXT: vfmadd213pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0xa8,0xca]		; CHECK-NEXT: vfmadd213pd %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0xa8,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask3.vfmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask3.vfmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask3.vfmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask3.vfmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.maskz.vfmadd.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.maskz.vfmadd.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_maskz_vfmadd_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_maskz_vfmadd_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_maskz_vfmadd_pd_256:		; CHECK-LABEL: test_int_x86_avx512_maskz_vfmadd_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm1, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xd9]		; CHECK-NEXT: vmovapd %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xd9]
; CHECK-NEXT: vfmadd213pd %ymm2, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0xa8,0xda]		; CHECK-NEXT: vfmadd213pd %ymm2, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0xa8,0xda]
; CHECK-NEXT: vfmadd213pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0xa8,0xca]		; CHECK-NEXT: vfmadd213pd %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0xa8,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.maskz.vfmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.maskz.vfmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.maskz.vfmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.maskz.vfmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

define <4 x float>@test_int_x86_avx512_mask_vfmadd_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_vfmadd_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfmadd_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vfmadd_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xd8]		; CHECK-NEXT: vmovaps %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xd8]
; CHECK-NEXT: vfmadd132ps %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x09,0x98,0xd9]		; CHECK-NEXT: vfmadd132ps %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x09,0x98,0xd9]
; CHECK-NEXT: vfmadd213ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0xa8,0xca]		; CHECK-NEXT: vfmadd213ps %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xa8,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask3.vfmadd.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask3.vfmadd.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask3_vfmadd_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask3_vfmadd_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmadd_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmadd_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm2, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xda]		; CHECK-NEXT: vmovaps %xmm2, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xda]
; CHECK-NEXT: vfmadd231ps %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xb8,0xd9]		; CHECK-NEXT: vfmadd231ps %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xb8,0xd9]
; CHECK-NEXT: vfmadd213ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0xa8,0xca]		; CHECK-NEXT: vfmadd213ps %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xa8,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask3.vfmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask3.vfmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask3.vfmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask3.vfmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <4 x float> @llvm.x86.avx512.maskz.vfmadd.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.maskz.vfmadd.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_maskz_vfmadd_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_maskz_vfmadd_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_maskz_vfmadd_ps_128:		; CHECK-LABEL: test_int_x86_avx512_maskz_vfmadd_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm1, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xd9]		; CHECK-NEXT: vmovaps %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xd9]
; CHECK-NEXT: vfmadd213ps %xmm2, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0xa8,0xda]		; CHECK-NEXT: vfmadd213ps %xmm2, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0xa8,0xda]
; CHECK-NEXT: vfmadd213ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0xa8,0xca]		; CHECK-NEXT: vfmadd213ps %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xa8,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.maskz.vfmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.maskz.vfmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.maskz.vfmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.maskz.vfmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

define <8 x float>@test_int_x86_avx512_mask_vfmadd_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_vfmadd_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfmadd_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vfmadd_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xd8]		; CHECK-NEXT: vmovaps %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xd8]
; CHECK-NEXT: vfmadd132ps %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x29,0x98,0xd9]		; CHECK-NEXT: vfmadd132ps %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x29,0x98,0xd9]
; CHECK-NEXT: vfmadd213ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0xa8,0xca]		; CHECK-NEXT: vfmadd213ps %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0xa8,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.vfmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.vfmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.vfmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.vfmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask3.vfmadd.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask3.vfmadd.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask3_vfmadd_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask3_vfmadd_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmadd_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmadd_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm2, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xda]		; CHECK-NEXT: vmovaps %ymm2, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xda]
; CHECK-NEXT: vfmadd231ps %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0xb8,0xd9]		; CHECK-NEXT: vfmadd231ps %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0xb8,0xd9]
; CHECK-NEXT: vfmadd213ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0xa8,0xca]		; CHECK-NEXT: vfmadd213ps %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0xa8,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask3.vfmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask3.vfmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask3.vfmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask3.vfmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.maskz.vfmadd.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.maskz.vfmadd.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_maskz_vfmadd_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_maskz_vfmadd_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_maskz_vfmadd_ps_256:		; CHECK-LABEL: test_int_x86_avx512_maskz_vfmadd_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm1, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xd9]		; CHECK-NEXT: vmovaps %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xd9]
; CHECK-NEXT: vfmadd213ps %ymm2, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0xa8,0xda]		; CHECK-NEXT: vfmadd213ps %ymm2, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0xa8,0xda]
; CHECK-NEXT: vfmadd213ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0xa8,0xca]		; CHECK-NEXT: vfmadd213ps %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0xa8,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.maskz.vfmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.maskz.vfmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.maskz.vfmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.maskz.vfmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}
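
Note on the prefix bytes in the CHECK lines above: for the unmasked, register-only instructions in this test, the VEX bytes can be derived from the EVEX bytes by repacking the prefix fields. The sketch below is illustrative only and is not how the patch works (the pass uses an opcode table). The function name `evex_to_vex` is hypothetical. It assumes registers 0-15, no masking or broadcast, a vector length of at most 256 bits, a single-byte opcode followed by a register-form ModRM, and that the W bit may be dropped when the 2-byte VEX prefix applies, which holds for the 0F-map instructions shown here (vmovaps/vmovapd, vaddps/vaddpd) but not in general.

```python
def evex_to_vex(evex: bytes) -> bytes:
    """Repack a register-only EVEX encoding into a VEX encoding (sketch)."""
    if len(evex) < 6 or evex[0] != 0x62:
        raise ValueError("not an EVEX-prefixed instruction")
    p0, p1, p2 = evex[1], evex[2], evex[3]
    rest = evex[4:]                      # opcode byte, ModRM byte, ...

    rxb = p0 >> 5                        # inverted R, X, B extension bits
    r_hi = (p0 >> 4) & 1                 # inverted R' (0 means register 16-31)
    mm = p0 & 0b11                       # opcode map: 1=0F, 2=0F38, 3=0F3A
    w = p1 >> 7
    vvvv = (p1 >> 3) & 0b1111            # inverted second source register
    pp = p1 & 0b11                       # implied legacy prefix
    z = p2 >> 7                          # zeroing-masking
    ll = (p2 >> 5) & 0b11                # vector length: 0=128, 1=256, 2=512
    b = (p2 >> 4) & 1                    # broadcast / rounding control
    v_hi = (p2 >> 3) & 1                 # inverted V' (0 means register 16-31)
    aaa = p2 & 0b111                     # mask register k0-k7

    # Anything that only EVEX can express must stay EVEX.
    if z or b or aaa or ll > 1 or not r_hi or not v_hi or mm == 0:
        raise ValueError("instruction requires the EVEX encoding")
    # Assumption: memory operands are out of scope, because EVEX scales disp8.
    if rest[1] >> 6 != 0b11:
        raise ValueError("sketch handles register-form ModRM only")

    low = (vvvv << 3) | (ll << 2) | pp
    if mm == 1 and (rxb & 0b011) == 0b011:
        # 2-byte VEX: only R is encoded; W is dropped (assumed ignored).
        return bytes([0xC5, ((rxb >> 2) << 7) | low]) + rest
    # 3-byte VEX keeps R, X, B, the opcode map, and W.
    return bytes([0xC4, (rxb << 5) | mm, (w << 7) | low]) + rest
```

Applied to the vmovapd, vfmadd213pd, and vaddps lines above, this reproduces the bytes on the right-hand side of the diff, while the masked `{%k1}` forms are rejected and keep their 0x62 prefix.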


declare <2 x double> @llvm.x86.avx512.mask3.vfmsub.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask3.vfmsub.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask3_vfmsub_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask3_vfmsub_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsub_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsub_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm2, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xda]		; CHECK-NEXT: vmovapd %xmm2, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xda]
; CHECK-NEXT: vfmsub231pd %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xba,0xd9]		; CHECK-NEXT: vfmsub231pd %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xba,0xd9]
; CHECK-NEXT: vfmsub213pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0xaa,0xca]		; CHECK-NEXT: vfmsub213pd %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xaa,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask3.vfmsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask3.vfmsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask3.vfmsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask3.vfmsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}


declare <4 x double> @llvm.x86.avx512.mask3.vfmsub.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask3.vfmsub.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask3_vfmsub_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask3_vfmsub_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsub_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsub_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm2, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xda]		; CHECK-NEXT: vmovapd %ymm2, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xda]
; CHECK-NEXT: vfmsub231pd %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0xba,0xd9]		; CHECK-NEXT: vfmsub231pd %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0xba,0xd9]
; CHECK-NEXT: vfmsub213pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0xaa,0xca]		; CHECK-NEXT: vfmsub213pd %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0xaa,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask3.vfmsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask3.vfmsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask3.vfmsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask3.vfmsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask3.vfmsub.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask3.vfmsub.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask3_vfmsub_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask3_vfmsub_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsub_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsub_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm2, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xda]		; CHECK-NEXT: vmovaps %xmm2, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xda]
; CHECK-NEXT: vfmsub231ps %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xba,0xd9]		; CHECK-NEXT: vfmsub231ps %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xba,0xd9]
; CHECK-NEXT: vfmsub213ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0xaa,0xca]		; CHECK-NEXT: vfmsub213ps %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xaa,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask3.vfmsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask3.vfmsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask3.vfmsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask3.vfmsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask3.vfmsub.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask3.vfmsub.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask3_vfmsub_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask3_vfmsub_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsub_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsub_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm2, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xda]		; CHECK-NEXT: vmovaps %ymm2, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xda]
; CHECK-NEXT: vfmsub231ps %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0xba,0xd9]		; CHECK-NEXT: vfmsub231ps %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0xba,0xd9]
; CHECK-NEXT: vfmsub213ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0xaa,0xca]		; CHECK-NEXT: vfmsub213ps %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0xaa,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask3.vfmsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask3.vfmsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask3.vfmsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask3.vfmsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask.vfnmadd.ps.256(<8 x float>, <8 x float>, <8 x float>, i8) nounwind readnone		declare <8 x float> @llvm.x86.avx512.mask.vfnmadd.ps.256(<8 x float>, <8 x float>, <8 x float>, i8) nounwind readnone
[92 lines not shown]
ret <2 x double> %res		ret <2 x double> %res
}		}


define <2 x double>@test_int_x86_avx512_mask_vfnmsub_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask_vfnmsub_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfnmsub_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vfnmsub_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xd8]		; CHECK-NEXT: vmovapd %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xd8]
; CHECK-NEXT: vfnmsub132pd %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x09,0x9e,0xd9]		; CHECK-NEXT: vfnmsub132pd %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x09,0x9e,0xd9]
; CHECK-NEXT: vfnmsub213pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0xae,0xca]		; CHECK-NEXT: vfnmsub213pd %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xae,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.vfnmsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask.vfnmsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask.vfnmsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.vfnmsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <2 x double> @llvm.x86.avx512.mask3.vfnmsub.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask3.vfnmsub.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask3_vfnmsub_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask3_vfnmsub_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfnmsub_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfnmsub_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm2, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xda]		; CHECK-NEXT: vmovapd %xmm2, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xda]
; CHECK-NEXT: vfnmsub231pd %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xbe,0xd9]		; CHECK-NEXT: vfnmsub231pd %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xbe,0xd9]
; CHECK-NEXT: vfnmsub213pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0xae,0xca]		; CHECK-NEXT: vfnmsub213pd %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xae,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask3.vfnmsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask3.vfnmsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask3.vfnmsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask3.vfnmsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

define <4 x double>@test_int_x86_avx512_mask_vfnmsub_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_vfnmsub_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfnmsub_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vfnmsub_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xd8]		; CHECK-NEXT: vmovapd %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xd8]
; CHECK-NEXT: vfnmsub132pd %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x29,0x9e,0xd9]		; CHECK-NEXT: vfnmsub132pd %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x29,0x9e,0xd9]
; CHECK-NEXT: vfnmsub213pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0xae,0xca]		; CHECK-NEXT: vfnmsub213pd %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0xae,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.vfnmsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.vfnmsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.vfnmsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.vfnmsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask3.vfnmsub.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask3.vfnmsub.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask3_vfnmsub_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask3_vfnmsub_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfnmsub_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfnmsub_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm2, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xda]		; CHECK-NEXT: vmovapd %ymm2, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xda]
; CHECK-NEXT: vfnmsub231pd %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0xbe,0xd9]		; CHECK-NEXT: vfnmsub231pd %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0xbe,0xd9]
; CHECK-NEXT: vfnmsub213pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0xae,0xca]		; CHECK-NEXT: vfnmsub213pd %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0xae,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask3.vfnmsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask3.vfnmsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask3.vfnmsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask3.vfnmsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

define <4 x float>@test_int_x86_avx512_mask_vfnmsub_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_vfnmsub_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfnmsub_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vfnmsub_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xd8]		; CHECK-NEXT: vmovaps %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xd8]
; CHECK-NEXT: vfnmsub132ps %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x09,0x9e,0xd9]		; CHECK-NEXT: vfnmsub132ps %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x09,0x9e,0xd9]
; CHECK-NEXT: vfnmsub213ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0xae,0xca]		; CHECK-NEXT: vfnmsub213ps %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xae,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vfnmsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.vfnmsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.vfnmsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.vfnmsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask3.vfnmsub.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask3.vfnmsub.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask3_vfnmsub_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask3_vfnmsub_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfnmsub_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfnmsub_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm2, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xda]		; CHECK-NEXT: vmovaps %xmm2, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xda]
; CHECK-NEXT: vfnmsub231ps %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xbe,0xd9]		; CHECK-NEXT: vfnmsub231ps %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xbe,0xd9]
; CHECK-NEXT: vfnmsub213ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0xae,0xca]		; CHECK-NEXT: vfnmsub213ps %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xae,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask3.vfnmsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask3.vfnmsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask3.vfnmsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask3.vfnmsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

define <8 x float>@test_int_x86_avx512_mask_vfnmsub_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_vfnmsub_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfnmsub_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vfnmsub_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xd8]		; CHECK-NEXT: vmovaps %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xd8]
; CHECK-NEXT: vfnmsub132ps %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x29,0x9e,0xd9]		; CHECK-NEXT: vfnmsub132ps %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x29,0x9e,0xd9]
; CHECK-NEXT: vfnmsub213ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0xae,0xca]		; CHECK-NEXT: vfnmsub213ps %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0xae,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.vfnmsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.vfnmsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.vfnmsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.vfnmsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask3.vfnmsub.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask3.vfnmsub.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask3_vfnmsub_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask3_vfnmsub_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfnmsub_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfnmsub_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm2, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xda]		; CHECK-NEXT: vmovaps %ymm2, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xda]
; CHECK-NEXT: vfnmsub231ps %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0xbe,0xd9]		; CHECK-NEXT: vfnmsub231ps %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0xbe,0xd9]
; CHECK-NEXT: vfnmsub213ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0xae,0xca]		; CHECK-NEXT: vfnmsub213ps %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0xae,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask3.vfnmsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask3.vfnmsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask3.vfnmsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask3.vfnmsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

define <2 x double>@test_int_x86_avx512_mask_vfnmadd_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask_vfnmadd_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfnmadd_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vfnmadd_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xd8]		; CHECK-NEXT: vmovapd %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xd8]
; CHECK-NEXT: vfnmadd132pd %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x09,0x9c,0xd9]		; CHECK-NEXT: vfnmadd132pd %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x09,0x9c,0xd9]
; CHECK-NEXT: vfnmadd213pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0xac,0xca]		; CHECK-NEXT: vfnmadd213pd %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xac,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.vfnmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask.vfnmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask.vfnmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.vfnmadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

define <4 x double>@test_int_x86_avx512_mask_vfnmadd_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_vfnmadd_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfnmadd_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vfnmadd_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xd8]		; CHECK-NEXT: vmovapd %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xd8]
; CHECK-NEXT: vfnmadd132pd %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x29,0x9c,0xd9]		; CHECK-NEXT: vfnmadd132pd %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x29,0x9c,0xd9]
; CHECK-NEXT: vfnmadd213pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0xac,0xca]		; CHECK-NEXT: vfnmadd213pd %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0xac,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.vfnmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.vfnmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.vfnmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.vfnmadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

define <4 x float>@test_int_x86_avx512_mask_vfnmadd_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_vfnmadd_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfnmadd_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vfnmadd_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xd8]		; CHECK-NEXT: vmovaps %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xd8]
; CHECK-NEXT: vfnmadd132ps %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x09,0x9c,0xd9]		; CHECK-NEXT: vfnmadd132ps %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x09,0x9c,0xd9]
; CHECK-NEXT: vfnmadd213ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0xac,0xca]		; CHECK-NEXT: vfnmadd213ps %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xac,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vfnmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.vfnmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.vfnmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.vfnmadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

define <8 x float>@test_int_x86_avx512_mask_vfnmadd_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_vfnmadd_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfnmadd_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vfnmadd_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xd8]		; CHECK-NEXT: vmovaps %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xd8]
; CHECK-NEXT: vfnmadd132ps %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x29,0x9c,0xd9]		; CHECK-NEXT: vfnmadd132ps %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x29,0x9c,0xd9]
; CHECK-NEXT: vfnmadd213ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0xac,0xca]		; CHECK-NEXT: vfnmadd213ps %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0xac,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.vfnmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.vfnmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.vfnmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.vfnmadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask.vfmaddsub.ps.256(<8 x float>, <8 x float>, <8 x float>, i8) nounwind readnone		declare <8 x float> @llvm.x86.avx512.mask.vfmaddsub.ps.256(<8 x float>, <8 x float>, <8 x float>, i8) nounwind readnone
[43 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.vfmaddsub.pd.128(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2, i8 %mask) nounwind		%res = call <2 x double> @llvm.x86.avx512.mask.vfmaddsub.pd.128(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2, i8 %mask) nounwind
ret <2 x double> %res		ret <2 x double> %res
}		}

define <2 x double>@test_int_x86_avx512_mask_vfmaddsub_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask_vfmaddsub_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfmaddsub_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vfmaddsub_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xd8]		; CHECK-NEXT: vmovapd %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xd8]
; CHECK-NEXT: vfmaddsub132pd %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x09,0x96,0xd9]		; CHECK-NEXT: vfmaddsub132pd %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x09,0x96,0xd9]
; CHECK-NEXT: vfmaddsub213pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0xa6,0xca]		; CHECK-NEXT: vfmaddsub213pd %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xa6,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.vfmaddsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask.vfmaddsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask.vfmaddsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.vfmaddsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <2 x double> @llvm.x86.avx512.mask3.vfmaddsub.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask3.vfmaddsub.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask3_vfmaddsub_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask3_vfmaddsub_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmaddsub_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmaddsub_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm2, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xda]		; CHECK-NEXT: vmovapd %xmm2, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xda]
; CHECK-NEXT: vfmaddsub231pd %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xb6,0xd9]		; CHECK-NEXT: vfmaddsub231pd %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xb6,0xd9]
; CHECK-NEXT: vfmaddsub213pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0xa6,0xca]		; CHECK-NEXT: vfmaddsub213pd %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xa6,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask3.vfmaddsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask3.vfmaddsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask3.vfmaddsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask3.vfmaddsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <2 x double> @llvm.x86.avx512.maskz.vfmaddsub.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.maskz.vfmaddsub.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_maskz_vfmaddsub_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_maskz_vfmaddsub_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_maskz_vfmaddsub_pd_128:		; CHECK-LABEL: test_int_x86_avx512_maskz_vfmaddsub_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm1, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xd9]		; CHECK-NEXT: vmovapd %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xd9]
; CHECK-NEXT: vfmaddsub213pd %xmm2, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0xa6,0xda]		; CHECK-NEXT: vfmaddsub213pd %xmm2, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0xa6,0xda]
; CHECK-NEXT: vfmaddsub213pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0xa6,0xca]		; CHECK-NEXT: vfmaddsub213pd %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xa6,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.maskz.vfmaddsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.maskz.vfmaddsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.maskz.vfmaddsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.maskz.vfmaddsub.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

define <4 x double>@test_int_x86_avx512_mask_vfmaddsub_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_vfmaddsub_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfmaddsub_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vfmaddsub_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xd8]		; CHECK-NEXT: vmovapd %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xd8]
; CHECK-NEXT: vfmaddsub132pd %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x29,0x96,0xd9]		; CHECK-NEXT: vfmaddsub132pd %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xed,0x29,0x96,0xd9]
; CHECK-NEXT: vfmaddsub213pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0xa6,0xca]		; CHECK-NEXT: vfmaddsub213pd %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0xa6,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.vfmaddsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.vfmaddsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.vfmaddsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.vfmaddsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask3.vfmaddsub.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask3.vfmaddsub.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask3_vfmaddsub_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask3_vfmaddsub_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmaddsub_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmaddsub_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm2, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xda]		; CHECK-NEXT: vmovapd %ymm2, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xda]
; CHECK-NEXT: vfmaddsub231pd %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0xb6,0xd9]		; CHECK-NEXT: vfmaddsub231pd %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0xb6,0xd9]
; CHECK-NEXT: vfmaddsub213pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0xa6,0xca]		; CHECK-NEXT: vfmaddsub213pd %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0xa6,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask3.vfmaddsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask3.vfmaddsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask3.vfmaddsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask3.vfmaddsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.maskz.vfmaddsub.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.maskz.vfmaddsub.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_maskz_vfmaddsub_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_maskz_vfmaddsub_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_maskz_vfmaddsub_pd_256:		; CHECK-LABEL: test_int_x86_avx512_maskz_vfmaddsub_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm1, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xd9]		; CHECK-NEXT: vmovapd %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xd9]
; CHECK-NEXT: vfmaddsub213pd %ymm2, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0xa6,0xda]		; CHECK-NEXT: vfmaddsub213pd %ymm2, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0xa6,0xda]
; CHECK-NEXT: vfmaddsub213pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0xa6,0xca]		; CHECK-NEXT: vfmaddsub213pd %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0xa6,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.maskz.vfmaddsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.maskz.vfmaddsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.maskz.vfmaddsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.maskz.vfmaddsub.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

define <4 x float>@test_int_x86_avx512_mask_vfmaddsub_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_vfmaddsub_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfmaddsub_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vfmaddsub_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xd8]		; CHECK-NEXT: vmovaps %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xd8]
; CHECK-NEXT: vfmaddsub132ps %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x09,0x96,0xd9]		; CHECK-NEXT: vfmaddsub132ps %xmm1, %xmm2, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x09,0x96,0xd9]
; CHECK-NEXT: vfmaddsub213ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0xa6,0xca]		; CHECK-NEXT: vfmaddsub213ps %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xa6,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vfmaddsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.vfmaddsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.vfmaddsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.vfmaddsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask3.vfmaddsub.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask3.vfmaddsub.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask3_vfmaddsub_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask3_vfmaddsub_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmaddsub_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmaddsub_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm2, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xda]		; CHECK-NEXT: vmovaps %xmm2, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xda]
; CHECK-NEXT: vfmaddsub231ps %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xb6,0xd9]		; CHECK-NEXT: vfmaddsub231ps %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xb6,0xd9]
; CHECK-NEXT: vfmaddsub213ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0xa6,0xca]		; CHECK-NEXT: vfmaddsub213ps %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xa6,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask3.vfmaddsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask3.vfmaddsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask3.vfmaddsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask3.vfmaddsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <4 x float> @llvm.x86.avx512.maskz.vfmaddsub.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.maskz.vfmaddsub.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_maskz_vfmaddsub_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_maskz_vfmaddsub_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_maskz_vfmaddsub_ps_128:		; CHECK-LABEL: test_int_x86_avx512_maskz_vfmaddsub_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm1, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xd9]		; CHECK-NEXT: vmovaps %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xd9]
; CHECK-NEXT: vfmaddsub213ps %xmm2, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0xa6,0xda]		; CHECK-NEXT: vfmaddsub213ps %xmm2, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0xa6,0xda]
; CHECK-NEXT: vfmaddsub213ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0xa6,0xca]		; CHECK-NEXT: vfmaddsub213ps %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xa6,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.maskz.vfmaddsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.maskz.vfmaddsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.maskz.vfmaddsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.maskz.vfmaddsub.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

define <8 x float>@test_int_x86_avx512_mask_vfmaddsub_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_vfmaddsub_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vfmaddsub_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vfmaddsub_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xd8]		; CHECK-NEXT: vmovaps %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xd8]
; CHECK-NEXT: vfmaddsub132ps %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x29,0x96,0xd9]		; CHECK-NEXT: vfmaddsub132ps %ymm1, %ymm2, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x6d,0x29,0x96,0xd9]
; CHECK-NEXT: vfmaddsub213ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0xa6,0xca]		; CHECK-NEXT: vfmaddsub213ps %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0xa6,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.vfmaddsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.vfmaddsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.vfmaddsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.vfmaddsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask3.vfmaddsub.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask3.vfmaddsub.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask3_vfmaddsub_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask3_vfmaddsub_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmaddsub_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmaddsub_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm2, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xda]		; CHECK-NEXT: vmovaps %ymm2, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xda]
; CHECK-NEXT: vfmaddsub231ps %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0xb6,0xd9]		; CHECK-NEXT: vfmaddsub231ps %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0xb6,0xd9]
; CHECK-NEXT: vfmaddsub213ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0xa6,0xca]		; CHECK-NEXT: vfmaddsub213ps %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0xa6,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask3.vfmaddsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask3.vfmaddsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask3.vfmaddsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask3.vfmaddsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.maskz.vfmaddsub.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.maskz.vfmaddsub.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_maskz_vfmaddsub_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_maskz_vfmaddsub_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_maskz_vfmaddsub_ps_256:		; CHECK-LABEL: test_int_x86_avx512_maskz_vfmaddsub_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm1, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xd9]		; CHECK-NEXT: vmovaps %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xd9]
; CHECK-NEXT: vfmaddsub213ps %ymm2, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0xa6,0xda]		; CHECK-NEXT: vfmaddsub213ps %ymm2, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0xa6,0xda]
; CHECK-NEXT: vfmaddsub213ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0xa6,0xca]		; CHECK-NEXT: vfmaddsub213ps %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0xa6,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.maskz.vfmaddsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.maskz.vfmaddsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.maskz.vfmaddsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.maskz.vfmaddsub.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <2 x double> @llvm.x86.avx512.mask3.vfmsubadd.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask3.vfmsubadd.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask3_vfmsubadd_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask3_vfmsubadd_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsubadd_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsubadd_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm2, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xda]		; CHECK-NEXT: vmovapd %xmm2, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xda]
; CHECK-NEXT: vfmsubadd231pd %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xb7,0xd9]		; CHECK-NEXT: vfmsubadd231pd %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xb7,0xd9]
; CHECK-NEXT: vfmsubadd213pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0xa7,0xca]		; CHECK-NEXT: vfmsubadd213pd %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xa7,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask3.vfmsubadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask3.vfmsubadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask3.vfmsubadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask3.vfmsubadd.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2=fadd <2 x double> %res, %res1		%res2=fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask3.vfmsubadd.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask3.vfmsubadd.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask3_vfmsubadd_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask3_vfmsubadd_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsubadd_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsubadd_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm2, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xda]		; CHECK-NEXT: vmovapd %ymm2, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xda]
; CHECK-NEXT: vfmsubadd231pd %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0xb7,0xd9]		; CHECK-NEXT: vfmsubadd231pd %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0xb7,0xd9]
; CHECK-NEXT: vfmsubadd213pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0xa7,0xca]		; CHECK-NEXT: vfmsubadd213pd %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0xa7,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask3.vfmsubadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask3.vfmsubadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask3.vfmsubadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask3.vfmsubadd.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2=fadd <4 x double> %res, %res1		%res2=fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask3.vfmsubadd.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask3.vfmsubadd.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask3_vfmsubadd_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask3_vfmsubadd_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsubadd_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsubadd_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm2, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xda]		; CHECK-NEXT: vmovaps %xmm2, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xda]
; CHECK-NEXT: vfmsubadd231ps %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xb7,0xd9]		; CHECK-NEXT: vfmsubadd231ps %xmm1, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xb7,0xd9]
; CHECK-NEXT: vfmsubadd213ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0xa7,0xca]		; CHECK-NEXT: vfmsubadd213ps %xmm2, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xa7,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask3.vfmsubadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask3.vfmsubadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask3.vfmsubadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask3.vfmsubadd.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2=fadd <4 x float> %res, %res1		%res2=fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask3.vfmsubadd.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask3.vfmsubadd.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask3_vfmsubadd_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask3_vfmsubadd_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsubadd_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask3_vfmsubadd_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm2, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xda]		; CHECK-NEXT: vmovaps %ymm2, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xda]
; CHECK-NEXT: vfmsubadd231ps %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0xb7,0xd9]		; CHECK-NEXT: vfmsubadd231ps %ymm1, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0xb7,0xd9]
; CHECK-NEXT: vfmsubadd213ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0xa7,0xca]		; CHECK-NEXT: vfmsubadd213ps %ymm2, %ymm0, %ymm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0xa7,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask3.vfmsubadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask3.vfmsubadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask3.vfmsubadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask3.vfmsubadd.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2=fadd <8 x float> %res, %res1		%res2=fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}


define <4 x float> @test_mask_vfmadd128_ps_r(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2, i8 %mask) {		define <4 x float> @test_mask_vfmadd128_ps_r(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2, i8 %mask) {
; CHECK-LABEL: test_mask_vfmadd128_ps_r:		; CHECK-LABEL: test_mask_vfmadd128_ps_r:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vfmadd132ps %xmm1, %xmm2, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x6d,0x09,0x98,0xc1]		; CHECK-NEXT: vfmadd132ps %xmm1, %xmm2, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x6d,0x09,0x98,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2, i8 %mask) nounwind		%res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2, i8 %mask) nounwind
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_vfmadd128_ps_rz(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2) {		define <4 x float> @test_mask_vfmadd128_ps_rz(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2) {
; CHECK-LABEL: test_mask_vfmadd128_ps_rz:		; CHECK-LABEL: test_mask_vfmadd128_ps_rz:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vfmadd213ps %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf2,0x75,0x08,0xa8,0xc2]		; CHECK-NEXT: vfmadd213ps %xmm2, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x71,0xa8,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2, i8 -1) nounwind		%res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2, i8 -1) nounwind
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_vfmadd128_ps_rmk(<4 x float> %a0, <4 x float> %a1, <4 x float>* %ptr_a2, i8 %mask) {		define <4 x float> @test_mask_vfmadd128_ps_rmk(<4 x float> %a0, <4 x float> %a1, <4 x float>* %ptr_a2, i8 %mask) {
; CHECK-LABEL: test_mask_vfmadd128_ps_rmk:		; CHECK-LABEL: test_mask_vfmadd128_ps_rmk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; ... 14 lines not shown ...
; CHECK-NEXT: retq ## encoding: [0xc3]
%a2 = load <4 x float>, <4 x float>* %ptr_a2, align 8		%a2 = load <4 x float>, <4 x float>* %ptr_a2, align 8
%res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2, i8 %mask) nounwind		%res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2, i8 %mask) nounwind
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_vfmadd128_ps_rmkz(<4 x float> %a0, <4 x float> %a1, <4 x float>* %ptr_a2) {		define <4 x float> @test_mask_vfmadd128_ps_rmkz(<4 x float> %a0, <4 x float> %a1, <4 x float>* %ptr_a2) {
; CHECK-LABEL: test_mask_vfmadd128_ps_rmkz:		; CHECK-LABEL: test_mask_vfmadd128_ps_rmkz:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vfmadd213ps (%rdi), %xmm1, %xmm0 ## encoding: [0x62,0xf2,0x75,0x08,0xa8,0x07]		; CHECK-NEXT: vfmadd213ps (%rdi), %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x71,0xa8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%a2 = load <4 x float>, <4 x float>* %ptr_a2		%a2 = load <4 x float>, <4 x float>* %ptr_a2
%res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2, i8 -1) nounwind		%res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2, i8 -1) nounwind
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_vfmadd128_ps_rmkza(<4 x float> %a0, <4 x float> %a1, <4 x float>* %ptr_a2) {		define <4 x float> @test_mask_vfmadd128_ps_rmkza(<4 x float> %a0, <4 x float> %a1, <4 x float>* %ptr_a2) {
; CHECK-LABEL: test_mask_vfmadd128_ps_rmkza:		; CHECK-LABEL: test_mask_vfmadd128_ps_rmkza:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vfmadd213ps (%rdi), %xmm1, %xmm0 ## encoding: [0x62,0xf2,0x75,0x08,0xa8,0x07]		; CHECK-NEXT: vfmadd213ps (%rdi), %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x71,0xa8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%a2 = load <4 x float>, <4 x float>* %ptr_a2, align 4		%a2 = load <4 x float>, <4 x float>* %ptr_a2, align 4
%res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2, i8 -1) nounwind		%res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %a2, i8 -1) nounwind
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_vfmadd128_ps_rmb(<4 x float> %a0, <4 x float> %a1, float* %ptr_a2, i8 %mask) {		define <4 x float> @test_mask_vfmadd128_ps_rmb(<4 x float> %a0, <4 x float> %a1, float* %ptr_a2, i8 %mask) {
; CHECK-LABEL: test_mask_vfmadd128_ps_rmb:		; CHECK-LABEL: test_mask_vfmadd128_ps_rmb:
; ... 61 lines not shown ...
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2, i8 %mask) nounwind		%res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2, i8 %mask) nounwind
ret <2 x double> %res		ret <2 x double> %res
}		}

define <2 x double> @test_mask_vfmadd128_pd_rz(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2) {		define <2 x double> @test_mask_vfmadd128_pd_rz(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2) {
; CHECK-LABEL: test_mask_vfmadd128_pd_rz:		; CHECK-LABEL: test_mask_vfmadd128_pd_rz:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vfmadd213pd %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf2,0xf5,0x08,0xa8,0xc2]		; CHECK-NEXT: vfmadd213pd %xmm2, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf1,0xa8,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2, i8 -1) nounwind		%res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2, i8 -1) nounwind
ret <2 x double> %res		ret <2 x double> %res
}		}

define <2 x double> @test_mask_vfmadd128_pd_rmk(<2 x double> %a0, <2 x double> %a1, <2 x double>* %ptr_a2, i8 %mask) {		define <2 x double> @test_mask_vfmadd128_pd_rmk(<2 x double> %a0, <2 x double> %a1, <2 x double>* %ptr_a2, i8 %mask) {
; CHECK-LABEL: test_mask_vfmadd128_pd_rmk:		; CHECK-LABEL: test_mask_vfmadd128_pd_rmk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vfmadd213pd (%rdi), %xmm1, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xf5,0x09,0xa8,0x07]		; CHECK-NEXT: vfmadd213pd (%rdi), %xmm1, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xf5,0x09,0xa8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%a2 = load <2 x double>, <2 x double>* %ptr_a2		%a2 = load <2 x double>, <2 x double>* %ptr_a2
%res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2, i8 %mask) nounwind		%res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2, i8 %mask) nounwind
ret <2 x double> %res		ret <2 x double> %res
}		}

define <2 x double> @test_mask_vfmadd128_pd_rmkz(<2 x double> %a0, <2 x double> %a1, <2 x double>* %ptr_a2) {		define <2 x double> @test_mask_vfmadd128_pd_rmkz(<2 x double> %a0, <2 x double> %a1, <2 x double>* %ptr_a2) {
; CHECK-LABEL: test_mask_vfmadd128_pd_rmkz:		; CHECK-LABEL: test_mask_vfmadd128_pd_rmkz:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vfmadd213pd (%rdi), %xmm1, %xmm0 ## encoding: [0x62,0xf2,0xf5,0x08,0xa8,0x07]		; CHECK-NEXT: vfmadd213pd (%rdi), %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf1,0xa8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%a2 = load <2 x double>, <2 x double>* %ptr_a2		%a2 = load <2 x double>, <2 x double>* %ptr_a2
%res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2, i8 -1) nounwind		%res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.pd.128(<2 x double> %a0, <2 x double> %a1, <2 x double> %a2, i8 -1) nounwind
ret <2 x double> %res		ret <2 x double> %res
}		}

define <4 x double> @test_mask_vfmadd256_pd_r(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2, i8 %mask) {		define <4 x double> @test_mask_vfmadd256_pd_r(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2, i8 %mask) {
; CHECK-LABEL: test_mask_vfmadd256_pd_r:		; CHECK-LABEL: test_mask_vfmadd256_pd_r:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vfmadd132pd %ymm1, %ymm2, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xed,0x29,0x98,0xc1]		; CHECK-NEXT: vfmadd132pd %ymm1, %ymm2, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xed,0x29,0x98,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.vfmadd.pd.256(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2, i8 %mask) nounwind		%res = call <4 x double> @llvm.x86.avx512.mask.vfmadd.pd.256(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2, i8 %mask) nounwind
ret <4 x double> %res		ret <4 x double> %res
}		}

define <4 x double> @test_mask_vfmadd256_pd_rz(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2) {		define <4 x double> @test_mask_vfmadd256_pd_rz(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2) {
; CHECK-LABEL: test_mask_vfmadd256_pd_rz:		; CHECK-LABEL: test_mask_vfmadd256_pd_rz:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vfmadd213pd %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0xf5,0x28,0xa8,0xc2]		; CHECK-NEXT: vfmadd213pd %ymm2, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf5,0xa8,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.vfmadd.pd.256(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2, i8 -1) nounwind		%res = call <4 x double> @llvm.x86.avx512.mask.vfmadd.pd.256(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2, i8 -1) nounwind
ret <4 x double> %res		ret <4 x double> %res
}		}

define <4 x double> @test_mask_vfmadd256_pd_rmk(<4 x double> %a0, <4 x double> %a1, <4 x double>* %ptr_a2, i8 %mask) {		define <4 x double> @test_mask_vfmadd256_pd_rmk(<4 x double> %a0, <4 x double> %a1, <4 x double>* %ptr_a2, i8 %mask) {
; CHECK-LABEL: test_mask_vfmadd256_pd_rmk:		; CHECK-LABEL: test_mask_vfmadd256_pd_rmk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vfmadd213pd (%rdi), %ymm1, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xf5,0x29,0xa8,0x07]		; CHECK-NEXT: vfmadd213pd (%rdi), %ymm1, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xf5,0x29,0xa8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%a2 = load <4 x double>, <4 x double>* %ptr_a2		%a2 = load <4 x double>, <4 x double>* %ptr_a2
%res = call <4 x double> @llvm.x86.avx512.mask.vfmadd.pd.256(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2, i8 %mask) nounwind		%res = call <4 x double> @llvm.x86.avx512.mask.vfmadd.pd.256(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2, i8 %mask) nounwind
ret <4 x double> %res		ret <4 x double> %res
}		}

define <4 x double> @test_mask_vfmadd256_pd_rmkz(<4 x double> %a0, <4 x double> %a1, <4 x double>* %ptr_a2) {		define <4 x double> @test_mask_vfmadd256_pd_rmkz(<4 x double> %a0, <4 x double> %a1, <4 x double>* %ptr_a2) {
; CHECK-LABEL: test_mask_vfmadd256_pd_rmkz:		; CHECK-LABEL: test_mask_vfmadd256_pd_rmkz:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vfmadd213pd (%rdi), %ymm1, %ymm0 ## encoding: [0x62,0xf2,0xf5,0x28,0xa8,0x07]		; CHECK-NEXT: vfmadd213pd (%rdi), %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf5,0xa8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%a2 = load <4 x double>, <4 x double>* %ptr_a2		%a2 = load <4 x double>, <4 x double>* %ptr_a2
%res = call <4 x double> @llvm.x86.avx512.mask.vfmadd.pd.256(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2, i8 -1) nounwind		%res = call <4 x double> @llvm.x86.avx512.mask.vfmadd.pd.256(<4 x double> %a0, <4 x double> %a1, <4 x double> %a2, i8 -1) nounwind
ret <4 x double> %res		ret <4 x double> %res
}		}

define <8 x i16> @test_mask_packs_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {		define <8 x i16> @test_mask_packs_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {
; CHECK-LABEL: test_mask_packs_epi32_rr_128:		; CHECK-LABEL: test_mask_packs_epi32_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6b,0xc1]		; CHECK-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6b,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_packs_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_packs_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_packs_epi32_rrk_128:		; CHECK-LABEL: test_mask_packs_epi32_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpackssdw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x6b,0xd1]		; CHECK-NEXT: vpackssdw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x6b,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_packs_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {		define <8 x i16> @test_mask_packs_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_packs_epi32_rrkz_128:		; CHECK-LABEL: test_mask_packs_epi32_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x6b,0xc1]		; CHECK-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x6b,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_packs_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {		define <8 x i16> @test_mask_packs_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_packs_epi32_rm_128:		; CHECK-LABEL: test_mask_packs_epi32_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpackssdw (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6b,0x07]		; CHECK-NEXT: vpackssdw (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6b,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_packs_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_packs_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_packs_epi32_rmk_128:		; CHECK-LABEL: test_mask_packs_epi32_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpackssdw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x6b,0x0f]		; CHECK-NEXT: vpackssdw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x6b,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_packs_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {		define <8 x i16> @test_mask_packs_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_packs_epi32_rmkz_128:		; CHECK-LABEL: test_mask_packs_epi32_rmkz_128:
; ... (18 unchanged lines not shown) ...
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_packs_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_packs_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_packs_epi32_rmbk_128:		; CHECK-LABEL: test_mask_packs_epi32_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpackssdw (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0x6b,0x0f]		; CHECK-NEXT: vpackssdw (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0x6b,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0
%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer
%res = call <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

; ... (10 unchanged lines not shown) ...
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <8 x i16> %res		ret <8 x i16> %res
}		}

declare <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32>, <4 x i32>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.packssdw.128(<4 x i32>, <4 x i32>, <8 x i16>, i8)

define <16 x i16> @test_mask_packs_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {		define <16 x i16> @test_mask_packs_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {
; CHECK-LABEL: test_mask_packs_epi32_rr_256:		; CHECK-LABEL: test_mask_packs_epi32_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x6b,0xc1]		; CHECK-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6b,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_packs_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_packs_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_packs_epi32_rrk_256:		; CHECK-LABEL: test_mask_packs_epi32_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpackssdw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x6b,0xd1]		; CHECK-NEXT: vpackssdw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x6b,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_packs_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i16 %mask) {		define <16 x i16> @test_mask_packs_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i16 %mask) {
; CHECK-LABEL: test_mask_packs_epi32_rrkz_256:		; CHECK-LABEL: test_mask_packs_epi32_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x6b,0xc1]		; CHECK-NEXT: vpackssdw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x6b,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_packs_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {		define <16 x i16> @test_mask_packs_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_packs_epi32_rm_256:		; CHECK-LABEL: test_mask_packs_epi32_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpackssdw (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x6b,0x07]		; CHECK-NEXT: vpackssdw (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6b,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_packs_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_packs_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_packs_epi32_rmk_256:		; CHECK-LABEL: test_mask_packs_epi32_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpackssdw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x6b,0x0f]		; CHECK-NEXT: vpackssdw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x6b,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_packs_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i16 %mask) {		define <16 x i16> @test_mask_packs_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_packs_epi32_rmkz_256:		; CHECK-LABEL: test_mask_packs_epi32_rmkz_256:
; ... (18 unchanged lines not shown) ...
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_packs_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_packs_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_packs_epi32_rmbk_256:		; CHECK-LABEL: test_mask_packs_epi32_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpackssdw (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0x6b,0x0f]		; CHECK-NEXT: vpackssdw (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0x6b,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0
%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer		%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer
%res = call <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

; ... (10 unchanged lines not shown) ...
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <16 x i16> %res		ret <16 x i16> %res
}		}

declare <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32>, <8 x i32>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.packssdw.256(<8 x i32>, <8 x i32>, <16 x i16>, i16)

define <16 x i8> @test_mask_packs_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {		define <16 x i8> @test_mask_packs_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {
; CHECK-LABEL: test_mask_packs_epi16_rr_128:		; CHECK-LABEL: test_mask_packs_epi16_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x63,0xc1]		; CHECK-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x63,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 -1)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_packs_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <16 x i8> %passThru, i16 %mask) {		define <16 x i8> @test_mask_packs_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <16 x i8> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_packs_epi16_rrk_128:		; CHECK-LABEL: test_mask_packs_epi16_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpacksswb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x63,0xd1]		; CHECK-NEXT: vpacksswb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x63,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> %passThru, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> %passThru, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_packs_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i16 %mask) {		define <16 x i8> @test_mask_packs_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i16 %mask) {
; CHECK-LABEL: test_mask_packs_epi16_rrkz_128:		; CHECK-LABEL: test_mask_packs_epi16_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x63,0xc1]		; CHECK-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x63,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_packs_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {		define <16 x i8> @test_mask_packs_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_packs_epi16_rm_128:		; CHECK-LABEL: test_mask_packs_epi16_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpacksswb (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x63,0x07]		; CHECK-NEXT: vpacksswb (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x63,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 -1)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_packs_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <16 x i8> %passThru, i16 %mask) {		define <16 x i8> @test_mask_packs_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <16 x i8> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_packs_epi16_rmk_128:		; CHECK-LABEL: test_mask_packs_epi16_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpacksswb (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x63,0x0f]		; CHECK-NEXT: vpacksswb (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x63,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> %passThru, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> %passThru, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_packs_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i16 %mask) {		define <16 x i8> @test_mask_packs_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_packs_epi16_rmkz_128:		; CHECK-LABEL: test_mask_packs_epi16_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpacksswb (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x63,0x07]		; CHECK-NEXT: vpacksswb (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x63,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

declare <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16>, <8 x i16>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.packsswb.128(<8 x i16>, <8 x i16>, <16 x i8>, i16)

define <32 x i8> @test_mask_packs_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {		define <32 x i8> @test_mask_packs_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {
; CHECK-LABEL: test_mask_packs_epi16_rr_256:		; CHECK-LABEL: test_mask_packs_epi16_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x63,0xc1]		; CHECK-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x63,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 -1)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_packs_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <32 x i8> %passThru, i32 %mask) {		define <32 x i8> @test_mask_packs_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <32 x i8> %passThru, i32 %mask) {
; CHECK-LABEL: test_mask_packs_epi16_rrk_256:		; CHECK-LABEL: test_mask_packs_epi16_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpacksswb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x63,0xd1]		; CHECK-NEXT: vpacksswb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x63,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> %passThru, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> %passThru, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_packs_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i32 %mask) {		define <32 x i8> @test_mask_packs_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i32 %mask) {
; CHECK-LABEL: test_mask_packs_epi16_rrkz_256:		; CHECK-LABEL: test_mask_packs_epi16_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x63,0xc1]		; CHECK-NEXT: vpacksswb %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x63,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_packs_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {		define <32 x i8> @test_mask_packs_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_packs_epi16_rm_256:		; CHECK-LABEL: test_mask_packs_epi16_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpacksswb (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x63,0x07]		; CHECK-NEXT: vpacksswb (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x63,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 -1)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_packs_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <32 x i8> %passThru, i32 %mask) {		define <32 x i8> @test_mask_packs_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <32 x i8> %passThru, i32 %mask) {
; CHECK-LABEL: test_mask_packs_epi16_rmk_256:		; CHECK-LABEL: test_mask_packs_epi16_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]		; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]
; CHECK-NEXT: vpacksswb (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x63,0x0f]		; CHECK-NEXT: vpacksswb (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x63,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> %passThru, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> %passThru, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_packs_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i32 %mask) {		define <32 x i8> @test_mask_packs_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i32 %mask) {
; CHECK-LABEL: test_mask_packs_epi16_rmkz_256:		; CHECK-LABEL: test_mask_packs_epi16_rmkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]		; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]
; CHECK-NEXT: vpacksswb (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x63,0x07]		; CHECK-NEXT: vpacksswb (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x63,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

declare <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16>, <16 x i16>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.packsswb.256(<16 x i16>, <16 x i16>, <32 x i8>, i32)


define <8 x i16> @test_mask_packus_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {		define <8 x i16> @test_mask_packus_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {
; CHECK-LABEL: test_mask_packus_epi32_rr_128:		; CHECK-LABEL: test_mask_packus_epi32_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x2b,0xc1]		; CHECK-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x2b,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_packus_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_packus_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_packus_epi32_rrk_128:		; CHECK-LABEL: test_mask_packus_epi32_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpackusdw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x2b,0xd1]		; CHECK-NEXT: vpackusdw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x2b,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_packus_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {		define <8 x i16> @test_mask_packus_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_packus_epi32_rrkz_128:		; CHECK-LABEL: test_mask_packus_epi32_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x2b,0xc1]		; CHECK-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x2b,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_packus_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {		define <8 x i16> @test_mask_packus_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_packus_epi32_rm_128:		; CHECK-LABEL: test_mask_packus_epi32_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpackusdw (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x2b,0x07]		; CHECK-NEXT: vpackusdw (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x2b,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_packus_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_packus_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_packus_epi32_rmk_128:		; CHECK-LABEL: test_mask_packus_epi32_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpackusdw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x2b,0x0f]		; CHECK-NEXT: vpackusdw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x2b,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_packus_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {		define <8 x i16> @test_mask_packus_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_packus_epi32_rmkz_128:		; CHECK-LABEL: test_mask_packus_epi32_rmkz_128:
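; [18 unchanged lines collapsed in the diff view]		; [18 unchanged lines collapsed in the diff view]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]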
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_packus_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_packus_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_packus_epi32_rmbk_128:		; CHECK-LABEL: test_mask_packus_epi32_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpackusdw (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x19,0x2b,0x0f]		; CHECK-NEXT: vpackusdw (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x19,0x2b,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0
%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer
%res = call <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32> %a, <4 x i32> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

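; [10 unchanged lines collapsed in the diff view]		; [10 unchanged lines collapsed in the diff view]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]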
ret <8 x i16> %res		ret <8 x i16> %res
}		}

declare <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32>, <4 x i32>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.packusdw.128(<4 x i32>, <4 x i32>, <8 x i16>, i8)

define <16 x i16> @test_mask_packus_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {		define <16 x i16> @test_mask_packus_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {
; CHECK-LABEL: test_mask_packus_epi32_rr_256:		; CHECK-LABEL: test_mask_packus_epi32_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x2b,0xc1]		; CHECK-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x2b,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_packus_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_packus_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_packus_epi32_rrk_256:		; CHECK-LABEL: test_mask_packus_epi32_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpackusdw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x2b,0xd1]		; CHECK-NEXT: vpackusdw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x2b,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_packus_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i16 %mask) {		define <16 x i16> @test_mask_packus_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i16 %mask) {
; CHECK-LABEL: test_mask_packus_epi32_rrkz_256:		; CHECK-LABEL: test_mask_packus_epi32_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x2b,0xc1]		; CHECK-NEXT: vpackusdw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x2b,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_packus_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {		define <16 x i16> @test_mask_packus_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_packus_epi32_rm_256:		; CHECK-LABEL: test_mask_packus_epi32_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpackusdw (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x2b,0x07]		; CHECK-NEXT: vpackusdw (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x2b,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_packus_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_packus_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_packus_epi32_rmk_256:		; CHECK-LABEL: test_mask_packus_epi32_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpackusdw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x2b,0x0f]		; CHECK-NEXT: vpackusdw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x2b,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_packus_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i16 %mask) {		define <16 x i16> @test_mask_packus_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_packus_epi32_rmkz_256:		; CHECK-LABEL: test_mask_packus_epi32_rmkz_256:
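; [18 unchanged lines collapsed in the diff view]		; [18 unchanged lines collapsed in the diff view]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]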
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_packus_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_packus_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_packus_epi32_rmbk_256:		; CHECK-LABEL: test_mask_packus_epi32_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpackusdw (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x39,0x2b,0x0f]		; CHECK-NEXT: vpackusdw (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x39,0x2b,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0
%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer		%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer
%res = call <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32> %a, <8 x i32> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

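; [10 unchanged lines collapsed in the diff view]		; [10 unchanged lines collapsed in the diff view]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]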
ret <16 x i16> %res		ret <16 x i16> %res
}		}

declare <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32>, <8 x i32>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.packusdw.256(<8 x i32>, <8 x i32>, <16 x i16>, i16)

define <16 x i8> @test_mask_packus_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {		define <16 x i8> @test_mask_packus_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {
; CHECK-LABEL: test_mask_packus_epi16_rr_128:		; CHECK-LABEL: test_mask_packus_epi16_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x67,0xc1]		; CHECK-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x67,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 -1)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_packus_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <16 x i8> %passThru, i16 %mask) {		define <16 x i8> @test_mask_packus_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <16 x i8> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_packus_epi16_rrk_128:		; CHECK-LABEL: test_mask_packus_epi16_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpackuswb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x67,0xd1]		; CHECK-NEXT: vpackuswb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x67,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> %passThru, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> %passThru, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_packus_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i16 %mask) {		define <16 x i8> @test_mask_packus_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i16 %mask) {
; CHECK-LABEL: test_mask_packus_epi16_rrkz_128:		; CHECK-LABEL: test_mask_packus_epi16_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x67,0xc1]		; CHECK-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x67,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_packus_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {		define <16 x i8> @test_mask_packus_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_packus_epi16_rm_128:		; CHECK-LABEL: test_mask_packus_epi16_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpackuswb (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x67,0x07]		; CHECK-NEXT: vpackuswb (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x67,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 -1)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_packus_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <16 x i8> %passThru, i16 %mask) {		define <16 x i8> @test_mask_packus_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <16 x i8> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_packus_epi16_rmk_128:		; CHECK-LABEL: test_mask_packus_epi16_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpackuswb (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x67,0x0f]		; CHECK-NEXT: vpackuswb (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x67,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> %passThru, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> %passThru, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_packus_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i16 %mask) {		define <16 x i8> @test_mask_packus_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_packus_epi16_rmkz_128:		; CHECK-LABEL: test_mask_packus_epi16_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpackuswb (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x67,0x07]		; CHECK-NEXT: vpackuswb (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x67,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16> %a, <8 x i16> %b, <16 x i8> zeroinitializer, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

declare <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16>, <8 x i16>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.packuswb.128(<8 x i16>, <8 x i16>, <16 x i8>, i16)

define <32 x i8> @test_mask_packus_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {		define <32 x i8> @test_mask_packus_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {
; CHECK-LABEL: test_mask_packus_epi16_rr_256:		; CHECK-LABEL: test_mask_packus_epi16_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x67,0xc1]		; CHECK-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x67,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 -1)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_packus_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <32 x i8> %passThru, i32 %mask) {		define <32 x i8> @test_mask_packus_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <32 x i8> %passThru, i32 %mask) {
; CHECK-LABEL: test_mask_packus_epi16_rrk_256:		; CHECK-LABEL: test_mask_packus_epi16_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpackuswb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x67,0xd1]		; CHECK-NEXT: vpackuswb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x67,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> %passThru, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> %passThru, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_packus_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i32 %mask) {		define <32 x i8> @test_mask_packus_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i32 %mask) {
; CHECK-LABEL: test_mask_packus_epi16_rrkz_256:		; CHECK-LABEL: test_mask_packus_epi16_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x67,0xc1]		; CHECK-NEXT: vpackuswb %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x67,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_packus_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {		define <32 x i8> @test_mask_packus_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_packus_epi16_rm_256:		; CHECK-LABEL: test_mask_packus_epi16_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpackuswb (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x67,0x07]		; CHECK-NEXT: vpackuswb (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x67,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 -1)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_packus_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <32 x i8> %passThru, i32 %mask) {		define <32 x i8> @test_mask_packus_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <32 x i8> %passThru, i32 %mask) {
; CHECK-LABEL: test_mask_packus_epi16_rmk_256:		; CHECK-LABEL: test_mask_packus_epi16_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]		; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]
; CHECK-NEXT: vpackuswb (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x67,0x0f]		; CHECK-NEXT: vpackuswb (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x67,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> %passThru, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> %passThru, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_packus_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i32 %mask) {		define <32 x i8> @test_mask_packus_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i32 %mask) {
; CHECK-LABEL: test_mask_packus_epi16_rmkz_256:		; CHECK-LABEL: test_mask_packus_epi16_rmkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]		; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]
; CHECK-NEXT: vpackuswb (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x67,0x07]		; CHECK-NEXT: vpackuswb (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x67,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16> %a, <16 x i16> %b, <32 x i8> zeroinitializer, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

declare <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16>, <16 x i16>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.packuswb.256(<16 x i16>, <16 x i16>, <32 x i8>, i32)

define <8 x i16> @test_mask_adds_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {		define <8 x i16> @test_mask_adds_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {
; CHECK-LABEL: test_mask_adds_epi16_rr_128:		; CHECK-LABEL: test_mask_adds_epi16_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xed,0xc1]		; CHECK-NEXT: vpaddsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xed,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_adds_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_adds_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_adds_epi16_rrk_128:		; CHECK-LABEL: test_mask_adds_epi16_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddsw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xed,0xd1]		; CHECK-NEXT: vpaddsw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xed,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_adds_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {		define <8 x i16> @test_mask_adds_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {
; CHECK-LABEL: test_mask_adds_epi16_rrkz_128:		; CHECK-LABEL: test_mask_adds_epi16_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddsw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xed,0xc1]		; CHECK-NEXT: vpaddsw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xed,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_adds_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {		define <8 x i16> @test_mask_adds_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_adds_epi16_rm_128:		; CHECK-LABEL: test_mask_adds_epi16_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddsw (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xed,0x07]		; CHECK-NEXT: vpaddsw (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xed,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_adds_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_adds_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_adds_epi16_rmk_128:		; CHECK-LABEL: test_mask_adds_epi16_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddsw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xed,0x0f]		; CHECK-NEXT: vpaddsw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xed,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_adds_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {		define <8 x i16> @test_mask_adds_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_adds_epi16_rmkz_128:		; CHECK-LABEL: test_mask_adds_epi16_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddsw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xed,0x07]		; CHECK-NEXT: vpaddsw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xed,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

declare <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.padds.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <16 x i16> @test_mask_adds_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {		define <16 x i16> @test_mask_adds_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {
; CHECK-LABEL: test_mask_adds_epi16_rr_256:		; CHECK-LABEL: test_mask_adds_epi16_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddsw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xed,0xc1]		; CHECK-NEXT: vpaddsw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xed,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_adds_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_adds_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epi16_rrk_256:		; CHECK-LABEL: test_mask_adds_epi16_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddsw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xed,0xd1]		; CHECK-NEXT: vpaddsw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xed,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_adds_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {		define <16 x i16> @test_mask_adds_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epi16_rrkz_256:		; CHECK-LABEL: test_mask_adds_epi16_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddsw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xed,0xc1]		; CHECK-NEXT: vpaddsw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xed,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_adds_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {		define <16 x i16> @test_mask_adds_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_adds_epi16_rm_256:		; CHECK-LABEL: test_mask_adds_epi16_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddsw (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xed,0x07]		; CHECK-NEXT: vpaddsw (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xed,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_adds_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_adds_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epi16_rmk_256:		; CHECK-LABEL: test_mask_adds_epi16_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddsw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xed,0x0f]		; CHECK-NEXT: vpaddsw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xed,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_adds_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {		define <16 x i16> @test_mask_adds_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epi16_rmkz_256:		; CHECK-LABEL: test_mask_adds_epi16_rmkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddsw (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xed,0x07]		; CHECK-NEXT: vpaddsw (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xed,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

declare <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.padds.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <8 x i16> @test_mask_subs_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {		define <8 x i16> @test_mask_subs_epi16_rr_128(<8 x i16> %a, <8 x i16> %b) {
; CHECK-LABEL: test_mask_subs_epi16_rr_128:		; CHECK-LABEL: test_mask_subs_epi16_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe9,0xc1]		; CHECK-NEXT: vpsubsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe9,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_subs_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_subs_epi16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_subs_epi16_rrk_128:		; CHECK-LABEL: test_mask_subs_epi16_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubsw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe9,0xd1]		; CHECK-NEXT: vpsubsw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe9,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_subs_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {		define <8 x i16> @test_mask_subs_epi16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {
; CHECK-LABEL: test_mask_subs_epi16_rrkz_128:		; CHECK-LABEL: test_mask_subs_epi16_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubsw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xe9,0xc1]		; CHECK-NEXT: vpsubsw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xe9,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_subs_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {		define <8 x i16> @test_mask_subs_epi16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_subs_epi16_rm_128:		; CHECK-LABEL: test_mask_subs_epi16_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubsw (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe9,0x07]		; CHECK-NEXT: vpsubsw (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe9,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_subs_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_subs_epi16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_subs_epi16_rmk_128:		; CHECK-LABEL: test_mask_subs_epi16_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubsw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe9,0x0f]		; CHECK-NEXT: vpsubsw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe9,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_subs_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {		define <8 x i16> @test_mask_subs_epi16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_subs_epi16_rmkz_128:		; CHECK-LABEL: test_mask_subs_epi16_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubsw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xe9,0x07]		; CHECK-NEXT: vpsubsw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xe9,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

declare <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.psubs.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <16 x i16> @test_mask_subs_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {		define <16 x i16> @test_mask_subs_epi16_rr_256(<16 x i16> %a, <16 x i16> %b) {
; CHECK-LABEL: test_mask_subs_epi16_rr_256:		; CHECK-LABEL: test_mask_subs_epi16_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubsw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe9,0xc1]		; CHECK-NEXT: vpsubsw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe9,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_subs_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_subs_epi16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epi16_rrk_256:		; CHECK-LABEL: test_mask_subs_epi16_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubsw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe9,0xd1]		; CHECK-NEXT: vpsubsw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe9,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_subs_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {		define <16 x i16> @test_mask_subs_epi16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epi16_rrkz_256:		; CHECK-LABEL: test_mask_subs_epi16_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubsw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xe9,0xc1]		; CHECK-NEXT: vpsubsw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xe9,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_subs_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {		define <16 x i16> @test_mask_subs_epi16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_subs_epi16_rm_256:		; CHECK-LABEL: test_mask_subs_epi16_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubsw (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe9,0x07]		; CHECK-NEXT: vpsubsw (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe9,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_subs_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_subs_epi16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epi16_rmk_256:		; CHECK-LABEL: test_mask_subs_epi16_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubsw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe9,0x0f]		; CHECK-NEXT: vpsubsw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe9,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_subs_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {		define <16 x i16> @test_mask_subs_epi16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epi16_rmkz_256:		; CHECK-LABEL: test_mask_subs_epi16_rmkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubsw (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xe9,0x07]		; CHECK-NEXT: vpsubsw (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xe9,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

declare <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.psubs.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <8 x i16> @test_mask_adds_epu16_rr_128(<8 x i16> %a, <8 x i16> %b) {		define <8 x i16> @test_mask_adds_epu16_rr_128(<8 x i16> %a, <8 x i16> %b) {
; CHECK-LABEL: test_mask_adds_epu16_rr_128:		; CHECK-LABEL: test_mask_adds_epu16_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddusw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xdd,0xc1]		; CHECK-NEXT: vpaddusw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xdd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_adds_epu16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_adds_epu16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_adds_epu16_rrk_128:		; CHECK-LABEL: test_mask_adds_epu16_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddusw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdd,0xd1]		; CHECK-NEXT: vpaddusw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdd,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_adds_epu16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {		define <8 x i16> @test_mask_adds_epu16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {
; CHECK-LABEL: test_mask_adds_epu16_rrkz_128:		; CHECK-LABEL: test_mask_adds_epu16_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddusw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xdd,0xc1]		; CHECK-NEXT: vpaddusw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xdd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_adds_epu16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {		define <8 x i16> @test_mask_adds_epu16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_adds_epu16_rm_128:		; CHECK-LABEL: test_mask_adds_epu16_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddusw (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xdd,0x07]		; CHECK-NEXT: vpaddusw (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xdd,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_adds_epu16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_adds_epu16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_adds_epu16_rmk_128:		; CHECK-LABEL: test_mask_adds_epu16_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddusw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdd,0x0f]		; CHECK-NEXT: vpaddusw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdd,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_adds_epu16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {		define <8 x i16> @test_mask_adds_epu16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_adds_epu16_rmkz_128:		; CHECK-LABEL: test_mask_adds_epu16_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddusw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xdd,0x07]		; CHECK-NEXT: vpaddusw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xdd,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

declare <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.paddus.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <16 x i16> @test_mask_adds_epu16_rr_256(<16 x i16> %a, <16 x i16> %b) {		define <16 x i16> @test_mask_adds_epu16_rr_256(<16 x i16> %a, <16 x i16> %b) {
; CHECK-LABEL: test_mask_adds_epu16_rr_256:		; CHECK-LABEL: test_mask_adds_epu16_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddusw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xdd,0xc1]		; CHECK-NEXT: vpaddusw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xdd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_adds_epu16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_adds_epu16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epu16_rrk_256:		; CHECK-LABEL: test_mask_adds_epu16_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddusw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdd,0xd1]		; CHECK-NEXT: vpaddusw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdd,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_adds_epu16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {		define <16 x i16> @test_mask_adds_epu16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epu16_rrkz_256:		; CHECK-LABEL: test_mask_adds_epu16_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddusw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xdd,0xc1]		; CHECK-NEXT: vpaddusw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xdd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_adds_epu16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {		define <16 x i16> @test_mask_adds_epu16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_adds_epu16_rm_256:		; CHECK-LABEL: test_mask_adds_epu16_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddusw (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xdd,0x07]		; CHECK-NEXT: vpaddusw (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xdd,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_adds_epu16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_adds_epu16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epu16_rmk_256:		; CHECK-LABEL: test_mask_adds_epu16_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddusw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdd,0x0f]		; CHECK-NEXT: vpaddusw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdd,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_adds_epu16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {		define <16 x i16> @test_mask_adds_epu16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epu16_rmkz_256:		; CHECK-LABEL: test_mask_adds_epu16_rmkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddusw (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xdd,0x07]		; CHECK-NEXT: vpaddusw (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xdd,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

declare <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.paddus.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <8 x i16> @test_mask_subs_epu16_rr_128(<8 x i16> %a, <8 x i16> %b) {		define <8 x i16> @test_mask_subs_epu16_rr_128(<8 x i16> %a, <8 x i16> %b) {
; CHECK-LABEL: test_mask_subs_epu16_rr_128:		; CHECK-LABEL: test_mask_subs_epu16_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubusw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd9,0xc1]		; CHECK-NEXT: vpsubusw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd9,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_subs_epu16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_subs_epu16_rrk_128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_subs_epu16_rrk_128:		; CHECK-LABEL: test_mask_subs_epu16_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubusw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd9,0xd1]		; CHECK-NEXT: vpsubusw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd9,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_subs_epu16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {		define <8 x i16> @test_mask_subs_epu16_rrkz_128(<8 x i16> %a, <8 x i16> %b, i8 %mask) {
; CHECK-LABEL: test_mask_subs_epu16_rrkz_128:		; CHECK-LABEL: test_mask_subs_epu16_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubusw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd9,0xc1]		; CHECK-NEXT: vpsubusw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd9,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_subs_epu16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {		define <8 x i16> @test_mask_subs_epu16_rm_128(<8 x i16> %a, <8 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_subs_epu16_rm_128:		; CHECK-LABEL: test_mask_subs_epu16_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubusw (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd9,0x07]		; CHECK-NEXT: vpsubusw (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd9,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 -1)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_subs_epu16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {		define <8 x i16> @test_mask_subs_epu16_rmk_128(<8 x i16> %a, <8 x i16>* %ptr_b, <8 x i16> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_subs_epu16_rmk_128:		; CHECK-LABEL: test_mask_subs_epu16_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubusw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd9,0x0f]		; CHECK-NEXT: vpsubusw (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd9,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> %passThru, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

define <8 x i16> @test_mask_subs_epu16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {		define <8 x i16> @test_mask_subs_epu16_rmkz_128(<8 x i16> %a, <8 x i16>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_subs_epu16_rmkz_128:		; CHECK-LABEL: test_mask_subs_epu16_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubusw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd9,0x07]		; CHECK-NEXT: vpsubusw (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd9,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i16>, <8 x i16>* %ptr_b		%b = load <8 x i16>, <8 x i16>* %ptr_b
%res = call <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)		%res = call <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16> %a, <8 x i16> %b, <8 x i16> zeroinitializer, i8 %mask)
ret <8 x i16> %res		ret <8 x i16> %res
}		}

declare <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.psubus.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <16 x i16> @test_mask_subs_epu16_rr_256(<16 x i16> %a, <16 x i16> %b) {		define <16 x i16> @test_mask_subs_epu16_rr_256(<16 x i16> %a, <16 x i16> %b) {
; CHECK-LABEL: test_mask_subs_epu16_rr_256:		; CHECK-LABEL: test_mask_subs_epu16_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubusw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xd9,0xc1]		; CHECK-NEXT: vpsubusw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd9,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_subs_epu16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_subs_epu16_rrk_256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epu16_rrk_256:		; CHECK-LABEL: test_mask_subs_epu16_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubusw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd9,0xd1]		; CHECK-NEXT: vpsubusw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd9,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_subs_epu16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {		define <16 x i16> @test_mask_subs_epu16_rrkz_256(<16 x i16> %a, <16 x i16> %b, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epu16_rrkz_256:		; CHECK-LABEL: test_mask_subs_epu16_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubusw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xd9,0xc1]		; CHECK-NEXT: vpsubusw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xd9,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_subs_epu16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {		define <16 x i16> @test_mask_subs_epu16_rm_256(<16 x i16> %a, <16 x i16>* %ptr_b) {
; CHECK-LABEL: test_mask_subs_epu16_rm_256:		; CHECK-LABEL: test_mask_subs_epu16_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubusw (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xd9,0x07]		; CHECK-NEXT: vpsubusw (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd9,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 -1)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_subs_epu16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {		define <16 x i16> @test_mask_subs_epu16_rmk_256(<16 x i16> %a, <16 x i16>* %ptr_b, <16 x i16> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epu16_rmk_256:		; CHECK-LABEL: test_mask_subs_epu16_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubusw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd9,0x0f]		; CHECK-NEXT: vpsubusw (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd9,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> %passThru, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

define <16 x i16> @test_mask_subs_epu16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {		define <16 x i16> @test_mask_subs_epu16_rmkz_256(<16 x i16> %a, <16 x i16>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epu16_rmkz_256:		; CHECK-LABEL: test_mask_subs_epu16_rmkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubusw (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xd9,0x07]		; CHECK-NEXT: vpsubusw (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xd9,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i16>, <16 x i16>* %ptr_b		%b = load <16 x i16>, <16 x i16>* %ptr_b
%res = call <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)		%res = call <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16> %a, <16 x i16> %b, <16 x i16> zeroinitializer, i16 %mask)
ret <16 x i16> %res		ret <16 x i16> %res
}		}

declare <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.psubus.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i8> @test_mask_adds_epi8_rr_128(<16 x i8> %a, <16 x i8> %b) {		define <16 x i8> @test_mask_adds_epi8_rr_128(<16 x i8> %a, <16 x i8> %b) {
; CHECK-LABEL: test_mask_adds_epi8_rr_128:		; CHECK-LABEL: test_mask_adds_epi8_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddsb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xec,0xc1]		; CHECK-NEXT: vpaddsb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xec,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_adds_epi8_rrk_128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask) {		define <16 x i8> @test_mask_adds_epi8_rrk_128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epi8_rrk_128:		; CHECK-LABEL: test_mask_adds_epi8_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddsb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xec,0xd1]		; CHECK-NEXT: vpaddsb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xec,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_adds_epi8_rrkz_128(<16 x i8> %a, <16 x i8> %b, i16 %mask) {		define <16 x i8> @test_mask_adds_epi8_rrkz_128(<16 x i8> %a, <16 x i8> %b, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epi8_rrkz_128:		; CHECK-LABEL: test_mask_adds_epi8_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddsb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xec,0xc1]		; CHECK-NEXT: vpaddsb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xec,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_adds_epi8_rm_128(<16 x i8> %a, <16 x i8>* %ptr_b) {		define <16 x i8> @test_mask_adds_epi8_rm_128(<16 x i8> %a, <16 x i8>* %ptr_b) {
; CHECK-LABEL: test_mask_adds_epi8_rm_128:		; CHECK-LABEL: test_mask_adds_epi8_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddsb (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xec,0x07]		; CHECK-NEXT: vpaddsb (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xec,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i8>, <16 x i8>* %ptr_b		%b = load <16 x i8>, <16 x i8>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_adds_epi8_rmk_128(<16 x i8> %a, <16 x i8>* %ptr_b, <16 x i8> %passThru, i16 %mask) {		define <16 x i8> @test_mask_adds_epi8_rmk_128(<16 x i8> %a, <16 x i8>* %ptr_b, <16 x i8> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epi8_rmk_128:		; CHECK-LABEL: test_mask_adds_epi8_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddsb (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xec,0x0f]		; CHECK-NEXT: vpaddsb (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xec,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i8>, <16 x i8>* %ptr_b		%b = load <16 x i8>, <16 x i8>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_adds_epi8_rmkz_128(<16 x i8> %a, <16 x i8>* %ptr_b, i16 %mask) {		define <16 x i8> @test_mask_adds_epi8_rmkz_128(<16 x i8> %a, <16 x i8>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epi8_rmkz_128:		; CHECK-LABEL: test_mask_adds_epi8_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddsb (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xec,0x07]		; CHECK-NEXT: vpaddsb (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xec,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i8>, <16 x i8>* %ptr_b		%b = load <16 x i8>, <16 x i8>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

declare <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.padds.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

define <32 x i8> @test_mask_adds_epi8_rr_256(<32 x i8> %a, <32 x i8> %b) {		define <32 x i8> @test_mask_adds_epi8_rr_256(<32 x i8> %a, <32 x i8> %b) {
; CHECK-LABEL: test_mask_adds_epi8_rr_256:		; CHECK-LABEL: test_mask_adds_epi8_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddsb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xec,0xc1]		; CHECK-NEXT: vpaddsb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xec,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_adds_epi8_rrk_256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask) {		define <32 x i8> @test_mask_adds_epi8_rrk_256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask) {
; CHECK-LABEL: test_mask_adds_epi8_rrk_256:		; CHECK-LABEL: test_mask_adds_epi8_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpaddsb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xec,0xd1]		; CHECK-NEXT: vpaddsb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xec,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_adds_epi8_rrkz_256(<32 x i8> %a, <32 x i8> %b, i32 %mask) {		define <32 x i8> @test_mask_adds_epi8_rrkz_256(<32 x i8> %a, <32 x i8> %b, i32 %mask) {
; CHECK-LABEL: test_mask_adds_epi8_rrkz_256:		; CHECK-LABEL: test_mask_adds_epi8_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpaddsb %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xec,0xc1]		; CHECK-NEXT: vpaddsb %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xec,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_adds_epi8_rm_256(<32 x i8> %a, <32 x i8>* %ptr_b) {		define <32 x i8> @test_mask_adds_epi8_rm_256(<32 x i8> %a, <32 x i8>* %ptr_b) {
; CHECK-LABEL: test_mask_adds_epi8_rm_256:		; CHECK-LABEL: test_mask_adds_epi8_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddsb (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xec,0x07]		; CHECK-NEXT: vpaddsb (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xec,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <32 x i8>, <32 x i8>* %ptr_b		%b = load <32 x i8>, <32 x i8>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_adds_epi8_rmk_256(<32 x i8> %a, <32 x i8>* %ptr_b, <32 x i8> %passThru, i32 %mask) {		define <32 x i8> @test_mask_adds_epi8_rmk_256(<32 x i8> %a, <32 x i8>* %ptr_b, <32 x i8> %passThru, i32 %mask) {
; CHECK-LABEL: test_mask_adds_epi8_rmk_256:		; CHECK-LABEL: test_mask_adds_epi8_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]		; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]
; CHECK-NEXT: vpaddsb (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xec,0x0f]		; CHECK-NEXT: vpaddsb (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xec,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <32 x i8>, <32 x i8>* %ptr_b		%b = load <32 x i8>, <32 x i8>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_adds_epi8_rmkz_256(<32 x i8> %a, <32 x i8>* %ptr_b, i32 %mask) {		define <32 x i8> @test_mask_adds_epi8_rmkz_256(<32 x i8> %a, <32 x i8>* %ptr_b, i32 %mask) {
; CHECK-LABEL: test_mask_adds_epi8_rmkz_256:		; CHECK-LABEL: test_mask_adds_epi8_rmkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]		; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]
; CHECK-NEXT: vpaddsb (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xec,0x07]		; CHECK-NEXT: vpaddsb (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xec,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <32 x i8>, <32 x i8>* %ptr_b		%b = load <32 x i8>, <32 x i8>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

declare <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.padds.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

define <16 x i8> @test_mask_subs_epi8_rr_128(<16 x i8> %a, <16 x i8> %b) {		define <16 x i8> @test_mask_subs_epi8_rr_128(<16 x i8> %a, <16 x i8> %b) {
; CHECK-LABEL: test_mask_subs_epi8_rr_128:		; CHECK-LABEL: test_mask_subs_epi8_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubsb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe8,0xc1]		; CHECK-NEXT: vpsubsb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe8,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_subs_epi8_rrk_128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask) {		define <16 x i8> @test_mask_subs_epi8_rrk_128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epi8_rrk_128:		; CHECK-LABEL: test_mask_subs_epi8_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubsb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe8,0xd1]		; CHECK-NEXT: vpsubsb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe8,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_subs_epi8_rrkz_128(<16 x i8> %a, <16 x i8> %b, i16 %mask) {		define <16 x i8> @test_mask_subs_epi8_rrkz_128(<16 x i8> %a, <16 x i8> %b, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epi8_rrkz_128:		; CHECK-LABEL: test_mask_subs_epi8_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubsb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xe8,0xc1]		; CHECK-NEXT: vpsubsb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xe8,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_subs_epi8_rm_128(<16 x i8> %a, <16 x i8>* %ptr_b) {		define <16 x i8> @test_mask_subs_epi8_rm_128(<16 x i8> %a, <16 x i8>* %ptr_b) {
; CHECK-LABEL: test_mask_subs_epi8_rm_128:		; CHECK-LABEL: test_mask_subs_epi8_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubsb (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe8,0x07]		; CHECK-NEXT: vpsubsb (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i8>, <16 x i8>* %ptr_b		%b = load <16 x i8>, <16 x i8>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_subs_epi8_rmk_128(<16 x i8> %a, <16 x i8>* %ptr_b, <16 x i8> %passThru, i16 %mask) {		define <16 x i8> @test_mask_subs_epi8_rmk_128(<16 x i8> %a, <16 x i8>* %ptr_b, <16 x i8> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epi8_rmk_128:		; CHECK-LABEL: test_mask_subs_epi8_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubsb (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe8,0x0f]		; CHECK-NEXT: vpsubsb (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe8,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i8>, <16 x i8>* %ptr_b		%b = load <16 x i8>, <16 x i8>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_subs_epi8_rmkz_128(<16 x i8> %a, <16 x i8>* %ptr_b, i16 %mask) {		define <16 x i8> @test_mask_subs_epi8_rmkz_128(<16 x i8> %a, <16 x i8>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epi8_rmkz_128:		; CHECK-LABEL: test_mask_subs_epi8_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubsb (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xe8,0x07]		; CHECK-NEXT: vpsubsb (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xe8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i8>, <16 x i8>* %ptr_b		%b = load <16 x i8>, <16 x i8>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

declare <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.psubs.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

define <32 x i8> @test_mask_subs_epi8_rr_256(<32 x i8> %a, <32 x i8> %b) {		define <32 x i8> @test_mask_subs_epi8_rr_256(<32 x i8> %a, <32 x i8> %b) {
; CHECK-LABEL: test_mask_subs_epi8_rr_256:		; CHECK-LABEL: test_mask_subs_epi8_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubsb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe8,0xc1]		; CHECK-NEXT: vpsubsb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe8,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_subs_epi8_rrk_256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask) {		define <32 x i8> @test_mask_subs_epi8_rrk_256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask) {
; CHECK-LABEL: test_mask_subs_epi8_rrk_256:		; CHECK-LABEL: test_mask_subs_epi8_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpsubsb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe8,0xd1]		; CHECK-NEXT: vpsubsb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe8,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_subs_epi8_rrkz_256(<32 x i8> %a, <32 x i8> %b, i32 %mask) {		define <32 x i8> @test_mask_subs_epi8_rrkz_256(<32 x i8> %a, <32 x i8> %b, i32 %mask) {
; CHECK-LABEL: test_mask_subs_epi8_rrkz_256:		; CHECK-LABEL: test_mask_subs_epi8_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpsubsb %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xe8,0xc1]		; CHECK-NEXT: vpsubsb %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xe8,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_subs_epi8_rm_256(<32 x i8> %a, <32 x i8>* %ptr_b) {		define <32 x i8> @test_mask_subs_epi8_rm_256(<32 x i8> %a, <32 x i8>* %ptr_b) {
; CHECK-LABEL: test_mask_subs_epi8_rm_256:		; CHECK-LABEL: test_mask_subs_epi8_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubsb (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe8,0x07]		; CHECK-NEXT: vpsubsb (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <32 x i8>, <32 x i8>* %ptr_b		%b = load <32 x i8>, <32 x i8>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_subs_epi8_rmk_256(<32 x i8> %a, <32 x i8>* %ptr_b, <32 x i8> %passThru, i32 %mask) {		define <32 x i8> @test_mask_subs_epi8_rmk_256(<32 x i8> %a, <32 x i8>* %ptr_b, <32 x i8> %passThru, i32 %mask) {
; CHECK-LABEL: test_mask_subs_epi8_rmk_256:		; CHECK-LABEL: test_mask_subs_epi8_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]		; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]
; CHECK-NEXT: vpsubsb (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe8,0x0f]		; CHECK-NEXT: vpsubsb (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe8,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <32 x i8>, <32 x i8>* %ptr_b		%b = load <32 x i8>, <32 x i8>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_subs_epi8_rmkz_256(<32 x i8> %a, <32 x i8>* %ptr_b, i32 %mask) {		define <32 x i8> @test_mask_subs_epi8_rmkz_256(<32 x i8> %a, <32 x i8>* %ptr_b, i32 %mask) {
; CHECK-LABEL: test_mask_subs_epi8_rmkz_256:		; CHECK-LABEL: test_mask_subs_epi8_rmkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]		; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]
; CHECK-NEXT: vpsubsb (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xe8,0x07]		; CHECK-NEXT: vpsubsb (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xe8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <32 x i8>, <32 x i8>* %ptr_b		%b = load <32 x i8>, <32 x i8>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

declare <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.psubs.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

define <16 x i8> @test_mask_adds_epu8_rr_128(<16 x i8> %a, <16 x i8> %b) {		define <16 x i8> @test_mask_adds_epu8_rr_128(<16 x i8> %a, <16 x i8> %b) {
; CHECK-LABEL: test_mask_adds_epu8_rr_128:		; CHECK-LABEL: test_mask_adds_epu8_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddusb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xdc,0xc1]		; CHECK-NEXT: vpaddusb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xdc,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_adds_epu8_rrk_128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask) {		define <16 x i8> @test_mask_adds_epu8_rrk_128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epu8_rrk_128:		; CHECK-LABEL: test_mask_adds_epu8_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddusb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdc,0xd1]		; CHECK-NEXT: vpaddusb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdc,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_adds_epu8_rrkz_128(<16 x i8> %a, <16 x i8> %b, i16 %mask) {		define <16 x i8> @test_mask_adds_epu8_rrkz_128(<16 x i8> %a, <16 x i8> %b, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epu8_rrkz_128:		; CHECK-LABEL: test_mask_adds_epu8_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddusb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xdc,0xc1]		; CHECK-NEXT: vpaddusb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xdc,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_adds_epu8_rm_128(<16 x i8> %a, <16 x i8>* %ptr_b) {		define <16 x i8> @test_mask_adds_epu8_rm_128(<16 x i8> %a, <16 x i8>* %ptr_b) {
; CHECK-LABEL: test_mask_adds_epu8_rm_128:		; CHECK-LABEL: test_mask_adds_epu8_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddusb (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xdc,0x07]		; CHECK-NEXT: vpaddusb (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xdc,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i8>, <16 x i8>* %ptr_b		%b = load <16 x i8>, <16 x i8>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_adds_epu8_rmk_128(<16 x i8> %a, <16 x i8>* %ptr_b, <16 x i8> %passThru, i16 %mask) {		define <16 x i8> @test_mask_adds_epu8_rmk_128(<16 x i8> %a, <16 x i8>* %ptr_b, <16 x i8> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epu8_rmk_128:		; CHECK-LABEL: test_mask_adds_epu8_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddusb (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdc,0x0f]		; CHECK-NEXT: vpaddusb (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdc,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i8>, <16 x i8>* %ptr_b		%b = load <16 x i8>, <16 x i8>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_adds_epu8_rmkz_128(<16 x i8> %a, <16 x i8>* %ptr_b, i16 %mask) {		define <16 x i8> @test_mask_adds_epu8_rmkz_128(<16 x i8> %a, <16 x i8>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_adds_epu8_rmkz_128:		; CHECK-LABEL: test_mask_adds_epu8_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddusb (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xdc,0x07]		; CHECK-NEXT: vpaddusb (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xdc,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i8>, <16 x i8>* %ptr_b		%b = load <16 x i8>, <16 x i8>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

declare <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.paddus.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

define <32 x i8> @test_mask_adds_epu8_rr_256(<32 x i8> %a, <32 x i8> %b) {		define <32 x i8> @test_mask_adds_epu8_rr_256(<32 x i8> %a, <32 x i8> %b) {
; CHECK-LABEL: test_mask_adds_epu8_rr_256:		; CHECK-LABEL: test_mask_adds_epu8_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddusb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xdc,0xc1]		; CHECK-NEXT: vpaddusb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xdc,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_adds_epu8_rrk_256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask) {		define <32 x i8> @test_mask_adds_epu8_rrk_256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask) {
; CHECK-LABEL: test_mask_adds_epu8_rrk_256:		; CHECK-LABEL: test_mask_adds_epu8_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpaddusb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdc,0xd1]		; CHECK-NEXT: vpaddusb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdc,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_adds_epu8_rrkz_256(<32 x i8> %a, <32 x i8> %b, i32 %mask) {		define <32 x i8> @test_mask_adds_epu8_rrkz_256(<32 x i8> %a, <32 x i8> %b, i32 %mask) {
; CHECK-LABEL: test_mask_adds_epu8_rrkz_256:		; CHECK-LABEL: test_mask_adds_epu8_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpaddusb %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xdc,0xc1]		; CHECK-NEXT: vpaddusb %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xdc,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_adds_epu8_rm_256(<32 x i8> %a, <32 x i8>* %ptr_b) {		define <32 x i8> @test_mask_adds_epu8_rm_256(<32 x i8> %a, <32 x i8>* %ptr_b) {
; CHECK-LABEL: test_mask_adds_epu8_rm_256:		; CHECK-LABEL: test_mask_adds_epu8_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddusb (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xdc,0x07]		; CHECK-NEXT: vpaddusb (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xdc,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <32 x i8>, <32 x i8>* %ptr_b		%b = load <32 x i8>, <32 x i8>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_adds_epu8_rmk_256(<32 x i8> %a, <32 x i8>* %ptr_b, <32 x i8> %passThru, i32 %mask) {		define <32 x i8> @test_mask_adds_epu8_rmk_256(<32 x i8> %a, <32 x i8>* %ptr_b, <32 x i8> %passThru, i32 %mask) {
; CHECK-LABEL: test_mask_adds_epu8_rmk_256:		; CHECK-LABEL: test_mask_adds_epu8_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]		; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]
; CHECK-NEXT: vpaddusb (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdc,0x0f]		; CHECK-NEXT: vpaddusb (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdc,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <32 x i8>, <32 x i8>* %ptr_b		%b = load <32 x i8>, <32 x i8>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_adds_epu8_rmkz_256(<32 x i8> %a, <32 x i8>* %ptr_b, i32 %mask) {		define <32 x i8> @test_mask_adds_epu8_rmkz_256(<32 x i8> %a, <32 x i8>* %ptr_b, i32 %mask) {
; CHECK-LABEL: test_mask_adds_epu8_rmkz_256:		; CHECK-LABEL: test_mask_adds_epu8_rmkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]		; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]
; CHECK-NEXT: vpaddusb (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xdc,0x07]		; CHECK-NEXT: vpaddusb (%rdi), %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xdc,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <32 x i8>, <32 x i8>* %ptr_b		%b = load <32 x i8>, <32 x i8>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

declare <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.paddus.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

define <16 x i8> @test_mask_subs_epu8_rr_128(<16 x i8> %a, <16 x i8> %b) {		define <16 x i8> @test_mask_subs_epu8_rr_128(<16 x i8> %a, <16 x i8> %b) {
; CHECK-LABEL: test_mask_subs_epu8_rr_128:		; CHECK-LABEL: test_mask_subs_epu8_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubusb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd8,0xc1]		; CHECK-NEXT: vpsubusb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd8,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_subs_epu8_rrk_128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask) {		define <16 x i8> @test_mask_subs_epu8_rrk_128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epu8_rrk_128:		; CHECK-LABEL: test_mask_subs_epu8_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubusb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd8,0xd1]		; CHECK-NEXT: vpsubusb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd8,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_subs_epu8_rrkz_128(<16 x i8> %a, <16 x i8> %b, i16 %mask) {		define <16 x i8> @test_mask_subs_epu8_rrkz_128(<16 x i8> %a, <16 x i8> %b, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epu8_rrkz_128:		; CHECK-LABEL: test_mask_subs_epu8_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubusb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd8,0xc1]		; CHECK-NEXT: vpsubusb %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd8,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_subs_epu8_rm_128(<16 x i8> %a, <16 x i8>* %ptr_b) {		define <16 x i8> @test_mask_subs_epu8_rm_128(<16 x i8> %a, <16 x i8>* %ptr_b) {
; CHECK-LABEL: test_mask_subs_epu8_rm_128:		; CHECK-LABEL: test_mask_subs_epu8_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubusb (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd8,0x07]		; CHECK-NEXT: vpsubusb (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i8>, <16 x i8>* %ptr_b		%b = load <16 x i8>, <16 x i8>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 -1)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_subs_epu8_rmk_128(<16 x i8> %a, <16 x i8>* %ptr_b, <16 x i8> %passThru, i16 %mask) {		define <16 x i8> @test_mask_subs_epu8_rmk_128(<16 x i8> %a, <16 x i8>* %ptr_b, <16 x i8> %passThru, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epu8_rmk_128:		; CHECK-LABEL: test_mask_subs_epu8_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubusb (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd8,0x0f]		; CHECK-NEXT: vpsubusb (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd8,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i8>, <16 x i8>* %ptr_b		%b = load <16 x i8>, <16 x i8>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> %passThru, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <16 x i8> @test_mask_subs_epu8_rmkz_128(<16 x i8> %a, <16 x i8>* %ptr_b, i16 %mask) {		define <16 x i8> @test_mask_subs_epu8_rmkz_128(<16 x i8> %a, <16 x i8>* %ptr_b, i16 %mask) {
; CHECK-LABEL: test_mask_subs_epu8_rmkz_128:		; CHECK-LABEL: test_mask_subs_epu8_rmkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubusb (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd8,0x07]		; CHECK-NEXT: vpsubusb (%rdi), %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <16 x i8>, <16 x i8>* %ptr_b		%b = load <16 x i8>, <16 x i8>* %ptr_b
%res = call <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)		%res = call <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8> %a, <16 x i8> %b, <16 x i8> zeroinitializer, i16 %mask)
ret <16 x i8> %res		ret <16 x i8> %res
}		}

declare <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.psubus.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

define <32 x i8> @test_mask_subs_epu8_rr_256(<32 x i8> %a, <32 x i8> %b) {		define <32 x i8> @test_mask_subs_epu8_rr_256(<32 x i8> %a, <32 x i8> %b) {
; CHECK-LABEL: test_mask_subs_epu8_rr_256:		; CHECK-LABEL: test_mask_subs_epu8_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubusb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xd8,0xc1]		; CHECK-NEXT: vpsubusb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd8,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.psubus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.psubus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)
ret <32 x i8> %res		ret <32 x i8> %res
}		}
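
Note on reading the encoding columns: the byte lists in the CHECK lines can be compared mechanically. The sketch below is not part of the patch. It assumes 64-bit code, where a leading 0x62 byte marks an EVEX prefix (4 bytes), 0xC4 a 3-byte VEX prefix and 0xC5 a 2-byte VEX prefix, and it assumes no legacy prefix bytes come first. The helper names `parse_encoding` and `describe` are made up for illustration.

```python
import re

# First byte of the instruction -> prefix kind (assumes 64-bit mode and
# no legacy prefixes in front of the VEX/EVEX prefix).
PREFIX_NAME = {0x62: "EVEX", 0xC4: "VEX3", 0xC5: "VEX2"}


def parse_encoding(check_line):
    """Extract the byte list from the first 'encoding: [...]' in a CHECK line."""
    m = re.search(r"encoding: \[([^\]]*)\]", check_line)
    if m is None:
        raise ValueError("no encoding found")
    return [int(b, 16) for b in m.group(1).split(",")]


def describe(encoding):
    """Return (prefix kind, total instruction length in bytes)."""
    return PREFIX_NAME.get(encoding[0], "other"), len(encoding)


old = "; CHECK-NEXT: vpsubusb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xd8,0xc1]"
new = ("; CHECK-NEXT: vpsubusb %ymm1, %ymm0, %ymm0 "
       "## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd8,0xc1]")

old_kind, old_len = describe(parse_encoding(old))
new_kind, new_len = describe(parse_encoding(new))
print(old_kind, old_len, "->", new_kind, new_len, "saved", old_len - new_len)
```

For the unmasked `vpsubusb` above this reports a 6-byte EVEX form and a 4-byte VEX form. The masked variants keep their 0x62 prefix in both columns, since a `{%k1}` operand cannot be expressed in VEX.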

define <32 x i8> @test_mask_subs_epu8_rrk_256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask) {		define <32 x i8> @test_mask_subs_epu8_rrk_256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask) {
; CHECK-LABEL: test_mask_subs_epu8_rrk_256:		; CHECK-LABEL: test_mask_subs_epu8_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpsubusb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd8,0xd1]		; CHECK-NEXT: vpsubusb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd8,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.psubus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.psubus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_subs_epu8_rrkz_256(<32 x i8> %a, <32 x i8> %b, i32 %mask) {		define <32 x i8> @test_mask_subs_epu8_rrkz_256(<32 x i8> %a, <32 x i8> %b, i32 %mask) {
; CHECK-LABEL: test_mask_subs_epu8_rrkz_256:		; CHECK-LABEL: test_mask_subs_epu8_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpsubusb %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xd8,0xc1]		; CHECK-NEXT: vpsubusb %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xd8,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.psubus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.psubus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_subs_epu8_rm_256(<32 x i8> %a, <32 x i8>* %ptr_b) {		define <32 x i8> @test_mask_subs_epu8_rm_256(<32 x i8> %a, <32 x i8>* %ptr_b) {
; CHECK-LABEL: test_mask_subs_epu8_rm_256:		; CHECK-LABEL: test_mask_subs_epu8_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubusb (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xd8,0x07]		; CHECK-NEXT: vpsubusb (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd8,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <32 x i8>, <32 x i8>* %ptr_b		%b = load <32 x i8>, <32 x i8>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.psubus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.psubus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> zeroinitializer, i32 -1)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_subs_epu8_rmk_256(<32 x i8> %a, <32 x i8>* %ptr_b, <32 x i8> %passThru, i32 %mask) {		define <32 x i8> @test_mask_subs_epu8_rmk_256(<32 x i8> %a, <32 x i8>* %ptr_b, <32 x i8> %passThru, i32 %mask) {
; CHECK-LABEL: test_mask_subs_epu8_rmk_256:		; CHECK-LABEL: test_mask_subs_epu8_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]		; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]
; CHECK-NEXT: vpsubusb (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd8,0x0f]		; CHECK-NEXT: vpsubusb (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd8,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <32 x i8>, <32 x i8>* %ptr_b		%b = load <32 x i8>, <32 x i8>* %ptr_b
%res = call <32 x i8> @llvm.x86.avx512.mask.psubus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)		%res = call <32 x i8> @llvm.x86.avx512.mask.psubus.b.256(<32 x i8> %a, <32 x i8> %b, <32 x i8> %passThru, i32 %mask)
ret <32 x i8> %res		ret <32 x i8> %res
}		}

define <32 x i8> @test_mask_subs_epu8_rmkz_256(<32 x i8> %a, <32 x i8>* %ptr_b, i32 %mask) {		define <32 x i8> @test_mask_subs_epu8_rmkz_256(<32 x i8> %a, <32 x i8>* %ptr_b, i32 %mask) {
; CHECK-LABEL: test_mask_subs_epu8_rmkz_256:		; CHECK-LABEL: test_mask_subs_epu8_rmkz_256:
; ... (9 unchanged lines collapsed in the diff view) ...
declare <32 x i8> @llvm.x86.avx512.mask.psubus.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.psubus.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

declare <8 x i16> @llvm.x86.avx512.mask.vpermt2var.hi.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.vpermt2var.hi.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_vpermt2var_hi_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_vpermt2var_hi_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermt2var_hi_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermt2var_hi_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xd9]		; CHECK-NEXT: vmovdqa %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xd9]
; CHECK-NEXT: vpermt2w %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x7d,0xda]		; CHECK-NEXT: vpermt2w %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x7d,0xda]
; CHECK-NEXT: vpermt2w %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0x7d,0xca]		; CHECK-NEXT: vpermt2w %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0x7d,0xca]
; CHECK-NEXT: vpaddw %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x65,0x08,0xfd,0xc1]		; CHECK-NEXT: vpaddw %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.vpermt2var.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.vpermt2var.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.vpermt2var.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.vpermt2var.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_maskz_vpermt2var_hi_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_maskz_vpermt2var_hi_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_maskz_vpermt2var_hi_128:		; CHECK-LABEL: test_int_x86_avx512_maskz_vpermt2var_hi_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xd9]		; CHECK-NEXT: vmovdqa %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xd9]
; CHECK-NEXT: vpermt2w %xmm2, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x7d,0xda]		; CHECK-NEXT: vpermt2w %xmm2, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x7d,0xda]
; CHECK-NEXT: vpermt2w %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0x7d,0xca]		; CHECK-NEXT: vpermt2w %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0x7d,0xca]
; CHECK-NEXT: vpaddw %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x65,0x08,0xfd,0xc1]		; CHECK-NEXT: vpaddw %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.vpermt2var.hi.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.vpermt2var.hi.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_vpermt2var_hi_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_vpermt2var_hi_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermt2var_hi_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermt2var_hi_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xd9]		; CHECK-NEXT: vmovdqa %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xd9]
; CHECK-NEXT: vpermt2w %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x7d,0xda]		; CHECK-NEXT: vpermt2w %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x7d,0xda]
; CHECK-NEXT: vpermt2w %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0x7d,0xca]		; CHECK-NEXT: vpermt2w %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0x7d,0xca]
; CHECK-NEXT: vpaddw %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x65,0x28,0xfd,0xc1]		; CHECK-NEXT: vpaddw %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.vpermt2var.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.vpermt2var.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.vpermt2var.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.vpermt2var.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_maskz_vpermt2var_hi_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_maskz_vpermt2var_hi_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_maskz_vpermt2var_hi_256:		; CHECK-LABEL: test_int_x86_avx512_maskz_vpermt2var_hi_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xd9]		; CHECK-NEXT: vmovdqa %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xd9]
; CHECK-NEXT: vpermt2w %ymm2, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x7d,0xda]		; CHECK-NEXT: vpermt2w %ymm2, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x7d,0xda]
; CHECK-NEXT: vpermt2w %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0x7d,0xca]		; CHECK-NEXT: vpermt2w %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0x7d,0xca]
; CHECK-NEXT: vpaddw %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x65,0x28,0xfd,0xc1]		; CHECK-NEXT: vpaddw %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_vpermi2var_hi_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_vpermi2var_hi_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_hi_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_hi_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xd9]		; CHECK-NEXT: vmovdqa %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xd9]
; CHECK-NEXT: vpermi2w %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x75,0xda]		; CHECK-NEXT: vpermi2w %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x75,0xda]
; CHECK-NEXT: vpermi2w %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0x75,0xca]		; CHECK-NEXT: vpermi2w %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0x75,0xca]
; CHECK-NEXT: vpaddw %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x65,0x08,0xfd,0xc1]		; CHECK-NEXT: vpaddw %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_vpermi2var_hi_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_vpermi2var_hi_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_hi_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_hi_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xd9]		; CHECK-NEXT: vmovdqa %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xd9]
; CHECK-NEXT: vpermi2w %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x75,0xda]		; CHECK-NEXT: vpermi2w %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x75,0xda]
; CHECK-NEXT: vpermi2w %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0x75,0xca]		; CHECK-NEXT: vpermi2w %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0x75,0xca]
; CHECK-NEXT: vpaddw %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x65,0x28,0xfd,0xc1]		; CHECK-NEXT: vpaddw %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <16 x i8> @llvm.x86.avx512.mask.pavg.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.pavg.b.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_mask_pavg_b_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {		define <16 x i8>@test_int_x86_avx512_mask_pavg_b_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pavg_b_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pavg_b_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpavgb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe0,0xd1]		; CHECK-NEXT: vpavgb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe0,0xd1]
; CHECK-NEXT: vpavgb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe0,0xc1]		; CHECK-NEXT: vpavgb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe0,0xc1]
; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc0]		; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.pavg.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)		%res = call <16 x i8> @llvm.x86.avx512.mask.pavg.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pavg.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pavg.b.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)
%res2 = add <16 x i8> %res, %res1		%res2 = add <16 x i8> %res, %res1
ret <16 x i8> %res2		ret <16 x i8> %res2
}		}

declare <32 x i8> @llvm.x86.avx512.mask.pavg.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.pavg.b.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

define <32 x i8>@test_int_x86_avx512_mask_pavg_b_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {		define <32 x i8>@test_int_x86_avx512_mask_pavg_b_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pavg_b_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pavg_b_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpavgb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe0,0xd1]		; CHECK-NEXT: vpavgb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe0,0xd1]
; CHECK-NEXT: vpavgb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe0,0xc1]		; CHECK-NEXT: vpavgb %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe0,0xc1]
; CHECK-NEXT: vpaddb %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc0]		; CHECK-NEXT: vpaddb %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.pavg.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)		%res = call <32 x i8> @llvm.x86.avx512.mask.pavg.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)
%res1 = call <32 x i8> @llvm.x86.avx512.mask.pavg.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.pavg.b.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)
%res2 = add <32 x i8> %res, %res1		%res2 = add <32 x i8> %res, %res1
ret <32 x i8> %res2		ret <32 x i8> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.pavg.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pavg.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pavg_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_pavg_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pavg_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pavg_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpavgw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe3,0xd1]		; CHECK-NEXT: vpavgw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe3,0xd1]
; CHECK-NEXT: vpavgw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe3,0xc1]		; CHECK-NEXT: vpavgw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe3,0xc1]
; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pavg.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.pavg.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pavg.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pavg.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pavg.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pavg.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pavg_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_pavg_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pavg_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pavg_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpavgw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe3,0xd1]		; CHECK-NEXT: vpavgw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe3,0xd1]
; CHECK-NEXT: vpavgw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe3,0xc1]		; CHECK-NEXT: vpavgw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe3,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pavg.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.pavg.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pavg.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pavg.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <16 x i8> @llvm.x86.avx512.mask.pabs.b.128(<16 x i8>, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.pabs.b.128(<16 x i8>, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_mask_pabs_b_128(<16 x i8> %x0, <16 x i8> %x1, i16 %x2) {		define <16 x i8>@test_int_x86_avx512_mask_pabs_b_128(<16 x i8> %x0, <16 x i8> %x1, i16 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pabs_b_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pabs_b_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpabsb %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x1c,0xc8]		; CHECK-NEXT: vpabsb %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x1c,0xc8]
; CHECK-NEXT: vpabsb %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x1c,0xc0]		; CHECK-NEXT: vpabsb %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x1c,0xc0]
; CHECK-NEXT: vpaddb %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfc,0xc0]		; CHECK-NEXT: vpaddb %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfc,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.pabs.b.128(<16 x i8> %x0, <16 x i8> %x1, i16 %x2)		%res = call <16 x i8> @llvm.x86.avx512.mask.pabs.b.128(<16 x i8> %x0, <16 x i8> %x1, i16 %x2)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pabs.b.128(<16 x i8> %x0, <16 x i8> %x1, i16 -1)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pabs.b.128(<16 x i8> %x0, <16 x i8> %x1, i16 -1)
%res2 = add <16 x i8> %res, %res1		%res2 = add <16 x i8> %res, %res1
ret <16 x i8> %res2		ret <16 x i8> %res2
}		}
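
Note on the `vpabsb` case above: it compresses to a 5-byte form starting with 0xc4, while `vpaddb` gets a 4-byte form starting with 0xc5. A likely explanation is that the 2-byte VEX prefix can only address the 0F opcode map, whereas `vpabsb` lives in the 0F38 map. The sketch below is not code from the patch. It reads the map field from the byte after 0x62, taking the low two bits as the map id (the original AVX-512 layout). It also checks the inverted X and B extension bits, and it deliberately ignores VEX.W. The function names are hypothetical.

```python
def evex_opcode_map(encoding):
    """Return the opcode-map id (1 = 0F, 2 = 0F38, 3 = 0F3A) of an EVEX encoding."""
    if encoding[0] != 0x62:
        raise ValueError("not an EVEX encoding")
    return encoding[1] & 0b11


def needs_three_byte_vex(encoding):
    """True if the VEX form would need the 3-byte 0xC4 prefix.

    Only the opcode map and the X/B register-extension bits are checked;
    the W bit is left out of this sketch.
    """
    p0 = encoding[1]
    x_extended = (p0 & 0x40) == 0  # X is stored inverted
    b_extended = (p0 & 0x20) == 0  # B is stored inverted
    return evex_opcode_map(encoding) != 1 or x_extended or b_extended


vpabsb_evex = [0x62, 0xF2, 0x7D, 0x08, 0x1C, 0xC0]  # left column, vpabsb %xmm0, %xmm0
vpaddb_evex = [0x62, 0xF1, 0x75, 0x08, 0xFC, 0xC0]  # left column, vpaddb %xmm0, %xmm1, %xmm0

print(needs_three_byte_vex(vpabsb_evex))  # map 0F38
print(needs_three_byte_vex(vpaddb_evex))  # map 0F, low registers only
```

This agrees with the right-hand column: `vpabsb` becomes `[0xc4,0xe2,0x79,0x1c,0xc0]` and saves one byte, while `vpaddb` becomes `[0xc5,0xf1,0xfc,0xc0]` and saves two.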

declare <32 x i8> @llvm.x86.avx512.mask.pabs.b.256(<32 x i8>, <32 x i8>, i32)		declare <32 x i8> @llvm.x86.avx512.mask.pabs.b.256(<32 x i8>, <32 x i8>, i32)

define <32 x i8>@test_int_x86_avx512_mask_pabs_b_256(<32 x i8> %x0, <32 x i8> %x1, i32 %x2) {		define <32 x i8>@test_int_x86_avx512_mask_pabs_b_256(<32 x i8> %x0, <32 x i8> %x1, i32 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pabs_b_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pabs_b_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]		; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
; CHECK-NEXT: vpabsb %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x1c,0xc8]		; CHECK-NEXT: vpabsb %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x1c,0xc8]
; CHECK-NEXT: vpabsb %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x1c,0xc0]		; CHECK-NEXT: vpabsb %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x1c,0xc0]
; CHECK-NEXT: vpaddb %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfc,0xc0]		; CHECK-NEXT: vpaddb %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfc,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.pabs.b.256(<32 x i8> %x0, <32 x i8> %x1, i32 %x2)		%res = call <32 x i8> @llvm.x86.avx512.mask.pabs.b.256(<32 x i8> %x0, <32 x i8> %x1, i32 %x2)
%res1 = call <32 x i8> @llvm.x86.avx512.mask.pabs.b.256(<32 x i8> %x0, <32 x i8> %x1, i32 -1)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.pabs.b.256(<32 x i8> %x0, <32 x i8> %x1, i32 -1)
%res2 = add <32 x i8> %res, %res1		%res2 = add <32 x i8> %res, %res1
ret <32 x i8> %res2		ret <32 x i8> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.pabs.w.128(<8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pabs.w.128(<8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pabs_w_128(<8 x i16> %x0, <8 x i16> %x1, i8 %x2) {		define <8 x i16>@test_int_x86_avx512_mask_pabs_w_128(<8 x i16> %x0, <8 x i16> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pabs_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pabs_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpabsw %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x1d,0xc8]		; CHECK-NEXT: vpabsw %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x1d,0xc8]
; CHECK-NEXT: vpabsw %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x1d,0xc0]		; CHECK-NEXT: vpabsw %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x1d,0xc0]
; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pabs.w.128(<8 x i16> %x0, <8 x i16> %x1, i8 %x2)		%res = call <8 x i16> @llvm.x86.avx512.mask.pabs.w.128(<8 x i16> %x0, <8 x i16> %x1, i8 %x2)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pabs.w.128(<8 x i16> %x0, <8 x i16> %x1, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pabs.w.128(<8 x i16> %x0, <8 x i16> %x1, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pabs.w.256(<16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pabs.w.256(<16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pabs_w_256(<16 x i16> %x0, <16 x i16> %x1, i16 %x2) {		define <16 x i16>@test_int_x86_avx512_mask_pabs_w_256(<16 x i16> %x0, <16 x i16> %x1, i16 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pabs_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pabs_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpabsw %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x1d,0xc8]		; CHECK-NEXT: vpabsw %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x1d,0xc8]
; CHECK-NEXT: vpabsw %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x1d,0xc0]		; CHECK-NEXT: vpabsw %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x1d,0xc0]
; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pabs.w.256(<16 x i16> %x0, <16 x i16> %x1, i16 %x2)		%res = call <16 x i16> @llvm.x86.avx512.mask.pabs.w.256(<16 x i16> %x0, <16 x i16> %x1, i16 %x2)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pabs.w.256(<16 x i16> %x0, <16 x i16> %x1, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pabs.w.256(<16 x i16> %x0, <16 x i16> %x1, i16 -1)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}
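
Note on the byte patterns in the CHECK lines above: for unmasked register-register forms, the VEX bytes in the right-hand column can be derived from the EVEX prefix fields in the left-hand column. The Python sketch below illustrates only that relationship. The function name `evex_to_vex` is hypothetical, and this is not how the pass itself works (the summary describes a table of opcodes, not byte rewriting). The sketch assumes the caller already knows the opcode has a VEX equivalent, and it deliberately rejects memory operands (EVEX scales disp8 differently) and EVEX.W=1 forms.

```python
def evex_to_vex(enc):
    """Return a VEX re-encoding of an EVEX byte list, or None.

    Sketch only: register-register forms, W=0, and the caller is assumed
    to know that the opcode has a VEX twin.
    """
    if len(enc) < 6 or enc[0] != 0x62:
        return None
    p0, p1, p2, modrm = enc[1], enc[2], enc[3], enc[5]
    if modrm >> 6 != 0b11:
        return None  # memory operand: not handled in this sketch

    # P0: R X B R' 0 0 m m  (R, X, B, R' are stored inverted)
    r, x, b, r_hi = (p0 >> 7) & 1, (p0 >> 6) & 1, (p0 >> 5) & 1, (p0 >> 4) & 1
    mm = p0 & 0b11
    # P1: W vvvv 1 pp  (vvvv is stored inverted)
    w, vvvv, pp = (p1 >> 7) & 1, (p1 >> 3) & 0xF, p1 & 0b11
    # P2: z L'L b V' aaa  (V' is stored inverted)
    z, ll, bcast = (p2 >> 7) & 1, (p2 >> 5) & 0b11, (p2 >> 4) & 1
    v_hi, aaa = (p2 >> 3) & 1, p2 & 0b111

    # Anything only EVEX can express blocks the conversion: masking,
    # zeroing, broadcast, 512-bit length, or registers 16..31.
    if z or bcast or aaa or ll > 1 or w:
        return None
    if not (r_hi and v_hi and x):
        return None

    last = (vvvv << 3) | (ll << 2) | pp
    if mm == 1 and b == 1:
        # Two-byte VEX prefix: C5, then R vvvv L pp.
        prefix = [0xC5, (r << 7) | last]
    else:
        # Three-byte VEX prefix: C4, then R X B mmmmm, then W vvvv L pp.
        prefix = [0xC4, (r << 7) | (x << 6) | (b << 5) | mm, (w << 7) | last]
    return prefix + list(enc[4:])
```

Applied to the unmasked vpaddw and vpabsw lines of the two pabs_w tests, this reproduces the bytes in the new CHECK lines, while the masked {%k1} forms are left alone.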

declare <8 x i16> @llvm.x86.avx512.mask.pmulhu.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pmulhu.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pmulhu_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_pmulhu_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmulhu_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmulhu_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmulhuw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe4,0xd1]		; CHECK-NEXT: vpmulhuw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe4,0xd1]
; CHECK-NEXT: vpmulhuw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe4,0xc1]		; CHECK-NEXT: vpmulhuw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe4,0xc1]
; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pmulhu.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmulhu.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmulhu.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmulhu.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pmulhu.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pmulhu.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pmulhu_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_pmulhu_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmulhu_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmulhu_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmulhuw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe4,0xd1]		; CHECK-NEXT: vpmulhuw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe4,0xd1]
; CHECK-NEXT: vpmulhuw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe4,0xc1]		; CHECK-NEXT: vpmulhuw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe4,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pmulhu.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmulhu.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmulhu.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmulhu.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.pmulh.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pmulh.w.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pmulh_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_pmulh_w_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmulh_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmulh_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmulhw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe5,0xd1]		; CHECK-NEXT: vpmulhw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe5,0xd1]
; CHECK-NEXT: vpmulhw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe5,0xc1]		; CHECK-NEXT: vpmulhw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe5,0xc1]
; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pmulh.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmulh.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmulh.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmulh.w.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pmulh.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pmulh.w.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pmulh_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_pmulh_w_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmulh_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmulh_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmulhw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe5,0xd1]		; CHECK-NEXT: vpmulhw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe5,0xd1]
; CHECK-NEXT: vpmulhw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xe5,0xc1]		; CHECK-NEXT: vpmulhw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe5,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pmulh.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmulh.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmulh.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmulh.w.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.pmul.hr.sw.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pmul.hr.sw.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pmulhr_sw_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_pmulhr_sw_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmulhr_sw_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmulhr_sw_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmulhrsw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x0b,0xd1]		; CHECK-NEXT: vpmulhrsw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x0b,0xd1]
; CHECK-NEXT: vpmulhrsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x0b,0xc1]		; CHECK-NEXT: vpmulhrsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x0b,0xc1]
; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pmul.hr.sw.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmul.hr.sw.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmul.hr.sw.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmul.hr.sw.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pmul.hr.sw.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pmul.hr.sw.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pmulhr_sw_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_pmulhr_sw_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmulhr_sw_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmulhr_sw_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmulhrsw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x0b,0xd1]		; CHECK-NEXT: vpmulhrsw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x0b,0xd1]
; CHECK-NEXT: vpmulhrsw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x0b,0xc1]		; CHECK-NEXT: vpmulhrsw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x0b,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pmul.hr.sw.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmul.hr.sw.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmul.hr.sw.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmul.hr.sw.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <16 x i8> @llvm.x86.avx512.mask.pmov.wb.128(<8 x i16>, <16 x i8>, i8)		declare <16 x i8> @llvm.x86.avx512.mask.pmov.wb.128(<8 x i16>, <16 x i8>, i8)

define <16 x i8>@test_int_x86_avx512_mask_pmov_wb_128(<8 x i16> %x0, <16 x i8> %x1, i8 %x2) {		define <16 x i8>@test_int_x86_avx512_mask_pmov_wb_128(<8 x i16> %x0, <16 x i8> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmov_wb_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmov_wb_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovwb %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x09,0x30,0xc1]		; CHECK-NEXT: vpmovwb %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x09,0x30,0xc1]
; CHECK-NEXT: vpmovwb %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0x89,0x30,0xc2]		; CHECK-NEXT: vpmovwb %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0x89,0x30,0xc2]
; CHECK-NEXT: vpmovwb %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x08,0x30,0xc0]		; CHECK-NEXT: vpmovwb %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x08,0x30,0xc0]
; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc1]		; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc1]
; CHECK-NEXT: vpaddb %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc2]		; CHECK-NEXT: vpaddb %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <16 x i8> @llvm.x86.avx512.mask.pmov.wb.128(<8 x i16> %x0, <16 x i8> %x1, i8 -1)		%res0 = call <16 x i8> @llvm.x86.avx512.mask.pmov.wb.128(<8 x i16> %x0, <16 x i8> %x1, i8 -1)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmov.wb.128(<8 x i16> %x0, <16 x i8> %x1, i8 %x2)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmov.wb.128(<8 x i16> %x0, <16 x i8> %x1, i8 %x2)
%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmov.wb.128(<8 x i16> %x0, <16 x i8> zeroinitializer, i8 %x2)		%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmov.wb.128(<8 x i16> %x0, <16 x i8> zeroinitializer, i8 %x2)
%res3 = add <16 x i8> %res0, %res1		%res3 = add <16 x i8> %res0, %res1
%res4 = add <16 x i8> %res3, %res2		%res4 = add <16 x i8> %res3, %res2
ret <16 x i8> %res4		ret <16 x i8> %res4
}		}
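
Note on the test above: the unmasked vpmovwb keeps its EVEX encoding even though it uses only low xmm registers and no mask, while the vpaddb lines are compressed. This is consistent with the table-driven design described in the summary: an instruction is only rewritten if a VEX counterpart is listed for it. A minimal Python sketch of that decision follows. The set `HAS_VEX_TWIN`, the function `can_compress`, and the use of assembly text are all hypothetical; the real pass operates on LLVM opcodes, and the set here is just the mnemonics that this section shows being compressed.

```python
import re

# Hypothetical subset: mnemonics that appear with "EVEX TO VEX Compression"
# in this part of the test file. The real table is keyed on LLVM opcodes.
HAS_VEX_TWIN = {
    "vpabsw", "vpaddb", "vpaddw", "vpaddd",
    "vpmulhuw", "vpmulhw", "vpmulhrsw",
    "vpmaddwd", "vpmaddubsw",
}


def can_compress(asm):
    """Decide from AT&T-syntax text whether an instruction may use VEX."""
    mnemonic, _, operands = asm.strip().partition(" ")
    if mnemonic not in HAS_VEX_TWIN:
        return False  # no VEX form listed, e.g. vpmovwb
    if "{" in operands:
        return False  # {%k1} or {z}: masking needs EVEX
    if "zmm" in operands:
        return False  # 512-bit registers need EVEX
    # Registers 16..31 are only reachable through EVEX.
    regs = [int(n) for n in re.findall(r"%[xy]mm(\d+)", operands)]
    return all(n < 16 for n in regs)
```

For example, `can_compress("vpaddb %xmm1, %xmm0, %xmm0")` is true, whereas `can_compress("vpmovwb %xmm0, %xmm0")` is false because the mnemonic is not in the set.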

define <16 x i8>@test_int_x86_avx512_mask_pmovs_wb_128(<8 x i16> %x0, <16 x i8> %x1, i8 %x2) {		define <16 x i8>@test_int_x86_avx512_mask_pmovs_wb_128(<8 x i16> %x0, <16 x i8> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovs_wb_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovs_wb_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovswb %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x09,0x20,0xc1]		; CHECK-NEXT: vpmovswb %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x09,0x20,0xc1]
; CHECK-NEXT: vpmovswb %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0x89,0x20,0xc2]		; CHECK-NEXT: vpmovswb %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0x89,0x20,0xc2]
; CHECK-NEXT: vpmovswb %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x08,0x20,0xc0]		; CHECK-NEXT: vpmovswb %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x08,0x20,0xc0]
; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc1]		; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc1]
; CHECK-NEXT: vpaddb %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc2]		; CHECK-NEXT: vpaddb %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <16 x i8> @llvm.x86.avx512.mask.pmovs.wb.128(<8 x i16> %x0, <16 x i8> %x1, i8 -1)		%res0 = call <16 x i8> @llvm.x86.avx512.mask.pmovs.wb.128(<8 x i16> %x0, <16 x i8> %x1, i8 -1)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmovs.wb.128(<8 x i16> %x0, <16 x i8> %x1, i8 %x2)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmovs.wb.128(<8 x i16> %x0, <16 x i8> %x1, i8 %x2)
%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmovs.wb.128(<8 x i16> %x0, <16 x i8> zeroinitializer, i8 %x2)		%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmovs.wb.128(<8 x i16> %x0, <16 x i8> zeroinitializer, i8 %x2)
%res3 = add <16 x i8> %res0, %res1		%res3 = add <16 x i8> %res0, %res1
%res4 = add <16 x i8> %res3, %res2		%res4 = add <16 x i8> %res3, %res2
ret <16 x i8> %res4		ret <16 x i8> %res4
}		}

define <16 x i8>@test_int_x86_avx512_mask_pmovus_wb_128(<8 x i16> %x0, <16 x i8> %x1, i8 %x2) {		define <16 x i8>@test_int_x86_avx512_mask_pmovus_wb_128(<8 x i16> %x0, <16 x i8> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovus_wb_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovus_wb_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovuswb %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x09,0x10,0xc1]		; CHECK-NEXT: vpmovuswb %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x09,0x10,0xc1]
; CHECK-NEXT: vpmovuswb %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0x89,0x10,0xc2]		; CHECK-NEXT: vpmovuswb %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0x89,0x10,0xc2]
; CHECK-NEXT: vpmovuswb %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x08,0x10,0xc0]		; CHECK-NEXT: vpmovuswb %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x08,0x10,0xc0]
; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc1]		; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc1]
; CHECK-NEXT: vpaddb %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc2]		; CHECK-NEXT: vpaddb %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <16 x i8> @llvm.x86.avx512.mask.pmovus.wb.128(<8 x i16> %x0, <16 x i8> %x1, i8 -1)		%res0 = call <16 x i8> @llvm.x86.avx512.mask.pmovus.wb.128(<8 x i16> %x0, <16 x i8> %x1, i8 -1)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmovus.wb.128(<8 x i16> %x0, <16 x i8> %x1, i8 %x2)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmovus.wb.128(<8 x i16> %x0, <16 x i8> %x1, i8 %x2)
%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmovus.wb.128(<8 x i16> %x0, <16 x i8> zeroinitializer, i8 %x2)		%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmovus.wb.128(<8 x i16> %x0, <16 x i8> zeroinitializer, i8 %x2)
%res3 = add <16 x i8> %res0, %res1		%res3 = add <16 x i8> %res0, %res1
%res4 = add <16 x i8> %res3, %res2		%res4 = add <16 x i8> %res3, %res2
ret <16 x i8> %res4		ret <16 x i8> %res4
}		}

define <16 x i8>@test_int_x86_avx512_mask_pmov_wb_256(<16 x i16> %x0, <16 x i8> %x1, i16 %x2) {		define <16 x i8>@test_int_x86_avx512_mask_pmov_wb_256(<16 x i16> %x0, <16 x i8> %x1, i16 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmov_wb_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmov_wb_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovwb %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x29,0x30,0xc1]		; CHECK-NEXT: vpmovwb %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x29,0x30,0xc1]
; CHECK-NEXT: vpmovwb %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0xa9,0x30,0xc2]		; CHECK-NEXT: vpmovwb %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0xa9,0x30,0xc2]
; CHECK-NEXT: vpmovwb %ymm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x28,0x30,0xc0]		; CHECK-NEXT: vpmovwb %ymm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x28,0x30,0xc0]
; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc1]		; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc1]
; CHECK-NEXT: vpaddb %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc2]		; CHECK-NEXT: vpaddb %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <16 x i8> @llvm.x86.avx512.mask.pmov.wb.256(<16 x i16> %x0, <16 x i8> %x1, i16 -1)		%res0 = call <16 x i8> @llvm.x86.avx512.mask.pmov.wb.256(<16 x i16> %x0, <16 x i8> %x1, i16 -1)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmov.wb.256(<16 x i16> %x0, <16 x i8> %x1, i16 %x2)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmov.wb.256(<16 x i16> %x0, <16 x i8> %x1, i16 %x2)
%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmov.wb.256(<16 x i16> %x0, <16 x i8> zeroinitializer, i16 %x2)		%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmov.wb.256(<16 x i16> %x0, <16 x i8> zeroinitializer, i16 %x2)
%res3 = add <16 x i8> %res0, %res1		%res3 = add <16 x i8> %res0, %res1
%res4 = add <16 x i8> %res3, %res2		%res4 = add <16 x i8> %res3, %res2
ret <16 x i8> %res4		ret <16 x i8> %res4
}		}

define <16 x i8>@test_int_x86_avx512_mask_pmovs_wb_256(<16 x i16> %x0, <16 x i8> %x1, i16 %x2) {		define <16 x i8>@test_int_x86_avx512_mask_pmovs_wb_256(<16 x i16> %x0, <16 x i8> %x1, i16 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovs_wb_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovs_wb_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovswb %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x29,0x20,0xc1]		; CHECK-NEXT: vpmovswb %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x29,0x20,0xc1]
; CHECK-NEXT: vpmovswb %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0xa9,0x20,0xc2]		; CHECK-NEXT: vpmovswb %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0xa9,0x20,0xc2]
; CHECK-NEXT: vpmovswb %ymm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x28,0x20,0xc0]		; CHECK-NEXT: vpmovswb %ymm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x28,0x20,0xc0]
; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc1]		; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc1]
; CHECK-NEXT: vpaddb %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc2]		; CHECK-NEXT: vpaddb %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <16 x i8> @llvm.x86.avx512.mask.pmovs.wb.256(<16 x i16> %x0, <16 x i8> %x1, i16 -1)		%res0 = call <16 x i8> @llvm.x86.avx512.mask.pmovs.wb.256(<16 x i16> %x0, <16 x i8> %x1, i16 -1)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmovs.wb.256(<16 x i16> %x0, <16 x i8> %x1, i16 %x2)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmovs.wb.256(<16 x i16> %x0, <16 x i8> %x1, i16 %x2)
%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmovs.wb.256(<16 x i16> %x0, <16 x i8> zeroinitializer, i16 %x2)		%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmovs.wb.256(<16 x i16> %x0, <16 x i8> zeroinitializer, i16 %x2)
%res3 = add <16 x i8> %res0, %res1		%res3 = add <16 x i8> %res0, %res1
%res4 = add <16 x i8> %res3, %res2		%res4 = add <16 x i8> %res3, %res2
ret <16 x i8> %res4		ret <16 x i8> %res4
}		}

define <16 x i8>@test_int_x86_avx512_mask_pmovus_wb_256(<16 x i16> %x0, <16 x i8> %x1, i16 %x2) {		define <16 x i8>@test_int_x86_avx512_mask_pmovus_wb_256(<16 x i16> %x0, <16 x i8> %x1, i16 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovus_wb_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovus_wb_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovuswb %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x29,0x10,0xc1]		; CHECK-NEXT: vpmovuswb %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x29,0x10,0xc1]
; CHECK-NEXT: vpmovuswb %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0xa9,0x10,0xc2]		; CHECK-NEXT: vpmovuswb %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0xa9,0x10,0xc2]
; CHECK-NEXT: vpmovuswb %ymm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x28,0x10,0xc0]		; CHECK-NEXT: vpmovuswb %ymm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x28,0x10,0xc0]
; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc1]		; CHECK-NEXT: vpaddb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc1]
; CHECK-NEXT: vpaddb %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfc,0xc2]		; CHECK-NEXT: vpaddb %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfc,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <16 x i8> @llvm.x86.avx512.mask.pmovus.wb.256(<16 x i16> %x0, <16 x i8> %x1, i16 -1)		%res0 = call <16 x i8> @llvm.x86.avx512.mask.pmovus.wb.256(<16 x i16> %x0, <16 x i8> %x1, i16 -1)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmovus.wb.256(<16 x i16> %x0, <16 x i8> %x1, i16 %x2)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmovus.wb.256(<16 x i16> %x0, <16 x i8> %x1, i16 %x2)
%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmovus.wb.256(<16 x i16> %x0, <16 x i8> zeroinitializer, i16 %x2)		%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmovus.wb.256(<16 x i16> %x0, <16 x i8> zeroinitializer, i16 %x2)
%res3 = add <16 x i8> %res0, %res1		%res3 = add <16 x i8> %res0, %res1
%res4 = add <16 x i8> %res3, %res2		%res4 = add <16 x i8> %res3, %res2
ret <16 x i8> %res4		ret <16 x i8> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pmaddw.d.128(<8 x i16>, <8 x i16>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pmaddw.d.128(<8 x i16>, <8 x i16>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pmaddw_d_128(<8 x i16> %x0, <8 x i16> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_pmaddw_d_128(<8 x i16> %x0, <8 x i16> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaddw_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaddw_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaddwd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xf5,0xd1]		; CHECK-NEXT: vpmaddwd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xf5,0xd1]
; CHECK-NEXT: vpmaddwd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xf5,0xc1]		; CHECK-NEXT: vpmaddwd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf5,0xc1]
; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pmaddw.d.128(<8 x i16> %x0, <8 x i16> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.pmaddw.d.128(<8 x i16> %x0, <8 x i16> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmaddw.d.128(<8 x i16> %x0, <8 x i16> %x1, <4 x i32> %x2, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmaddw.d.128(<8 x i16> %x0, <8 x i16> %x1, <4 x i32> %x2, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pmaddw.d.256(<16 x i16>, <16 x i16>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pmaddw.d.256(<16 x i16>, <16 x i16>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_pmaddw_d_256(<16 x i16> %x0, <16 x i16> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_pmaddw_d_256(<16 x i16> %x0, <16 x i16> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaddw_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaddw_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaddwd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xf5,0xd1]		; CHECK-NEXT: vpmaddwd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xf5,0xd1]
; CHECK-NEXT: vpmaddwd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xf5,0xc1]		; CHECK-NEXT: vpmaddwd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf5,0xc1]
; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pmaddw.d.256(<16 x i16> %x0, <16 x i16> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.pmaddw.d.256(<16 x i16> %x0, <16 x i16> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmaddw.d.256(<16 x i16> %x0, <16 x i16> %x1, <8 x i32> %x2, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmaddw.d.256(<16 x i16> %x0, <16 x i16> %x1, <8 x i32> %x2, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.pmaddubs.w.128(<16 x i8>, <16 x i8>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pmaddubs.w.128(<16 x i8>, <16 x i8>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pmaddubs_w_128(<16 x i8> %x0, <16 x i8> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_pmaddubs_w_128(<16 x i8> %x0, <16 x i8> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaddubs_w_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaddubs_w_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaddubsw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x04,0xd1]		; CHECK-NEXT: vpmaddubsw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x04,0xd1]
; CHECK-NEXT: vpmaddubsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x04,0xc1]		; CHECK-NEXT: vpmaddubsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x04,0xc1]
; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pmaddubs.w.128(<16 x i8> %x0, <16 x i8> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.pmaddubs.w.128(<16 x i8> %x0, <16 x i8> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmaddubs.w.128(<16 x i8> %x0, <16 x i8> %x1, <8 x i16> %x2, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pmaddubs.w.128(<16 x i8> %x0, <16 x i8> %x1, <8 x i16> %x2, i8 -1)
%res2 = add <8 x i16> %res, %res1		%res2 = add <8 x i16> %res, %res1
ret <8 x i16> %res2		ret <8 x i16> %res2
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pmaddubs.w.256(<32 x i8>, <32 x i8>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pmaddubs.w.256(<32 x i8>, <32 x i8>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pmaddubs_w_256(<32 x i8> %x0, <32 x i8> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_pmaddubs_w_256(<32 x i8> %x0, <32 x i8> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaddubs_w_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaddubs_w_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaddubsw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x04,0xd1]		; CHECK-NEXT: vpmaddubsw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x04,0xd1]
; CHECK-NEXT: vpmaddubsw %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x04,0xc1]		; CHECK-NEXT: vpmaddubsw %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x04,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pmaddubs.w.256(<32 x i8> %x0, <32 x i8> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.pmaddubs.w.256(<32 x i8> %x0, <32 x i8> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmaddubs.w.256(<32 x i8> %x0, <32 x i8> %x1, <16 x i16> %x2, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pmaddubs.w.256(<32 x i8> %x0, <32 x i8> %x1, <16 x i16> %x2, i16 -1)
%res2 = add <16 x i16> %res, %res1		%res2 = add <16 x i16> %res, %res1
ret <16 x i16> %res2		ret <16 x i16> %res2
}		}

declare <8 x i16> @llvm.x86.avx512.mask.dbpsadbw.128(<16 x i8>, <16 x i8>, i32, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.dbpsadbw.128(<16 x i8>, <16 x i8>, i32, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_dbpsadbw_128(<16 x i8> %x0, <16 x i8> %x1, <8 x i16> %x3, i8 %x4) {		define <8 x i16>@test_int_x86_avx512_mask_dbpsadbw_128(<16 x i8> %x0, <16 x i8> %x1, <8 x i16> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_dbpsadbw_128:		; CHECK-LABEL: test_int_x86_avx512_mask_dbpsadbw_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vdbpsadbw $2, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x42,0xd1,0x02]		; CHECK-NEXT: vdbpsadbw $2, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x42,0xd1,0x02]
; CHECK-NEXT: vdbpsadbw $2, %xmm1, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0x89,0x42,0xd9,0x02]		; CHECK-NEXT: vdbpsadbw $2, %xmm1, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0x89,0x42,0xd9,0x02]
; CHECK-NEXT: vdbpsadbw $2, %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x42,0xc1,0x02]		; CHECK-NEXT: vdbpsadbw $2, %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x42,0xc1,0x02]
; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xcb]		; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xcb]
; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc1]		; CHECK-NEXT: vpaddw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.dbpsadbw.128(<16 x i8> %x0, <16 x i8> %x1, i32 2, <8 x i16> %x3, i8 %x4)		%res = call <8 x i16> @llvm.x86.avx512.mask.dbpsadbw.128(<16 x i8> %x0, <16 x i8> %x1, i32 2, <8 x i16> %x3, i8 %x4)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.dbpsadbw.128(<16 x i8> %x0, <16 x i8> %x1, i32 2, <8 x i16> zeroinitializer, i8 %x4)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.dbpsadbw.128(<16 x i8> %x0, <16 x i8> %x1, i32 2, <8 x i16> zeroinitializer, i8 %x4)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.dbpsadbw.128(<16 x i8> %x0, <16 x i8> %x1, i32 2, <8 x i16> %x3, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.dbpsadbw.128(<16 x i8> %x0, <16 x i8> %x1, i32 2, <8 x i16> %x3, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res2, %res3		%res4 = add <8 x i16> %res2, %res3
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.dbpsadbw.256(<32 x i8>, <32 x i8>, i32, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.dbpsadbw.256(<32 x i8>, <32 x i8>, i32, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_dbpsadbw_256(<32 x i8> %x0, <32 x i8> %x1, <16 x i16> %x3, i16 %x4) {		define <16 x i16>@test_int_x86_avx512_mask_dbpsadbw_256(<32 x i8> %x0, <32 x i8> %x1, <16 x i16> %x3, i16 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_dbpsadbw_256:		; CHECK-LABEL: test_int_x86_avx512_mask_dbpsadbw_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vdbpsadbw $2, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x42,0xd1,0x02]		; CHECK-NEXT: vdbpsadbw $2, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x42,0xd1,0x02]
; CHECK-NEXT: vdbpsadbw $2, %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x42,0xd9,0x02]		; CHECK-NEXT: vdbpsadbw $2, %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x42,0xd9,0x02]
; CHECK-NEXT: vdbpsadbw $2, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x42,0xc1,0x02]		; CHECK-NEXT: vdbpsadbw $2, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x42,0xc1,0x02]
; CHECK-NEXT: vpaddw %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xcb]		; CHECK-NEXT: vpaddw %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xcb]
; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.dbpsadbw.256(<32 x i8> %x0, <32 x i8> %x1, i32 2, <16 x i16> %x3, i16 %x4)		%res = call <16 x i16> @llvm.x86.avx512.mask.dbpsadbw.256(<32 x i8> %x0, <32 x i8> %x1, i32 2, <16 x i16> %x3, i16 %x4)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.dbpsadbw.256(<32 x i8> %x0, <32 x i8> %x1, i32 2, <16 x i16> zeroinitializer, i16 %x4)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.dbpsadbw.256(<32 x i8> %x0, <32 x i8> %x1, i32 2, <16 x i16> zeroinitializer, i16 %x4)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.dbpsadbw.256(<32 x i8> %x0, <32 x i8> %x1, i32 2, <16 x i16> %x3, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.dbpsadbw.256(<32 x i8> %x0, <32 x i8> %x1, i32 2, <16 x i16> %x3, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}
[98 lines not shown]

define <16 x i16>@test_int_x86_avx512_mask_psrlv16_hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_psrlv16_hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrlv16_hi:		; CHECK-LABEL: test_int_x86_avx512_mask_psrlv16_hi:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlvw %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0xfd,0x28,0x10,0xd9]		; CHECK-NEXT: vpsrlvw %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0xfd,0x28,0x10,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrlvw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x10,0xd1]		; CHECK-NEXT: vpsrlvw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x10,0xd1]
; CHECK-NEXT: vpsrlvw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x10,0xc1]		; CHECK-NEXT: vpsrlvw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x10,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: vpaddw %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc3]		; CHECK-NEXT: vpaddw %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psrlv16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.psrlv16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.psrlv16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.psrlv16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.psrlv16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.psrlv16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_psrlv8_hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_psrlv8_hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrlv8_hi:		; CHECK-LABEL: test_int_x86_avx512_mask_psrlv8_hi:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlvw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x10,0xd9]		; CHECK-NEXT: vpsrlvw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x10,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrlvw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x10,0xd1]		; CHECK-NEXT: vpsrlvw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x10,0xd1]
; CHECK-NEXT: vpsrlvw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x10,0xc1]		; CHECK-NEXT: vpsrlvw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x10,0xc1]
; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc0]
; CHECK-NEXT: vpaddw %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc3]		; CHECK-NEXT: vpaddw %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.psrlv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.psrav16.hi(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.psrav16.hi(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_psrav16_hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_psrav16_hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrav16_hi:		; CHECK-LABEL: test_int_x86_avx512_mask_psrav16_hi:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsravw %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0xfd,0x28,0x11,0xd9]		; CHECK-NEXT: vpsravw %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0xfd,0x28,0x11,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsravw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x11,0xd1]		; CHECK-NEXT: vpsravw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x11,0xd1]
; CHECK-NEXT: vpsravw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x11,0xc1]		; CHECK-NEXT: vpsravw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x11,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: vpaddw %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc3]		; CHECK-NEXT: vpaddw %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psrav16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.psrav16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.psrav16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.psrav16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.psrav16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.psrav16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.mask.psrav8.hi(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.psrav8.hi(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_psrav8_hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_psrav8_hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrav8_hi:		; CHECK-LABEL: test_int_x86_avx512_mask_psrav8_hi:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsravw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x11,0xd9]		; CHECK-NEXT: vpsravw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x11,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsravw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x11,0xd1]		; CHECK-NEXT: vpsravw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x11,0xd1]
; CHECK-NEXT: vpsravw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x11,0xc1]		; CHECK-NEXT: vpsravw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x11,0xc1]
; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc0]
; CHECK-NEXT: vpaddw %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc3]		; CHECK-NEXT: vpaddw %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psrav8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.psrav8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.psrav8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.psrav8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.psrav8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.psrav8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.psllv16.hi(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.psllv16.hi(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_psllv16_hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_psllv16_hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psllv16_hi:		; CHECK-LABEL: test_int_x86_avx512_mask_psllv16_hi:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsllvw %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0xfd,0x28,0x12,0xd9]		; CHECK-NEXT: vpsllvw %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0xfd,0x28,0x12,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsllvw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x12,0xd1]		; CHECK-NEXT: vpsllvw %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x12,0xd1]
; CHECK-NEXT: vpsllvw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x12,0xc1]		; CHECK-NEXT: vpsllvw %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x12,0xc1]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: vpaddw %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfd,0xc3]		; CHECK-NEXT: vpaddw %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.psllv16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.psllv16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.psllv16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.psllv16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.psllv16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.psllv16.hi(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_psllv8_hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_psllv8_hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psllv8_hi:		; CHECK-LABEL: test_int_x86_avx512_mask_psllv8_hi:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsllvw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x12,0xd9]		; CHECK-NEXT: vpsllvw %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x12,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsllvw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x12,0xd1]		; CHECK-NEXT: vpsllvw %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x12,0xd1]
; CHECK-NEXT: vpsllvw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x12,0xc1]		; CHECK-NEXT: vpsllvw %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x12,0xc1]
; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc0]
; CHECK-NEXT: vpaddw %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfd,0xc3]		; CHECK-NEXT: vpaddw %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfd,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.psllv8.hi(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.mask.permvar.hi.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.permvar.hi.128(<8 x i16>, <8 x i16>, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_permvar_hi_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {		define <8 x i16>@test_int_x86_avx512_mask_permvar_hi_128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_permvar_hi_128:		; CHECK-LABEL: test_int_x86_avx512_mask_permvar_hi_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermw %xmm0, %xmm1, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xf5,0x09,0x8d,0xd0]		; CHECK-NEXT: vpermw %xmm0, %xmm1, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xf5,0x09,0x8d,0xd0]
; CHECK-NEXT: vpermw %xmm0, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xf5,0x89,0x8d,0xd8]		; CHECK-NEXT: vpermw %xmm0, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xf5,0x89,0x8d,0xd8]
; CHECK-NEXT: vpermw %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf2,0xf5,0x08,0x8d,0xc0]		; CHECK-NEXT: vpermw %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf2,0xf5,0x08,0x8d,0xc0]
; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xcb]		; CHECK-NEXT: vpaddw %xmm3, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xcb]
; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.permvar.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)		%res = call <8 x i16> @llvm.x86.avx512.mask.permvar.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 %x3)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.permvar.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.permvar.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> zeroinitializer, i8 %x3)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.permvar.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.permvar.hi.128(<8 x i16> %x0, <8 x i16> %x1, <8 x i16> %x2, i8 -1)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res3, %res2		%res4 = add <8 x i16> %res3, %res2
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.permvar.hi.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.permvar.hi.256(<16 x i16>, <16 x i16>, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_permvar_hi_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {		define <16 x i16>@test_int_x86_avx512_mask_permvar_hi_256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_permvar_hi_256:		; CHECK-LABEL: test_int_x86_avx512_mask_permvar_hi_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermw %ymm0, %ymm1, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xf5,0x29,0x8d,0xd0]		; CHECK-NEXT: vpermw %ymm0, %ymm1, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xf5,0x29,0x8d,0xd0]
; CHECK-NEXT: vpermw %ymm0, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xf5,0xa9,0x8d,0xd8]		; CHECK-NEXT: vpermw %ymm0, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xf5,0xa9,0x8d,0xd8]
; CHECK-NEXT: vpermw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0xf5,0x28,0x8d,0xc0]		; CHECK-NEXT: vpermw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0xf5,0x28,0x8d,0xc0]
; CHECK-NEXT: vpaddw %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xcb]		; CHECK-NEXT: vpaddw %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xcb]
; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.permvar.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)		%res = call <16 x i16> @llvm.x86.avx512.mask.permvar.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 %x3)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.permvar.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.permvar.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> zeroinitializer, i16 %x3)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.permvar.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.permvar.hi.256(<16 x i16> %x0, <16 x i16> %x1, <16 x i16> %x2, i16 -1)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res3, %res2		%res4 = add <16 x i16> %res3, %res2
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}
[150 lines not shown]

define <32 x i8>@test_int_x86_avx512_mask_pbroadcast_b_gpr_256(i8 %x0, <32 x i8> %x1, i32 %mask) {		define <32 x i8>@test_int_x86_avx512_mask_pbroadcast_b_gpr_256(i8 %x0, <32 x i8> %x1, i32 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_b_gpr_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_b_gpr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]		; CHECK-NEXT: kmovd %esi, %k1 ## encoding: [0xc5,0xfb,0x92,0xce]
; CHECK-NEXT: vpbroadcastb %dil, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x7a,0xc7]		; CHECK-NEXT: vpbroadcastb %dil, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x7a,0xc7]
; CHECK-NEXT: vpbroadcastb %dil, %ymm1 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x7a,0xcf]		; CHECK-NEXT: vpbroadcastb %dil, %ymm1 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x7a,0xcf]
; CHECK-NEXT: vpbroadcastb %dil, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x7a,0xd7]		; CHECK-NEXT: vpbroadcastb %dil, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x7a,0xd7]
; CHECK-NEXT: vpaddb %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc0]		; CHECK-NEXT: vpaddb %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc0]
; CHECK-NEXT: vpaddb %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfc,0xc0]		; CHECK-NEXT: vpaddb %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfc,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <32 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.256(i8 %x0, <32 x i8> %x1, i32 -1)		%res = call <32 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.256(i8 %x0, <32 x i8> %x1, i32 -1)
%res1 = call <32 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.256(i8 %x0, <32 x i8> %x1, i32 %mask)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.256(i8 %x0, <32 x i8> %x1, i32 %mask)
%res2 = call <32 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.256(i8 %x0, <32 x i8> zeroinitializer, i32 %mask)		%res2 = call <32 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.256(i8 %x0, <32 x i8> zeroinitializer, i32 %mask)
%res3 = add <32 x i8> %res, %res1		%res3 = add <32 x i8> %res, %res1
%res4 = add <32 x i8> %res2, %res3		%res4 = add <32 x i8> %res2, %res3
ret <32 x i8> %res4		ret <32 x i8> %res4
}		}

declare <16 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.128(i8, <16 x i8>, i16)		declare <16 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.128(i8, <16 x i8>, i16)

define <16 x i8>@test_int_x86_avx512_mask_pbroadcast_b_gpr_128(i8 %x0, <16 x i8> %x1, i16 %mask) {		define <16 x i8>@test_int_x86_avx512_mask_pbroadcast_b_gpr_128(i8 %x0, <16 x i8> %x1, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_b_gpr_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_b_gpr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpbroadcastb %dil, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x7a,0xc7]		; CHECK-NEXT: vpbroadcastb %dil, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x7a,0xc7]
; CHECK-NEXT: vpbroadcastb %dil, %xmm1 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x7a,0xcf]		; CHECK-NEXT: vpbroadcastb %dil, %xmm1 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x7a,0xcf]
; CHECK-NEXT: vpbroadcastb %dil, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x7a,0xd7]		; CHECK-NEXT: vpbroadcastb %dil, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x7a,0xd7]
; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc0]		; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc0]
; CHECK-NEXT: vpaddb %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfc,0xc0]		; CHECK-NEXT: vpaddb %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfc,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.128(i8 %x0, <16 x i8> %x1, i16 -1)		%res = call <16 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.128(i8 %x0, <16 x i8> %x1, i16 -1)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.128(i8 %x0, <16 x i8> %x1, i16 %mask)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.128(i8 %x0, <16 x i8> %x1, i16 %mask)
%res2 = call <16 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.128(i8 %x0, <16 x i8> zeroinitializer, i16 %mask)		%res2 = call <16 x i8> @llvm.x86.avx512.mask.pbroadcast.b.gpr.128(i8 %x0, <16 x i8> zeroinitializer, i16 %mask)
%res3 = add <16 x i8> %res, %res1		%res3 = add <16 x i8> %res, %res1
%res4 = add <16 x i8> %res2, %res3		%res4 = add <16 x i8> %res2, %res3
ret <16 x i8> %res4		ret <16 x i8> %res4
}		}

declare <16 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.256(i16, <16 x i16>, i16)		declare <16 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.256(i16, <16 x i16>, i16)

define <16 x i16>@test_int_x86_avx512_mask_pbroadcast_w_gpr_256(i16 %x0, <16 x i16> %x1, i16 %mask) {		define <16 x i16>@test_int_x86_avx512_mask_pbroadcast_w_gpr_256(i16 %x0, <16 x i16> %x1, i16 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_w_gpr_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_w_gpr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpbroadcastw %di, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x7b,0xc7]		; CHECK-NEXT: vpbroadcastw %di, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x7b,0xc7]
; CHECK-NEXT: vpbroadcastw %di, %ymm1 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x7b,0xcf]		; CHECK-NEXT: vpbroadcastw %di, %ymm1 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x7b,0xcf]
; CHECK-NEXT: vpbroadcastw %di, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x7b,0xd7]		; CHECK-NEXT: vpbroadcastw %di, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x7b,0xd7]
; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfd,0xc0]
; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfd,0xc0]		; CHECK-NEXT: vpaddw %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <16 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.256(i16 %x0, <16 x i16> %x1, i16 -1)		%res = call <16 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.256(i16 %x0, <16 x i16> %x1, i16 -1)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.256(i16 %x0, <16 x i16> %x1, i16 %mask)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.256(i16 %x0, <16 x i16> %x1, i16 %mask)
%res2 = call <16 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.256(i16 %x0, <16 x i16> zeroinitializer, i16 %mask)		%res2 = call <16 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.256(i16 %x0, <16 x i16> zeroinitializer, i16 %mask)
%res3 = add <16 x i16> %res, %res1		%res3 = add <16 x i16> %res, %res1
%res4 = add <16 x i16> %res2, %res3		%res4 = add <16 x i16> %res2, %res3
ret <16 x i16> %res4		ret <16 x i16> %res4
}		}

declare <8 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.128(i16, <8 x i16>, i8)		declare <8 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.128(i16, <8 x i16>, i8)

define <8 x i16>@test_int_x86_avx512_mask_pbroadcast_w_gpr_128(i16 %x0, <8 x i16> %x1, i8 %mask) {		define <8 x i16>@test_int_x86_avx512_mask_pbroadcast_w_gpr_128(i16 %x0, <8 x i16> %x1, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_w_gpr_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_w_gpr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpbroadcastw %di, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x7b,0xc7]		; CHECK-NEXT: vpbroadcastw %di, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x7b,0xc7]
; CHECK-NEXT: vpbroadcastw %di, %xmm1 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x7b,0xcf]		; CHECK-NEXT: vpbroadcastw %di, %xmm1 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x7b,0xcf]
; CHECK-NEXT: vpbroadcastw %di, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x7b,0xd7]		; CHECK-NEXT: vpbroadcastw %di, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x7b,0xd7]
; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfd,0xc0]
; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.128(i16 %x0, <8 x i16> %x1, i8 -1)		%res = call <8 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.128(i16 %x0, <8 x i16> %x1, i8 -1)
%res1 = call <8 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.128(i16 %x0, <8 x i16> %x1, i8 %mask)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.128(i16 %x0, <8 x i16> %x1, i8 %mask)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.128(i16 %x0, <8 x i16> zeroinitializer, i8 %mask)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.pbroadcast.w.gpr.128(i16 %x0, <8 x i16> zeroinitializer, i8 %mask)
%res3 = add <8 x i16> %res, %res1		%res3 = add <8 x i16> %res, %res1
%res4 = add <8 x i16> %res2, %res3		%res4 = add <8 x i16> %res2, %res3
ret <8 x i16> %res4		ret <8 x i16> %res4
}		}

llvm/trunk/test/CodeGen/X86/avx512bwvl-mov.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -march=x86-64 -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512bw -mattr=+avx512vl --show-mc-encoding | FileCheck %s			; RUN: llc < %s -march=x86-64 -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512bw -mattr=+avx512vl --show-mc-encoding | FileCheck %s

	define <32 x i8> @test_256_1(i8 * %addr) {			define <32 x i8> @test_256_1(i8 * %addr) {
	; CHECK-LABEL: test_256_1:			; CHECK-LABEL: test_256_1:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovdqu8 (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7f,0x28,0x6f,0x07]			; CHECK-NEXT: vmovdqu (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <32 x i8>*			%vaddr = bitcast i8* %addr to <32 x i8>*
	%res = load <32 x i8>, <32 x i8>* %vaddr, align 1			%res = load <32 x i8>, <32 x i8>* %vaddr, align 1
	ret <32 x i8>%res			ret <32 x i8>%res
	}			}

	define void @test_256_2(i8 * %addr, <32 x i8> %data) {			define void @test_256_2(i8 * %addr, <32 x i8> %data) {
	; CHECK-LABEL: test_256_2:			; CHECK-LABEL: test_256_2:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovdqu8 %ymm0, (%rdi) ## encoding: [0x62,0xf1,0x7f,0x28,0x7f,0x07]			; CHECK-NEXT: vmovdqu %ymm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x7f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <32 x i8>*			%vaddr = bitcast i8* %addr to <32 x i8>*
	store <32 x i8>%data, <32 x i8>* %vaddr, align 1			store <32 x i8>%data, <32 x i8>* %vaddr, align 1
	ret void			ret void
	}			}

	define <32 x i8> @test_256_3(i8 * %addr, <32 x i8> %old, <32 x i8> %mask1) {			define <32 x i8> @test_256_3(i8 * %addr, <32 x i8> %old, <32 x i8> %mask1) {
	; CHECK-LABEL: test_256_3:			; CHECK-LABEL: test_256_3:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0xef,0xd2]			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqb %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0x75,0x28,0x3f,0xca,0x04]			; CHECK-NEXT: vpcmpneqb %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0x75,0x28,0x3f,0xca,0x04]
	; CHECK-NEXT: vpblendmb (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x66,0x07]			; CHECK-NEXT: vpblendmb (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x66,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <32 x i8> %mask1, zeroinitializer			%mask = icmp ne <32 x i8> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <32 x i8>*			%vaddr = bitcast i8* %addr to <32 x i8>*
	%r = load <32 x i8>, <32 x i8>* %vaddr, align 1			%r = load <32 x i8>, <32 x i8>* %vaddr, align 1
	%res = select <32 x i1> %mask, <32 x i8> %r, <32 x i8> %old			%res = select <32 x i1> %mask, <32 x i8> %r, <32 x i8> %old
	ret <32 x i8>%res			ret <32 x i8>%res
	}			}

	define <32 x i8> @test_256_4(i8 * %addr, <32 x i8> %mask1) {			define <32 x i8> @test_256_4(i8 * %addr, <32 x i8> %mask1) {
	; CHECK-LABEL: test_256_4:			; CHECK-LABEL: test_256_4:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm1, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xef,0xc9]			; CHECK-NEXT: vpxor %ymm1, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqb %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x28,0x3f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqb %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x28,0x3f,0xc9,0x04]
	; CHECK-NEXT: vmovdqu8 (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7f,0xa9,0x6f,0x07]			; CHECK-NEXT: vmovdqu8 (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7f,0xa9,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <32 x i8> %mask1, zeroinitializer			%mask = icmp ne <32 x i8> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <32 x i8>*			%vaddr = bitcast i8* %addr to <32 x i8>*
	%r = load <32 x i8>, <32 x i8>* %vaddr, align 1			%r = load <32 x i8>, <32 x i8>* %vaddr, align 1
	%res = select <32 x i1> %mask, <32 x i8> %r, <32 x i8> zeroinitializer			%res = select <32 x i1> %mask, <32 x i8> %r, <32 x i8> zeroinitializer
	ret <32 x i8>%res			ret <32 x i8>%res
	}			}

	define <16 x i16> @test_256_5(i8 * %addr) {			define <16 x i16> @test_256_5(i8 * %addr) {
	; CHECK-LABEL: test_256_5:			; CHECK-LABEL: test_256_5:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovdqu16 (%rdi), %ymm0 ## encoding: [0x62,0xf1,0xff,0x28,0x6f,0x07]			; CHECK-NEXT: vmovdqu (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <16 x i16>*			%vaddr = bitcast i8* %addr to <16 x i16>*
	%res = load <16 x i16>, <16 x i16>* %vaddr, align 1			%res = load <16 x i16>, <16 x i16>* %vaddr, align 1
	ret <16 x i16>%res			ret <16 x i16>%res
	}			}

	define void @test_256_6(i8 * %addr, <16 x i16> %data) {			define void @test_256_6(i8 * %addr, <16 x i16> %data) {
	; CHECK-LABEL: test_256_6:			; CHECK-LABEL: test_256_6:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovdqu16 %ymm0, (%rdi) ## encoding: [0x62,0xf1,0xff,0x28,0x7f,0x07]			; CHECK-NEXT: vmovdqu %ymm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x7f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <16 x i16>*			%vaddr = bitcast i8* %addr to <16 x i16>*
	store <16 x i16>%data, <16 x i16>* %vaddr, align 1			store <16 x i16>%data, <16 x i16>* %vaddr, align 1
	ret void			ret void
	}			}

	define <16 x i16> @test_256_7(i8 * %addr, <16 x i16> %old, <16 x i16> %mask1) {			define <16 x i16> @test_256_7(i8 * %addr, <16 x i16> %old, <16 x i16> %mask1) {
	; CHECK-LABEL: test_256_7:			; CHECK-LABEL: test_256_7:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0xef,0xd2]			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqw %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x28,0x3f,0xca,0x04]			; CHECK-NEXT: vpcmpneqw %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x28,0x3f,0xca,0x04]
	; CHECK-NEXT: vpblendmw (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x66,0x07]			; CHECK-NEXT: vpblendmw (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x66,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <16 x i16> %mask1, zeroinitializer			%mask = icmp ne <16 x i16> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <16 x i16>*			%vaddr = bitcast i8* %addr to <16 x i16>*
	%r = load <16 x i16>, <16 x i16>* %vaddr, align 1			%r = load <16 x i16>, <16 x i16>* %vaddr, align 1
	%res = select <16 x i1> %mask, <16 x i16> %r, <16 x i16> %old			%res = select <16 x i1> %mask, <16 x i16> %r, <16 x i16> %old
	ret <16 x i16>%res			ret <16 x i16>%res
	}			}

	define <16 x i16> @test_256_8(i8 * %addr, <16 x i16> %mask1) {			define <16 x i16> @test_256_8(i8 * %addr, <16 x i16> %mask1) {
	; CHECK-LABEL: test_256_8:			; CHECK-LABEL: test_256_8:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm1, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xef,0xc9]			; CHECK-NEXT: vpxor %ymm1, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqw %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqw %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x3f,0xc9,0x04]
	; CHECK-NEXT: vmovdqu16 (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xff,0xa9,0x6f,0x07]			; CHECK-NEXT: vmovdqu16 (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xff,0xa9,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <16 x i16> %mask1, zeroinitializer			%mask = icmp ne <16 x i16> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <16 x i16>*			%vaddr = bitcast i8* %addr to <16 x i16>*
	%r = load <16 x i16>, <16 x i16>* %vaddr, align 1			%r = load <16 x i16>, <16 x i16>* %vaddr, align 1
	%res = select <16 x i1> %mask, <16 x i16> %r, <16 x i16> zeroinitializer			%res = select <16 x i1> %mask, <16 x i16> %r, <16 x i16> zeroinitializer
	ret <16 x i16>%res			ret <16 x i16>%res
	}			}

	define <16 x i8> @test_128_1(i8 * %addr) {			define <16 x i8> @test_128_1(i8 * %addr) {
	; CHECK-LABEL: test_128_1:			; CHECK-LABEL: test_128_1:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovdqu8 (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7f,0x08,0x6f,0x07]			; CHECK-NEXT: vmovdqu (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <16 x i8>*			%vaddr = bitcast i8* %addr to <16 x i8>*
	%res = load <16 x i8>, <16 x i8>* %vaddr, align 1			%res = load <16 x i8>, <16 x i8>* %vaddr, align 1
	ret <16 x i8>%res			ret <16 x i8>%res
	}			}

	define void @test_128_2(i8 * %addr, <16 x i8> %data) {			define void @test_128_2(i8 * %addr, <16 x i8> %data) {
	; CHECK-LABEL: test_128_2:			; CHECK-LABEL: test_128_2:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovdqu8 %xmm0, (%rdi) ## encoding: [0x62,0xf1,0x7f,0x08,0x7f,0x07]			; CHECK-NEXT: vmovdqu %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <16 x i8>*			%vaddr = bitcast i8* %addr to <16 x i8>*
	store <16 x i8>%data, <16 x i8>* %vaddr, align 1			store <16 x i8>%data, <16 x i8>* %vaddr, align 1
	ret void			ret void
	}			}

	define <16 x i8> @test_128_3(i8 * %addr, <16 x i8> %old, <16 x i8> %mask1) {			define <16 x i8> @test_128_3(i8 * %addr, <16 x i8> %old, <16 x i8> %mask1) {
	; CHECK-LABEL: test_128_3:			; CHECK-LABEL: test_128_3:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqb %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0x75,0x08,0x3f,0xca,0x04]			; CHECK-NEXT: vpcmpneqb %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0x75,0x08,0x3f,0xca,0x04]
	; CHECK-NEXT: vpblendmb (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x66,0x07]			; CHECK-NEXT: vpblendmb (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x66,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <16 x i8> %mask1, zeroinitializer			%mask = icmp ne <16 x i8> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <16 x i8>*			%vaddr = bitcast i8* %addr to <16 x i8>*
	%r = load <16 x i8>, <16 x i8>* %vaddr, align 1			%r = load <16 x i8>, <16 x i8>* %vaddr, align 1
	%res = select <16 x i1> %mask, <16 x i8> %r, <16 x i8> %old			%res = select <16 x i1> %mask, <16 x i8> %r, <16 x i8> %old
	ret <16 x i8>%res			ret <16 x i8>%res
	}			}

	define <16 x i8> @test_128_4(i8 * %addr, <16 x i8> %mask1) {			define <16 x i8> @test_128_4(i8 * %addr, <16 x i8> %mask1) {
	; CHECK-LABEL: test_128_4:			; CHECK-LABEL: test_128_4:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm1, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xef,0xc9]			; CHECK-NEXT: vpxor %xmm1, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqb %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqb %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x3f,0xc9,0x04]
	; CHECK-NEXT: vmovdqu8 (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7f,0x89,0x6f,0x07]			; CHECK-NEXT: vmovdqu8 (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7f,0x89,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <16 x i8> %mask1, zeroinitializer			%mask = icmp ne <16 x i8> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <16 x i8>*			%vaddr = bitcast i8* %addr to <16 x i8>*
	%r = load <16 x i8>, <16 x i8>* %vaddr, align 1			%r = load <16 x i8>, <16 x i8>* %vaddr, align 1
	%res = select <16 x i1> %mask, <16 x i8> %r, <16 x i8> zeroinitializer			%res = select <16 x i1> %mask, <16 x i8> %r, <16 x i8> zeroinitializer
	ret <16 x i8>%res			ret <16 x i8>%res
	}			}

	define <8 x i16> @test_128_5(i8 * %addr) {			define <8 x i16> @test_128_5(i8 * %addr) {
	; CHECK-LABEL: test_128_5:			; CHECK-LABEL: test_128_5:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovdqu16 (%rdi), %xmm0 ## encoding: [0x62,0xf1,0xff,0x08,0x6f,0x07]			; CHECK-NEXT: vmovdqu (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <8 x i16>*			%vaddr = bitcast i8* %addr to <8 x i16>*
	%res = load <8 x i16>, <8 x i16>* %vaddr, align 1			%res = load <8 x i16>, <8 x i16>* %vaddr, align 1
	ret <8 x i16>%res			ret <8 x i16>%res
	}			}

	define void @test_128_6(i8 * %addr, <8 x i16> %data) {			define void @test_128_6(i8 * %addr, <8 x i16> %data) {
	; CHECK-LABEL: test_128_6:			; CHECK-LABEL: test_128_6:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovdqu16 %xmm0, (%rdi) ## encoding: [0x62,0xf1,0xff,0x08,0x7f,0x07]			; CHECK-NEXT: vmovdqu %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <8 x i16>*			%vaddr = bitcast i8* %addr to <8 x i16>*
	store <8 x i16>%data, <8 x i16>* %vaddr, align 1			store <8 x i16>%data, <8 x i16>* %vaddr, align 1
	ret void			ret void
	}			}

	define <8 x i16> @test_128_7(i8 * %addr, <8 x i16> %old, <8 x i16> %mask1) {			define <8 x i16> @test_128_7(i8 * %addr, <8 x i16> %old, <8 x i16> %mask1) {
	; CHECK-LABEL: test_128_7:			; CHECK-LABEL: test_128_7:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqw %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x08,0x3f,0xca,0x04]			; CHECK-NEXT: vpcmpneqw %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x08,0x3f,0xca,0x04]
	; CHECK-NEXT: vpblendmw (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x66,0x07]			; CHECK-NEXT: vpblendmw (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x66,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <8 x i16> %mask1, zeroinitializer			%mask = icmp ne <8 x i16> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <8 x i16>*			%vaddr = bitcast i8* %addr to <8 x i16>*
	%r = load <8 x i16>, <8 x i16>* %vaddr, align 1			%r = load <8 x i16>, <8 x i16>* %vaddr, align 1
	%res = select <8 x i1> %mask, <8 x i16> %r, <8 x i16> %old			%res = select <8 x i1> %mask, <8 x i16> %r, <8 x i16> %old
	ret <8 x i16>%res			ret <8 x i16>%res
	}			}

	define <8 x i16> @test_128_8(i8 * %addr, <8 x i16> %mask1) {			define <8 x i16> @test_128_8(i8 * %addr, <8 x i16> %mask1) {
	; CHECK-LABEL: test_128_8:			; CHECK-LABEL: test_128_8:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm1, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xef,0xc9]			; CHECK-NEXT: vpxor %xmm1, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqw %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqw %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x3f,0xc9,0x04]
	; CHECK-NEXT: vmovdqu16 (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xff,0x89,0x6f,0x07]			; CHECK-NEXT: vmovdqu16 (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xff,0x89,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <8 x i16> %mask1, zeroinitializer			%mask = icmp ne <8 x i16> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <8 x i16>*			%vaddr = bitcast i8* %addr to <8 x i16>*
	%r = load <8 x i16>, <8 x i16>* %vaddr, align 1			%r = load <8 x i16>, <8 x i16>* %vaddr, align 1
	%res = select <8 x i1> %mask, <8 x i16> %r, <8 x i16> zeroinitializer			%res = select <8 x i1> %mask, <8 x i16> %r, <8 x i16> zeroinitializer
	ret <8 x i16>%res			ret <8 x i16>%res
	}			}

llvm/trunk/test/CodeGen/X86/avx512dqvl-intrinsics-upgrade.ll

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512dq -mattr=+avx512vl --show-mc-encoding | FileCheck %s		; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512dq -mattr=+avx512vl --show-mc-encoding | FileCheck %s

define <4 x float> @test_mask_andnot_ps_rr_128(<4 x float> %a, <4 x float> %b) {		define <4 x float> @test_mask_andnot_ps_rr_128(<4 x float> %a, <4 x float> %b) {
; CHECK-LABEL: test_mask_andnot_ps_rr_128:		; CHECK-LABEL: test_mask_andnot_ps_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vandnps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x55,0xc1]		; CHECK-NEXT: vandnps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x55,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_andnot_ps_rrk_128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask) {		define <4 x float> @test_mask_andnot_ps_rrk_128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_ps_rrk_128:		; CHECK-LABEL: test_mask_andnot_ps_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vandnps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x55,0xd1]		; CHECK-NEXT: vandnps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x55,0xd1]
; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc2]		; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_andnot_ps_rrkz_128(<4 x float> %a, <4 x float> %b, i8 %mask) {		define <4 x float> @test_mask_andnot_ps_rrkz_128(<4 x float> %a, <4 x float> %b, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_ps_rrkz_128:		; CHECK-LABEL: test_mask_andnot_ps_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vandnps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x55,0xc1]		; CHECK-NEXT: vandnps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x55,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_andnot_ps_rm_128(<4 x float> %a, <4 x float>* %ptr_b) {		define <4 x float> @test_mask_andnot_ps_rm_128(<4 x float> %a, <4 x float>* %ptr_b) {
; CHECK-LABEL: test_mask_andnot_ps_rm_128:		; CHECK-LABEL: test_mask_andnot_ps_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vandnps (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x55,0x07]		; CHECK-NEXT: vandnps (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x55,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x float>, <4 x float>* %ptr_b		%b = load <4 x float>, <4 x float>* %ptr_b
%res = call <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_andnot_ps_rmk_128(<4 x float> %a, <4 x float>* %ptr_b, <4 x float> %passThru, i8 %mask) {		define <4 x float> @test_mask_andnot_ps_rmk_128(<4 x float> %a, <4 x float>* %ptr_b, <4 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_ps_rmk_128:		; CHECK-LABEL: test_mask_andnot_ps_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vandnps (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x55,0x0f]		; CHECK-NEXT: vandnps (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x55,0x0f]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x float>, <4 x float>* %ptr_b		%b = load <4 x float>, <4 x float>* %ptr_b
%res = call <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_andnot_ps_rmkz_128(<4 x float> %a, <4 x float>* %ptr_b, i8 %mask) {		define <4 x float> @test_mask_andnot_ps_rmkz_128(<4 x float> %a, <4 x float>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_ps_rmkz_128:		; CHECK-LABEL: test_mask_andnot_ps_rmkz_128:
[18 lines not shown]	; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_andnot_ps_rmbk_128(<4 x float> %a, float* %ptr_b, <4 x float> %passThru, i8 %mask) {		define <4 x float> @test_mask_andnot_ps_rmbk_128(<4 x float> %a, float* %ptr_b, <4 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_ps_rmbk_128:		; CHECK-LABEL: test_mask_andnot_ps_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vandnps (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x19,0x55,0x0f]		; CHECK-NEXT: vandnps (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x19,0x55,0x0f]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load float, float* %ptr_b		%q = load float, float* %ptr_b
%vecinit.i = insertelement <4 x float> undef, float %q, i32 0		%vecinit.i = insertelement <4 x float> undef, float %q, i32 0
%b = shufflevector <4 x float> %vecinit.i, <4 x float> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x float> %vecinit.i, <4 x float> undef, <4 x i32> zeroinitializer
%res = call <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

[10 lines not shown]	; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x float> %res		ret <4 x float> %res
}		}

declare <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.andn.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <8 x float> @test_mask_andnot_ps_rr_256(<8 x float> %a, <8 x float> %b) {		define <8 x float> @test_mask_andnot_ps_rr_256(<8 x float> %a, <8 x float> %b) {
; CHECK-LABEL: test_mask_andnot_ps_rr_256:		; CHECK-LABEL: test_mask_andnot_ps_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vandnps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x55,0xc1]		; CHECK-NEXT: vandnps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x55,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.andn.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.andn.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_andnot_ps_rrk_256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask) {		define <8 x float> @test_mask_andnot_ps_rrk_256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_ps_rrk_256:		; CHECK-LABEL: test_mask_andnot_ps_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vandnps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x55,0xd1]		; CHECK-NEXT: vandnps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x55,0xd1]
; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc2]		; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.andn.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.andn.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_andnot_ps_rrkz_256(<8 x float> %a, <8 x float> %b, i8 %mask) {		define <8 x float> @test_mask_andnot_ps_rrkz_256(<8 x float> %a, <8 x float> %b, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_ps_rrkz_256:		; CHECK-LABEL: test_mask_andnot_ps_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vandnps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x55,0xc1]		; CHECK-NEXT: vandnps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x55,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.andn.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.andn.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_andnot_ps_rm_256(<8 x float> %a, <8 x float>* %ptr_b) {		define <8 x float> @test_mask_andnot_ps_rm_256(<8 x float> %a, <8 x float>* %ptr_b) {
; CHECK-LABEL: test_mask_andnot_ps_rm_256:		; CHECK-LABEL: test_mask_andnot_ps_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vandnps (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x55,0x07]		; CHECK-NEXT: vandnps (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x55,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x float>, <8 x float>* %ptr_b		%b = load <8 x float>, <8 x float>* %ptr_b
%res = call <8 x float> @llvm.x86.avx512.mask.andn.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.andn.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_andnot_ps_rmk_256(<8 x float> %a, <8 x float>* %ptr_b, <8 x float> %passThru, i8 %mask) {		define <8 x float> @test_mask_andnot_ps_rmk_256(<8 x float> %a, <8 x float>* %ptr_b, <8 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_ps_rmk_256:		; CHECK-LABEL: test_mask_andnot_ps_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vandnps (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x55,0x0f]		; CHECK-NEXT: vandnps (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x55,0x0f]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x float>, <8 x float>* %ptr_b		%b = load <8 x float>, <8 x float>* %ptr_b
%res = call <8 x float> @llvm.x86.avx512.mask.andn.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.andn.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_andnot_ps_rmkz_256(<8 x float> %a, <8 x float>* %ptr_b, i8 %mask) {		define <8 x float> @test_mask_andnot_ps_rmkz_256(<8 x float> %a, <8 x float>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_ps_rmkz_256:		; CHECK-LABEL: test_mask_andnot_ps_rmkz_256:
[18 lines collapsed]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_andnot_ps_rmbk_256(<8 x float> %a, float* %ptr_b, <8 x float> %passThru, i8 %mask) {		define <8 x float> @test_mask_andnot_ps_rmbk_256(<8 x float> %a, float* %ptr_b, <8 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_ps_rmbk_256:		; CHECK-LABEL: test_mask_andnot_ps_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vandnps (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x39,0x55,0x0f]		; CHECK-NEXT: vandnps (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x39,0x55,0x0f]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load float, float* %ptr_b		%q = load float, float* %ptr_b
%vecinit.i = insertelement <8 x float> undef, float %q, i32 0		%vecinit.i = insertelement <8 x float> undef, float %q, i32 0
%b = shufflevector <8 x float> %vecinit.i, <8 x float> undef, <8 x i32> zeroinitializer		%b = shufflevector <8 x float> %vecinit.i, <8 x float> undef, <8 x i32> zeroinitializer
%res = call <8 x float> @llvm.x86.avx512.mask.andn.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.andn.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}
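
In the checks above, the masked, zero-masked and broadcast forms of vandnps keep their EVEX encoding on both sides of the diff, and only the plain vmovaps copies are compressed. The fourth encoding byte (the last EVEX prefix byte) shows why. The Python sketch below lists the EVEX-only features that byte requests, assuming the standard EVEX bit layout. The helper name evex_only_features is hypothetical and not part of the patch.

```python
def evex_only_features(p2):
    """List EVEX-only features requested by the last EVEX prefix byte."""
    reasons = []
    mask_reg = p2 & 0x07            # aaa: opmask register, 0 means none
    if mask_reg:
        reasons.append("opmask k%d" % mask_reg)
    if p2 & 0x80:                   # z: zeroing-masking
        reasons.append("zeroing {z}")
    if p2 & 0x10:                   # b: broadcast (memory) or rounding/SAE
        reasons.append("broadcast or rounding control")
    if ((p2 >> 5) & 0x3) == 2:      # L'L: 10 selects 512-bit vectors
        reasons.append("512-bit vector length")
    if not (p2 & 0x08):             # V' is stored inverted
        reasons.append("register 16-31 via V'")
    return reasons


# Fourth encoding bytes taken from the CHECK lines in this test file.
for byte in (0x28, 0x29, 0xA9, 0x39):
    print(hex(byte), evex_only_features(byte))
```

An empty list means the prefix byte alone does not rule out a VEX form. The instruction must still have a VEX opcode and use only registers 0 to 15 in its other operands.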

[114 lines collapsed]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <16 x float> %res		ret <16 x float> %res
}		}

declare <16 x float> @llvm.x86.avx512.mask.andn.ps.512(<16 x float>, <16 x float>, <16 x float>, i16)		declare <16 x float> @llvm.x86.avx512.mask.andn.ps.512(<16 x float>, <16 x float>, <16 x float>, i16)

define <4 x float> @test_mask_and_ps_rr_128(<4 x float> %a, <4 x float> %b) {		define <4 x float> @test_mask_and_ps_rr_128(<4 x float> %a, <4 x float> %b) {
; CHECK-LABEL: test_mask_and_ps_rr_128:		; CHECK-LABEL: test_mask_and_ps_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vandps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x54,0xc1]		; CHECK-NEXT: vandps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x54,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_and_ps_rrk_128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask) {		define <4 x float> @test_mask_and_ps_rrk_128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_and_ps_rrk_128:		; CHECK-LABEL: test_mask_and_ps_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vandps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x54,0xd1]		; CHECK-NEXT: vandps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x54,0xd1]
; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc2]		; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_and_ps_rrkz_128(<4 x float> %a, <4 x float> %b, i8 %mask) {		define <4 x float> @test_mask_and_ps_rrkz_128(<4 x float> %a, <4 x float> %b, i8 %mask) {
; CHECK-LABEL: test_mask_and_ps_rrkz_128:		; CHECK-LABEL: test_mask_and_ps_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vandps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x54,0xc1]		; CHECK-NEXT: vandps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x54,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_and_ps_rm_128(<4 x float> %a, <4 x float>* %ptr_b) {		define <4 x float> @test_mask_and_ps_rm_128(<4 x float> %a, <4 x float>* %ptr_b) {
; CHECK-LABEL: test_mask_and_ps_rm_128:		; CHECK-LABEL: test_mask_and_ps_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vandps (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x54,0x07]		; CHECK-NEXT: vandps (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x54,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x float>, <4 x float>* %ptr_b		%b = load <4 x float>, <4 x float>* %ptr_b
%res = call <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_and_ps_rmk_128(<4 x float> %a, <4 x float>* %ptr_b, <4 x float> %passThru, i8 %mask) {		define <4 x float> @test_mask_and_ps_rmk_128(<4 x float> %a, <4 x float>* %ptr_b, <4 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_and_ps_rmk_128:		; CHECK-LABEL: test_mask_and_ps_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vandps (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x54,0x0f]		; CHECK-NEXT: vandps (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x54,0x0f]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x float>, <4 x float>* %ptr_b		%b = load <4 x float>, <4 x float>* %ptr_b
%res = call <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_and_ps_rmkz_128(<4 x float> %a, <4 x float>* %ptr_b, i8 %mask) {		define <4 x float> @test_mask_and_ps_rmkz_128(<4 x float> %a, <4 x float>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_and_ps_rmkz_128:		; CHECK-LABEL: test_mask_and_ps_rmkz_128:
[18 lines collapsed]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_and_ps_rmbk_128(<4 x float> %a, float* %ptr_b, <4 x float> %passThru, i8 %mask) {		define <4 x float> @test_mask_and_ps_rmbk_128(<4 x float> %a, float* %ptr_b, <4 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_and_ps_rmbk_128:		; CHECK-LABEL: test_mask_and_ps_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vandps (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x19,0x54,0x0f]		; CHECK-NEXT: vandps (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x19,0x54,0x0f]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load float, float* %ptr_b		%q = load float, float* %ptr_b
%vecinit.i = insertelement <4 x float> undef, float %q, i32 0		%vecinit.i = insertelement <4 x float> undef, float %q, i32 0
%b = shufflevector <4 x float> %vecinit.i, <4 x float> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x float> %vecinit.i, <4 x float> undef, <4 x i32> zeroinitializer
%res = call <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

[10 lines collapsed]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x float> %res		ret <4 x float> %res
}		}

declare <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.and.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <8 x float> @test_mask_and_ps_rr_256(<8 x float> %a, <8 x float> %b) {		define <8 x float> @test_mask_and_ps_rr_256(<8 x float> %a, <8 x float> %b) {
; CHECK-LABEL: test_mask_and_ps_rr_256:		; CHECK-LABEL: test_mask_and_ps_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vandps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x54,0xc1]		; CHECK-NEXT: vandps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x54,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.and.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.and.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_and_ps_rrk_256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask) {		define <8 x float> @test_mask_and_ps_rrk_256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_and_ps_rrk_256:		; CHECK-LABEL: test_mask_and_ps_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vandps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x54,0xd1]		; CHECK-NEXT: vandps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x54,0xd1]
; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc2]		; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.and.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.and.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_and_ps_rrkz_256(<8 x float> %a, <8 x float> %b, i8 %mask) {		define <8 x float> @test_mask_and_ps_rrkz_256(<8 x float> %a, <8 x float> %b, i8 %mask) {
; CHECK-LABEL: test_mask_and_ps_rrkz_256:		; CHECK-LABEL: test_mask_and_ps_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vandps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x54,0xc1]		; CHECK-NEXT: vandps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x54,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.and.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.and.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_and_ps_rm_256(<8 x float> %a, <8 x float>* %ptr_b) {		define <8 x float> @test_mask_and_ps_rm_256(<8 x float> %a, <8 x float>* %ptr_b) {
; CHECK-LABEL: test_mask_and_ps_rm_256:		; CHECK-LABEL: test_mask_and_ps_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vandps (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x54,0x07]		; CHECK-NEXT: vandps (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x54,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x float>, <8 x float>* %ptr_b		%b = load <8 x float>, <8 x float>* %ptr_b
%res = call <8 x float> @llvm.x86.avx512.mask.and.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.and.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_and_ps_rmk_256(<8 x float> %a, <8 x float>* %ptr_b, <8 x float> %passThru, i8 %mask) {		define <8 x float> @test_mask_and_ps_rmk_256(<8 x float> %a, <8 x float>* %ptr_b, <8 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_and_ps_rmk_256:		; CHECK-LABEL: test_mask_and_ps_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vandps (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x54,0x0f]		; CHECK-NEXT: vandps (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x54,0x0f]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x float>, <8 x float>* %ptr_b		%b = load <8 x float>, <8 x float>* %ptr_b
%res = call <8 x float> @llvm.x86.avx512.mask.and.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.and.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_and_ps_rmkz_256(<8 x float> %a, <8 x float>* %ptr_b, i8 %mask) {		define <8 x float> @test_mask_and_ps_rmkz_256(<8 x float> %a, <8 x float>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_and_ps_rmkz_256:		; CHECK-LABEL: test_mask_and_ps_rmkz_256:
[18 lines collapsed]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_and_ps_rmbk_256(<8 x float> %a, float* %ptr_b, <8 x float> %passThru, i8 %mask) {		define <8 x float> @test_mask_and_ps_rmbk_256(<8 x float> %a, float* %ptr_b, <8 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_and_ps_rmbk_256:		; CHECK-LABEL: test_mask_and_ps_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vandps (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x39,0x54,0x0f]		; CHECK-NEXT: vandps (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x39,0x54,0x0f]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load float, float* %ptr_b		%q = load float, float* %ptr_b
%vecinit.i = insertelement <8 x float> undef, float %q, i32 0		%vecinit.i = insertelement <8 x float> undef, float %q, i32 0
%b = shufflevector <8 x float> %vecinit.i, <8 x float> undef, <8 x i32> zeroinitializer		%b = shufflevector <8 x float> %vecinit.i, <8 x float> undef, <8 x i32> zeroinitializer
%res = call <8 x float> @llvm.x86.avx512.mask.and.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.and.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

[114 lines collapsed]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <16 x float> %res		ret <16 x float> %res
}		}

declare <16 x float> @llvm.x86.avx512.mask.and.ps.512(<16 x float>, <16 x float>, <16 x float>, i16)		declare <16 x float> @llvm.x86.avx512.mask.and.ps.512(<16 x float>, <16 x float>, <16 x float>, i16)

define <4 x float> @test_mask_or_ps_rr_128(<4 x float> %a, <4 x float> %b) {		define <4 x float> @test_mask_or_ps_rr_128(<4 x float> %a, <4 x float> %b) {
; CHECK-LABEL: test_mask_or_ps_rr_128:		; CHECK-LABEL: test_mask_or_ps_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vorps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x56,0xc1]		; CHECK-NEXT: vorps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x56,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_or_ps_rrk_128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask) {		define <4 x float> @test_mask_or_ps_rrk_128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_or_ps_rrk_128:		; CHECK-LABEL: test_mask_or_ps_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vorps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x56,0xd1]		; CHECK-NEXT: vorps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x56,0xd1]
; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc2]		; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_or_ps_rrkz_128(<4 x float> %a, <4 x float> %b, i8 %mask) {		define <4 x float> @test_mask_or_ps_rrkz_128(<4 x float> %a, <4 x float> %b, i8 %mask) {
; CHECK-LABEL: test_mask_or_ps_rrkz_128:		; CHECK-LABEL: test_mask_or_ps_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vorps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x56,0xc1]		; CHECK-NEXT: vorps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x56,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_or_ps_rm_128(<4 x float> %a, <4 x float>* %ptr_b) {		define <4 x float> @test_mask_or_ps_rm_128(<4 x float> %a, <4 x float>* %ptr_b) {
; CHECK-LABEL: test_mask_or_ps_rm_128:		; CHECK-LABEL: test_mask_or_ps_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vorps (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x56,0x07]		; CHECK-NEXT: vorps (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x56,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x float>, <4 x float>* %ptr_b		%b = load <4 x float>, <4 x float>* %ptr_b
%res = call <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_or_ps_rmk_128(<4 x float> %a, <4 x float>* %ptr_b, <4 x float> %passThru, i8 %mask) {		define <4 x float> @test_mask_or_ps_rmk_128(<4 x float> %a, <4 x float>* %ptr_b, <4 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_or_ps_rmk_128:		; CHECK-LABEL: test_mask_or_ps_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vorps (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x56,0x0f]		; CHECK-NEXT: vorps (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x56,0x0f]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x float>, <4 x float>* %ptr_b		%b = load <4 x float>, <4 x float>* %ptr_b
%res = call <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_or_ps_rmkz_128(<4 x float> %a, <4 x float>* %ptr_b, i8 %mask) {		define <4 x float> @test_mask_or_ps_rmkz_128(<4 x float> %a, <4 x float>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_or_ps_rmkz_128:		; CHECK-LABEL: test_mask_or_ps_rmkz_128:
[18 lines collapsed]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_or_ps_rmbk_128(<4 x float> %a, float* %ptr_b, <4 x float> %passThru, i8 %mask) {		define <4 x float> @test_mask_or_ps_rmbk_128(<4 x float> %a, float* %ptr_b, <4 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_or_ps_rmbk_128:		; CHECK-LABEL: test_mask_or_ps_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vorps (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x19,0x56,0x0f]		; CHECK-NEXT: vorps (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x19,0x56,0x0f]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load float, float* %ptr_b		%q = load float, float* %ptr_b
%vecinit.i = insertelement <4 x float> undef, float %q, i32 0		%vecinit.i = insertelement <4 x float> undef, float %q, i32 0
%b = shufflevector <4 x float> %vecinit.i, <4 x float> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x float> %vecinit.i, <4 x float> undef, <4 x i32> zeroinitializer
%res = call <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

[10 lines collapsed]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x float> %res		ret <4 x float> %res
}		}

declare <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.or.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <8 x float> @test_mask_or_ps_rr_256(<8 x float> %a, <8 x float> %b) {		define <8 x float> @test_mask_or_ps_rr_256(<8 x float> %a, <8 x float> %b) {
; CHECK-LABEL: test_mask_or_ps_rr_256:		; CHECK-LABEL: test_mask_or_ps_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vorps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x56,0xc1]		; CHECK-NEXT: vorps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x56,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.or.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.or.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_or_ps_rrk_256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask) {		define <8 x float> @test_mask_or_ps_rrk_256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_or_ps_rrk_256:		; CHECK-LABEL: test_mask_or_ps_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vorps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x56,0xd1]		; CHECK-NEXT: vorps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x56,0xd1]
; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc2]		; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.or.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.or.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_or_ps_rrkz_256(<8 x float> %a, <8 x float> %b, i8 %mask) {		define <8 x float> @test_mask_or_ps_rrkz_256(<8 x float> %a, <8 x float> %b, i8 %mask) {
; CHECK-LABEL: test_mask_or_ps_rrkz_256:		; CHECK-LABEL: test_mask_or_ps_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vorps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x56,0xc1]		; CHECK-NEXT: vorps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x56,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.or.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.or.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_or_ps_rm_256(<8 x float> %a, <8 x float>* %ptr_b) {		define <8 x float> @test_mask_or_ps_rm_256(<8 x float> %a, <8 x float>* %ptr_b) {
; CHECK-LABEL: test_mask_or_ps_rm_256:		; CHECK-LABEL: test_mask_or_ps_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vorps (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x56,0x07]		; CHECK-NEXT: vorps (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x56,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x float>, <8 x float>* %ptr_b		%b = load <8 x float>, <8 x float>* %ptr_b
%res = call <8 x float> @llvm.x86.avx512.mask.or.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.or.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_or_ps_rmk_256(<8 x float> %a, <8 x float>* %ptr_b, <8 x float> %passThru, i8 %mask) {		define <8 x float> @test_mask_or_ps_rmk_256(<8 x float> %a, <8 x float>* %ptr_b, <8 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_or_ps_rmk_256:		; CHECK-LABEL: test_mask_or_ps_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vorps (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x56,0x0f]		; CHECK-NEXT: vorps (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x56,0x0f]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x float>, <8 x float>* %ptr_b		%b = load <8 x float>, <8 x float>* %ptr_b
%res = call <8 x float> @llvm.x86.avx512.mask.or.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.or.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_or_ps_rmkz_256(<8 x float> %a, <8 x float>* %ptr_b, i8 %mask) {		define <8 x float> @test_mask_or_ps_rmkz_256(<8 x float> %a, <8 x float>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_or_ps_rmkz_256:		; CHECK-LABEL: test_mask_or_ps_rmkz_256:
; [18 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_or_ps_rmbk_256(<8 x float> %a, float* %ptr_b, <8 x float> %passThru, i8 %mask) {		define <8 x float> @test_mask_or_ps_rmbk_256(<8 x float> %a, float* %ptr_b, <8 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_or_ps_rmbk_256:		; CHECK-LABEL: test_mask_or_ps_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vorps (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x39,0x56,0x0f]		; CHECK-NEXT: vorps (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x39,0x56,0x0f]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load float, float* %ptr_b		%q = load float, float* %ptr_b
%vecinit.i = insertelement <8 x float> undef, float %q, i32 0		%vecinit.i = insertelement <8 x float> undef, float %q, i32 0
%b = shufflevector <8 x float> %vecinit.i, <8 x float> undef, <8 x i32> zeroinitializer		%b = shufflevector <8 x float> %vecinit.i, <8 x float> undef, <8 x i32> zeroinitializer
%res = call <8 x float> @llvm.x86.avx512.mask.or.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.or.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

; [114 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <16 x float> %res		ret <16 x float> %res
}		}

declare <16 x float> @llvm.x86.avx512.mask.or.ps.512(<16 x float>, <16 x float>, <16 x float>, i16)		declare <16 x float> @llvm.x86.avx512.mask.or.ps.512(<16 x float>, <16 x float>, <16 x float>, i16)

define <4 x float> @test_mask_xor_ps_rr_128(<4 x float> %a, <4 x float> %b) {		define <4 x float> @test_mask_xor_ps_rr_128(<4 x float> %a, <4 x float> %b) {
; CHECK-LABEL: test_mask_xor_ps_rr_128:		; CHECK-LABEL: test_mask_xor_ps_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vxorps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x57,0xc1]		; CHECK-NEXT: vxorps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x57,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_xor_ps_rrk_128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask) {		define <4 x float> @test_mask_xor_ps_rrk_128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_xor_ps_rrk_128:		; CHECK-LABEL: test_mask_xor_ps_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vxorps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x57,0xd1]		; CHECK-NEXT: vxorps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x57,0xd1]
; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc2]		; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_xor_ps_rrkz_128(<4 x float> %a, <4 x float> %b, i8 %mask) {		define <4 x float> @test_mask_xor_ps_rrkz_128(<4 x float> %a, <4 x float> %b, i8 %mask) {
; CHECK-LABEL: test_mask_xor_ps_rrkz_128:		; CHECK-LABEL: test_mask_xor_ps_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vxorps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x57,0xc1]		; CHECK-NEXT: vxorps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x57,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_xor_ps_rm_128(<4 x float> %a, <4 x float>* %ptr_b) {		define <4 x float> @test_mask_xor_ps_rm_128(<4 x float> %a, <4 x float>* %ptr_b) {
; CHECK-LABEL: test_mask_xor_ps_rm_128:		; CHECK-LABEL: test_mask_xor_ps_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vxorps (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x57,0x07]		; CHECK-NEXT: vxorps (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x57,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x float>, <4 x float>* %ptr_b		%b = load <4 x float>, <4 x float>* %ptr_b
%res = call <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_xor_ps_rmk_128(<4 x float> %a, <4 x float>* %ptr_b, <4 x float> %passThru, i8 %mask) {		define <4 x float> @test_mask_xor_ps_rmk_128(<4 x float> %a, <4 x float>* %ptr_b, <4 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_xor_ps_rmk_128:		; CHECK-LABEL: test_mask_xor_ps_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vxorps (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x57,0x0f]		; CHECK-NEXT: vxorps (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x57,0x0f]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x float>, <4 x float>* %ptr_b		%b = load <4 x float>, <4 x float>* %ptr_b
%res = call <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_xor_ps_rmkz_128(<4 x float> %a, <4 x float>* %ptr_b, i8 %mask) {		define <4 x float> @test_mask_xor_ps_rmkz_128(<4 x float> %a, <4 x float>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_xor_ps_rmkz_128:		; CHECK-LABEL: test_mask_xor_ps_rmkz_128:
; [18 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mask_xor_ps_rmbk_128(<4 x float> %a, float* %ptr_b, <4 x float> %passThru, i8 %mask) {		define <4 x float> @test_mask_xor_ps_rmbk_128(<4 x float> %a, float* %ptr_b, <4 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_xor_ps_rmbk_128:		; CHECK-LABEL: test_mask_xor_ps_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vxorps (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x19,0x57,0x0f]		; CHECK-NEXT: vxorps (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x19,0x57,0x0f]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load float, float* %ptr_b		%q = load float, float* %ptr_b
%vecinit.i = insertelement <4 x float> undef, float %q, i32 0		%vecinit.i = insertelement <4 x float> undef, float %q, i32 0
%b = shufflevector <4 x float> %vecinit.i, <4 x float> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x float> %vecinit.i, <4 x float> undef, <4 x i32> zeroinitializer
%res = call <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float> %a, <4 x float> %b, <4 x float> %passThru, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

; [10 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x float> %res		ret <4 x float> %res
}		}

declare <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.xor.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <8 x float> @test_mask_xor_ps_rr_256(<8 x float> %a, <8 x float> %b) {		define <8 x float> @test_mask_xor_ps_rr_256(<8 x float> %a, <8 x float> %b) {
; CHECK-LABEL: test_mask_xor_ps_rr_256:		; CHECK-LABEL: test_mask_xor_ps_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vxorps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x57,0xc1]		; CHECK-NEXT: vxorps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x57,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.xor.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.xor.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_xor_ps_rrk_256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask) {		define <8 x float> @test_mask_xor_ps_rrk_256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_xor_ps_rrk_256:		; CHECK-LABEL: test_mask_xor_ps_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vxorps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x57,0xd1]		; CHECK-NEXT: vxorps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x57,0xd1]
; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc2]		; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.xor.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.xor.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_xor_ps_rrkz_256(<8 x float> %a, <8 x float> %b, i8 %mask) {		define <8 x float> @test_mask_xor_ps_rrkz_256(<8 x float> %a, <8 x float> %b, i8 %mask) {
; CHECK-LABEL: test_mask_xor_ps_rrkz_256:		; CHECK-LABEL: test_mask_xor_ps_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vxorps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x57,0xc1]		; CHECK-NEXT: vxorps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x57,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.xor.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.xor.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_xor_ps_rm_256(<8 x float> %a, <8 x float>* %ptr_b) {		define <8 x float> @test_mask_xor_ps_rm_256(<8 x float> %a, <8 x float>* %ptr_b) {
; CHECK-LABEL: test_mask_xor_ps_rm_256:		; CHECK-LABEL: test_mask_xor_ps_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vxorps (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x57,0x07]		; CHECK-NEXT: vxorps (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x57,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x float>, <8 x float>* %ptr_b		%b = load <8 x float>, <8 x float>* %ptr_b
%res = call <8 x float> @llvm.x86.avx512.mask.xor.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.xor.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_xor_ps_rmk_256(<8 x float> %a, <8 x float>* %ptr_b, <8 x float> %passThru, i8 %mask) {		define <8 x float> @test_mask_xor_ps_rmk_256(<8 x float> %a, <8 x float>* %ptr_b, <8 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_xor_ps_rmk_256:		; CHECK-LABEL: test_mask_xor_ps_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vxorps (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x57,0x0f]		; CHECK-NEXT: vxorps (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x57,0x0f]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x float>, <8 x float>* %ptr_b		%b = load <8 x float>, <8 x float>* %ptr_b
%res = call <8 x float> @llvm.x86.avx512.mask.xor.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.xor.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_xor_ps_rmkz_256(<8 x float> %a, <8 x float>* %ptr_b, i8 %mask) {		define <8 x float> @test_mask_xor_ps_rmkz_256(<8 x float> %a, <8 x float>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_xor_ps_rmkz_256:		; CHECK-LABEL: test_mask_xor_ps_rmkz_256:
; [18 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mask_xor_ps_rmbk_256(<8 x float> %a, float* %ptr_b, <8 x float> %passThru, i8 %mask) {		define <8 x float> @test_mask_xor_ps_rmbk_256(<8 x float> %a, float* %ptr_b, <8 x float> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_xor_ps_rmbk_256:		; CHECK-LABEL: test_mask_xor_ps_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vxorps (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x39,0x57,0x0f]		; CHECK-NEXT: vxorps (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x39,0x57,0x0f]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load float, float* %ptr_b		%q = load float, float* %ptr_b
%vecinit.i = insertelement <8 x float> undef, float %q, i32 0		%vecinit.i = insertelement <8 x float> undef, float %q, i32 0
%b = shufflevector <8 x float> %vecinit.i, <8 x float> undef, <8 x i32> zeroinitializer		%b = shufflevector <8 x float> %vecinit.i, <8 x float> undef, <8 x i32> zeroinitializer
%res = call <8 x float> @llvm.x86.avx512.mask.xor.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.xor.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %passThru, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

; [228 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <4 x i64> @test_mask_mullo_epi64_rrk_256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask) {		define <4 x i64> @test_mask_mullo_epi64_rrk_256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi64_rrk_256:		; CHECK-LABEL: test_mask_mullo_epi64_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vpmullq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x40,0xd1]		; CHECK-NEXT: vpmullq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x40,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pmull.q.256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask)		%res = call <4 x i64> @llvm.x86.avx512.mask.pmull.q.256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask)
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <4 x i64> @test_mask_mullo_epi64_rrkz_256(<4 x i64> %a, <4 x i64> %b, i8 %mask) {		define <4 x i64> @test_mask_mullo_epi64_rrkz_256(<4 x i64> %a, <4 x i64> %b, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi64_rrkz_256:		; CHECK-LABEL: test_mask_mullo_epi64_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; [14 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <4 x i64> @test_mask_mullo_epi64_rmk_256(<4 x i64> %a, <4 x i64>* %ptr_b, <4 x i64> %passThru, i8 %mask) {		define <4 x i64> @test_mask_mullo_epi64_rmk_256(<4 x i64> %a, <4 x i64>* %ptr_b, <4 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi64_rmk_256:		; CHECK-LABEL: test_mask_mullo_epi64_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vpmullq (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x40,0x0f]		; CHECK-NEXT: vpmullq (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x40,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i64>, <4 x i64>* %ptr_b		%b = load <4 x i64>, <4 x i64>* %ptr_b
%res = call <4 x i64> @llvm.x86.avx512.mask.pmull.q.256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask)		%res = call <4 x i64> @llvm.x86.avx512.mask.pmull.q.256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask)
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <4 x i64> @test_mask_mullo_epi64_rmkz_256(<4 x i64> %a, <4 x i64>* %ptr_b, i8 %mask) {		define <4 x i64> @test_mask_mullo_epi64_rmkz_256(<4 x i64> %a, <4 x i64>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi64_rmkz_256:		; CHECK-LABEL: test_mask_mullo_epi64_rmkz_256:
; [18 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <4 x i64> @test_mask_mullo_epi64_rmbk_256(<4 x i64> %a, i64* %ptr_b, <4 x i64> %passThru, i8 %mask) {		define <4 x i64> @test_mask_mullo_epi64_rmbk_256(<4 x i64> %a, i64* %ptr_b, <4 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi64_rmbk_256:		; CHECK-LABEL: test_mask_mullo_epi64_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vpmullq (%rdi){1to4}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x39,0x40,0x0f]		; CHECK-NEXT: vpmullq (%rdi){1to4}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x39,0x40,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i64, i64* %ptr_b		%q = load i64, i64* %ptr_b
%vecinit.i = insertelement <4 x i64> undef, i64 %q, i32 0		%vecinit.i = insertelement <4 x i64> undef, i64 %q, i32 0
%b = shufflevector <4 x i64> %vecinit.i, <4 x i64> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x i64> %vecinit.i, <4 x i64> undef, <4 x i32> zeroinitializer
%res = call <4 x i64> @llvm.x86.avx512.mask.pmull.q.256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask)		%res = call <4 x i64> @llvm.x86.avx512.mask.pmull.q.256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask)
ret <4 x i64> %res		ret <4 x i64> %res
}		}

; [21 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <2 x i64> %res		ret <2 x i64> %res
}		}

define <2 x i64> @test_mask_mullo_epi64_rrk_128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask) {		define <2 x i64> @test_mask_mullo_epi64_rrk_128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi64_rrk_128:		; CHECK-LABEL: test_mask_mullo_epi64_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]		; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
; CHECK-NEXT: vpmullq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x40,0xd1]		; CHECK-NEXT: vpmullq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x40,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pmull.q.128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask)		%res = call <2 x i64> @llvm.x86.avx512.mask.pmull.q.128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask)
ret <2 x i64> %res		ret <2 x i64> %res
}		}

define <2 x i64> @test_mask_mullo_epi64_rrkz_128(<2 x i64> %a, <2 x i64> %b, i8 %mask) {		define <2 x i64> @test_mask_mullo_epi64_rrkz_128(<2 x i64> %a, <2 x i64> %b, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi64_rrkz_128:		; CHECK-LABEL: test_mask_mullo_epi64_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; [14 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <2 x i64> %res		ret <2 x i64> %res
}		}

define <2 x i64> @test_mask_mullo_epi64_rmk_128(<2 x i64> %a, <2 x i64>* %ptr_b, <2 x i64> %passThru, i8 %mask) {		define <2 x i64> @test_mask_mullo_epi64_rmk_128(<2 x i64> %a, <2 x i64>* %ptr_b, <2 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi64_rmk_128:		; CHECK-LABEL: test_mask_mullo_epi64_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vpmullq (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x40,0x0f]		; CHECK-NEXT: vpmullq (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x40,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <2 x i64>, <2 x i64>* %ptr_b		%b = load <2 x i64>, <2 x i64>* %ptr_b
%res = call <2 x i64> @llvm.x86.avx512.mask.pmull.q.128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask)		%res = call <2 x i64> @llvm.x86.avx512.mask.pmull.q.128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask)
ret <2 x i64> %res		ret <2 x i64> %res
}		}

define <2 x i64> @test_mask_mullo_epi64_rmkz_128(<2 x i64> %a, <2 x i64>* %ptr_b, i8 %mask) {		define <2 x i64> @test_mask_mullo_epi64_rmkz_128(<2 x i64> %a, <2 x i64>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi64_rmkz_128:		; CHECK-LABEL: test_mask_mullo_epi64_rmkz_128:
; [18 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
ret <2 x i64> %res		ret <2 x i64> %res
}		}

define <2 x i64> @test_mask_mullo_epi64_rmbk_128(<2 x i64> %a, i64* %ptr_b, <2 x i64> %passThru, i8 %mask) {		define <2 x i64> @test_mask_mullo_epi64_rmbk_128(<2 x i64> %a, i64* %ptr_b, <2 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mullo_epi64_rmbk_128:		; CHECK-LABEL: test_mask_mullo_epi64_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]		; CHECK-NEXT: kmovb %esi, %k1 ## encoding: [0xc5,0xf9,0x92,0xce]
; CHECK-NEXT: vpmullq (%rdi){1to2}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x19,0x40,0x0f]		; CHECK-NEXT: vpmullq (%rdi){1to2}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x19,0x40,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i64, i64* %ptr_b		%q = load i64, i64* %ptr_b
%vecinit.i = insertelement <2 x i64> undef, i64 %q, i32 0		%vecinit.i = insertelement <2 x i64> undef, i64 %q, i32 0
%b = shufflevector <2 x i64> %vecinit.i, <2 x i64> undef, <2 x i32> zeroinitializer		%b = shufflevector <2 x i64> %vecinit.i, <2 x i64> undef, <2 x i32> zeroinitializer
%res = call <2 x i64> @llvm.x86.avx512.mask.pmull.q.128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask)		%res = call <2 x i64> @llvm.x86.avx512.mask.pmull.q.128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask)
ret <2 x i64> %res		ret <2 x i64> %res
}		}

; [15 lines not shown]

llvm/trunk/test/CodeGen/X86/avx512dqvl-intrinsics.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512dq -mattr=+avx512vl --show-mc-encoding| FileCheck %s			; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512dq -mattr=+avx512vl --show-mc-encoding| FileCheck %s

	declare <2 x i64> @llvm.x86.avx512.mask.cvtpd2qq.128(<2 x double>, <2 x i64>, i8)			declare <2 x i64> @llvm.x86.avx512.mask.cvtpd2qq.128(<2 x double>, <2 x i64>, i8)

	define <2 x i64>@test_int_x86_avx512_mask_cvt_pd2qq_128(<2 x double> %x0, <2 x i64> %x1, i8 %x2) {			define <2 x i64>@test_int_x86_avx512_mask_cvt_pd2qq_128(<2 x double> %x0, <2 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2qq_128:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2qq_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtpd2qq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x7b,0xc8]			; CHECK-NEXT: vcvtpd2qq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x7b,0xc8]
	; CHECK-NEXT: vcvtpd2qq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x7b,0xc0]			; CHECK-NEXT: vcvtpd2qq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x7b,0xc0]
	; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]			; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.avx512.mask.cvtpd2qq.128(<2 x double> %x0, <2 x i64> %x1, i8 %x2)			%res = call <2 x i64> @llvm.x86.avx512.mask.cvtpd2qq.128(<2 x double> %x0, <2 x i64> %x1, i8 %x2)
	%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvtpd2qq.128(<2 x double> %x0, <2 x i64> %x1, i8 -1)			%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvtpd2qq.128(<2 x double> %x0, <2 x i64> %x1, i8 -1)
	%res2 = add <2 x i64> %res, %res1			%res2 = add <2 x i64> %res, %res1
	ret <2 x i64> %res2			ret <2 x i64> %res2
	}			}

	declare <4 x i64> @llvm.x86.avx512.mask.cvtpd2qq.256(<4 x double>, <4 x i64>, i8)			declare <4 x i64> @llvm.x86.avx512.mask.cvtpd2qq.256(<4 x double>, <4 x i64>, i8)

	define <4 x i64>@test_int_x86_avx512_mask_cvt_pd2qq_256(<4 x double> %x0, <4 x i64> %x1, i8 %x2) {			define <4 x i64>@test_int_x86_avx512_mask_cvt_pd2qq_256(<4 x double> %x0, <4 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2qq_256:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2qq_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtpd2qq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x7b,0xc8]			; CHECK-NEXT: vcvtpd2qq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x7b,0xc8]
	; CHECK-NEXT: vcvtpd2qq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x7b,0xc0]			; CHECK-NEXT: vcvtpd2qq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x7b,0xc0]
	; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]			; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx512.mask.cvtpd2qq.256(<4 x double> %x0, <4 x i64> %x1, i8 %x2)			%res = call <4 x i64> @llvm.x86.avx512.mask.cvtpd2qq.256(<4 x double> %x0, <4 x i64> %x1, i8 %x2)
	%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvtpd2qq.256(<4 x double> %x0, <4 x i64> %x1, i8 -1)			%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvtpd2qq.256(<4 x double> %x0, <4 x i64> %x1, i8 -1)
	%res2 = add <4 x i64> %res, %res1			%res2 = add <4 x i64> %res, %res1
	ret <4 x i64> %res2			ret <4 x i64> %res2
	}			}

	declare <2 x i64> @llvm.x86.avx512.mask.cvtpd2uqq.128(<2 x double>, <2 x i64>, i8)			declare <2 x i64> @llvm.x86.avx512.mask.cvtpd2uqq.128(<2 x double>, <2 x i64>, i8)

	define <2 x i64>@test_int_x86_avx512_mask_cvt_pd2uqq_128(<2 x double> %x0, <2 x i64> %x1, i8 %x2) {			define <2 x i64>@test_int_x86_avx512_mask_cvt_pd2uqq_128(<2 x double> %x0, <2 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2uqq_128:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2uqq_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtpd2uqq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x79,0xc8]			; CHECK-NEXT: vcvtpd2uqq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x79,0xc8]
	; CHECK-NEXT: vcvtpd2uqq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x79,0xc0]			; CHECK-NEXT: vcvtpd2uqq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x79,0xc0]
	; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]			; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.avx512.mask.cvtpd2uqq.128(<2 x double> %x0, <2 x i64> %x1, i8 %x2)			%res = call <2 x i64> @llvm.x86.avx512.mask.cvtpd2uqq.128(<2 x double> %x0, <2 x i64> %x1, i8 %x2)
	%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvtpd2uqq.128(<2 x double> %x0, <2 x i64> %x1, i8 -1)			%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvtpd2uqq.128(<2 x double> %x0, <2 x i64> %x1, i8 -1)
	%res2 = add <2 x i64> %res, %res1			%res2 = add <2 x i64> %res, %res1
	ret <2 x i64> %res2			ret <2 x i64> %res2
	}			}

	declare <4 x i64> @llvm.x86.avx512.mask.cvtpd2uqq.256(<4 x double>, <4 x i64>, i8)			declare <4 x i64> @llvm.x86.avx512.mask.cvtpd2uqq.256(<4 x double>, <4 x i64>, i8)

	define <4 x i64>@test_int_x86_avx512_mask_cvt_pd2uqq_256(<4 x double> %x0, <4 x i64> %x1, i8 %x2) {			define <4 x i64>@test_int_x86_avx512_mask_cvt_pd2uqq_256(<4 x double> %x0, <4 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2uqq_256:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2uqq_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtpd2uqq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x79,0xc8]			; CHECK-NEXT: vcvtpd2uqq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x79,0xc8]
	; CHECK-NEXT: vcvtpd2uqq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x79,0xc0]			; CHECK-NEXT: vcvtpd2uqq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x79,0xc0]
	; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]			; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx512.mask.cvtpd2uqq.256(<4 x double> %x0, <4 x i64> %x1, i8 %x2)			%res = call <4 x i64> @llvm.x86.avx512.mask.cvtpd2uqq.256(<4 x double> %x0, <4 x i64> %x1, i8 %x2)
	%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvtpd2uqq.256(<4 x double> %x0, <4 x i64> %x1, i8 -1)			%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvtpd2uqq.256(<4 x double> %x0, <4 x i64> %x1, i8 -1)
	%res2 = add <4 x i64> %res, %res1			%res2 = add <4 x i64> %res, %res1
	ret <4 x i64> %res2			ret <4 x i64> %res2
	}			}

	declare <2 x i64> @llvm.x86.avx512.mask.cvtps2qq.128(<4 x float>, <2 x i64>, i8)			declare <2 x i64> @llvm.x86.avx512.mask.cvtps2qq.128(<4 x float>, <2 x i64>, i8)

	define <2 x i64>@test_int_x86_avx512_mask_cvt_ps2qq_128(<4 x float> %x0, <2 x i64> %x1, i8 %x2) {			define <2 x i64>@test_int_x86_avx512_mask_cvt_ps2qq_128(<4 x float> %x0, <2 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2qq_128:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2qq_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtps2qq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x7b,0xc8]			; CHECK-NEXT: vcvtps2qq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x7b,0xc8]
	; CHECK-NEXT: vcvtps2qq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x7b,0xc0]			; CHECK-NEXT: vcvtps2qq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x7b,0xc0]
	; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]			; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.avx512.mask.cvtps2qq.128(<4 x float> %x0, <2 x i64> %x1, i8 %x2)			%res = call <2 x i64> @llvm.x86.avx512.mask.cvtps2qq.128(<4 x float> %x0, <2 x i64> %x1, i8 %x2)
	%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvtps2qq.128(<4 x float> %x0, <2 x i64> %x1, i8 -1)			%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvtps2qq.128(<4 x float> %x0, <2 x i64> %x1, i8 -1)
	%res2 = add <2 x i64> %res, %res1			%res2 = add <2 x i64> %res, %res1
	ret <2 x i64> %res2			ret <2 x i64> %res2
	}			}

	declare <4 x i64> @llvm.x86.avx512.mask.cvtps2qq.256(<4 x float>, <4 x i64>, i8)			declare <4 x i64> @llvm.x86.avx512.mask.cvtps2qq.256(<4 x float>, <4 x i64>, i8)

	define <4 x i64>@test_int_x86_avx512_mask_cvt_ps2qq_256(<4 x float> %x0, <4 x i64> %x1, i8 %x2) {			define <4 x i64>@test_int_x86_avx512_mask_cvt_ps2qq_256(<4 x float> %x0, <4 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2qq_256:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2qq_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtps2qq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x7b,0xc8]			; CHECK-NEXT: vcvtps2qq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x7b,0xc8]
	; CHECK-NEXT: vcvtps2qq %xmm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x7b,0xc0]			; CHECK-NEXT: vcvtps2qq %xmm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x7b,0xc0]
	; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]			; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx512.mask.cvtps2qq.256(<4 x float> %x0, <4 x i64> %x1, i8 %x2)			%res = call <4 x i64> @llvm.x86.avx512.mask.cvtps2qq.256(<4 x float> %x0, <4 x i64> %x1, i8 %x2)
	%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvtps2qq.256(<4 x float> %x0, <4 x i64> %x1, i8 -1)			%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvtps2qq.256(<4 x float> %x0, <4 x i64> %x1, i8 -1)
	%res2 = add <4 x i64> %res, %res1			%res2 = add <4 x i64> %res, %res1
	ret <4 x i64> %res2			ret <4 x i64> %res2
	}			}

	declare <2 x i64> @llvm.x86.avx512.mask.cvtps2uqq.128(<4 x float>, <2 x i64>, i8)			declare <2 x i64> @llvm.x86.avx512.mask.cvtps2uqq.128(<4 x float>, <2 x i64>, i8)

	define <2 x i64>@test_int_x86_avx512_mask_cvt_ps2uqq_128(<4 x float> %x0, <2 x i64> %x1, i8 %x2) {			define <2 x i64>@test_int_x86_avx512_mask_cvt_ps2uqq_128(<4 x float> %x0, <2 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2uqq_128:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2uqq_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtps2uqq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x79,0xc8]			; CHECK-NEXT: vcvtps2uqq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x79,0xc8]
	; CHECK-NEXT: vcvtps2uqq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x79,0xc0]			; CHECK-NEXT: vcvtps2uqq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x79,0xc0]
	; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]			; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.avx512.mask.cvtps2uqq.128(<4 x float> %x0, <2 x i64> %x1, i8 %x2)			%res = call <2 x i64> @llvm.x86.avx512.mask.cvtps2uqq.128(<4 x float> %x0, <2 x i64> %x1, i8 %x2)
	%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvtps2uqq.128(<4 x float> %x0, <2 x i64> %x1, i8 -1)			%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvtps2uqq.128(<4 x float> %x0, <2 x i64> %x1, i8 -1)
	%res2 = add <2 x i64> %res, %res1			%res2 = add <2 x i64> %res, %res1
	ret <2 x i64> %res2			ret <2 x i64> %res2
	}			}

	declare <4 x i64> @llvm.x86.avx512.mask.cvtps2uqq.256(<4 x float>, <4 x i64>, i8)			declare <4 x i64> @llvm.x86.avx512.mask.cvtps2uqq.256(<4 x float>, <4 x i64>, i8)

	define <4 x i64>@test_int_x86_avx512_mask_cvt_ps2uqq_256(<4 x float> %x0, <4 x i64> %x1, i8 %x2) {			define <4 x i64>@test_int_x86_avx512_mask_cvt_ps2uqq_256(<4 x float> %x0, <4 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2uqq_256:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2uqq_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtps2uqq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x79,0xc8]			; CHECK-NEXT: vcvtps2uqq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x79,0xc8]
	; CHECK-NEXT: vcvtps2uqq %xmm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x79,0xc0]			; CHECK-NEXT: vcvtps2uqq %xmm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x79,0xc0]
	; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]			; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx512.mask.cvtps2uqq.256(<4 x float> %x0, <4 x i64> %x1, i8 %x2)			%res = call <4 x i64> @llvm.x86.avx512.mask.cvtps2uqq.256(<4 x float> %x0, <4 x i64> %x1, i8 %x2)
	%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvtps2uqq.256(<4 x float> %x0, <4 x i64> %x1, i8 -1)			%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvtps2uqq.256(<4 x float> %x0, <4 x i64> %x1, i8 -1)
	%res2 = add <4 x i64> %res, %res1			%res2 = add <4 x i64> %res, %res1
	ret <4 x i64> %res2			ret <4 x i64> %res2
	}			}

	declare <2 x double> @llvm.x86.avx512.mask.cvtqq2pd.128(<2 x i64>, <2 x double>, i8)			declare <2 x double> @llvm.x86.avx512.mask.cvtqq2pd.128(<2 x i64>, <2 x double>, i8)

	define <2 x double>@test_int_x86_avx512_mask_cvt_qq2pd_128(<2 x i64> %x0, <2 x double> %x1, i8 %x2) {			define <2 x double>@test_int_x86_avx512_mask_cvt_qq2pd_128(<2 x i64> %x0, <2 x double> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_qq2pd_128:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_qq2pd_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtqq2pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfe,0x09,0xe6,0xc8]			; CHECK-NEXT: vcvtqq2pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfe,0x09,0xe6,0xc8]
	; CHECK-NEXT: vcvtqq2pd %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfe,0x08,0xe6,0xc0]			; CHECK-NEXT: vcvtqq2pd %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfe,0x08,0xe6,0xc0]
	; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]			; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x double> @llvm.x86.avx512.mask.cvtqq2pd.128(<2 x i64> %x0, <2 x double> %x1, i8 %x2)			%res = call <2 x double> @llvm.x86.avx512.mask.cvtqq2pd.128(<2 x i64> %x0, <2 x double> %x1, i8 %x2)
	%res1 = call <2 x double> @llvm.x86.avx512.mask.cvtqq2pd.128(<2 x i64> %x0, <2 x double> %x1, i8 -1)			%res1 = call <2 x double> @llvm.x86.avx512.mask.cvtqq2pd.128(<2 x i64> %x0, <2 x double> %x1, i8 -1)
	%res2 = fadd <2 x double> %res, %res1			%res2 = fadd <2 x double> %res, %res1
	ret <2 x double> %res2			ret <2 x double> %res2
	}			}

	declare <4 x double> @llvm.x86.avx512.mask.cvtqq2pd.256(<4 x i64>, <4 x double>, i8)			declare <4 x double> @llvm.x86.avx512.mask.cvtqq2pd.256(<4 x i64>, <4 x double>, i8)

	define <4 x double>@test_int_x86_avx512_mask_cvt_qq2pd_256(<4 x i64> %x0, <4 x double> %x1, i8 %x2) {			define <4 x double>@test_int_x86_avx512_mask_cvt_qq2pd_256(<4 x i64> %x0, <4 x double> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_qq2pd_256:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_qq2pd_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtqq2pd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfe,0x29,0xe6,0xc8]			; CHECK-NEXT: vcvtqq2pd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfe,0x29,0xe6,0xc8]
	; CHECK-NEXT: vcvtqq2pd %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfe,0x28,0xe6,0xc0]			; CHECK-NEXT: vcvtqq2pd %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfe,0x28,0xe6,0xc0]
	; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]			; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x double> @llvm.x86.avx512.mask.cvtqq2pd.256(<4 x i64> %x0, <4 x double> %x1, i8 %x2)			%res = call <4 x double> @llvm.x86.avx512.mask.cvtqq2pd.256(<4 x i64> %x0, <4 x double> %x1, i8 %x2)
	%res1 = call <4 x double> @llvm.x86.avx512.mask.cvtqq2pd.256(<4 x i64> %x0, <4 x double> %x1, i8 -1)			%res1 = call <4 x double> @llvm.x86.avx512.mask.cvtqq2pd.256(<4 x i64> %x0, <4 x double> %x1, i8 -1)
	%res2 = fadd <4 x double> %res, %res1			%res2 = fadd <4 x double> %res, %res1
	ret <4 x double> %res2			ret <4 x double> %res2
	}			}

	declare <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.128(<2 x i64>, <4 x float>, i8)			declare <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.128(<2 x i64>, <4 x float>, i8)

	define <4 x float>@test_int_x86_avx512_mask_cvt_qq2ps_128(<2 x i64> %x0, <4 x float> %x1, i8 %x2) {			define <4 x float>@test_int_x86_avx512_mask_cvt_qq2ps_128(<2 x i64> %x0, <4 x float> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_qq2ps_128:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_qq2ps_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtqq2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x09,0x5b,0xc8]			; CHECK-NEXT: vcvtqq2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x09,0x5b,0xc8]
	; CHECK-NEXT: vcvtqq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x08,0x5b,0xc0]			; CHECK-NEXT: vcvtqq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x08,0x5b,0xc0]
	; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]			; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 %x2)			%res = call <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 %x2)
	%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 -1)			%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 -1)
	%res2 = fadd <4 x float> %res, %res1			%res2 = fadd <4 x float> %res, %res1
	ret <4 x float> %res2			ret <4 x float> %res2
	}			}

	define <4 x float>@test_int_x86_avx512_mask_cvt_qq2ps_128_zext(<2 x i64> %x0, <4 x float> %x1, i8 %x2) {			define <4 x float>@test_int_x86_avx512_mask_cvt_qq2ps_128_zext(<2 x i64> %x0, <4 x float> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_qq2ps_128_zext:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_qq2ps_128_zext:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtqq2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x09,0x5b,0xc8]			; CHECK-NEXT: vcvtqq2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x09,0x5b,0xc8]
	; CHECK-NEXT: vmovq %xmm1, %xmm1 ## encoding: [0x62,0xf1,0xfe,0x08,0x7e,0xc9]			; CHECK-NEXT: vmovq %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7e,0xc9]
	; CHECK-NEXT: ## xmm1 = xmm1[0],zero			; CHECK-NEXT: ## xmm1 = xmm1[0],zero
	; CHECK-NEXT: vcvtqq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x08,0x5b,0xc0]			; CHECK-NEXT: vcvtqq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x08,0x5b,0xc0]
	; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]			; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 %x2)			%res = call <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 %x2)
	%res1 = shufflevector <4 x float> %res, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>			%res1 = shufflevector <4 x float> %res, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
	%res2 = call <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 -1)			%res2 = call <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 -1)
	%res3 = shufflevector <4 x float> %res2, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>			%res3 = shufflevector <4 x float> %res2, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
	%res4 = fadd <4 x float> %res1, %res3			%res4 = fadd <4 x float> %res1, %res3
	ret <4 x float> %res4			ret <4 x float> %res4
	}			}

	declare <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.256(<4 x i64>, <4 x float>, i8)			declare <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.256(<4 x i64>, <4 x float>, i8)

	define <4 x float>@test_int_x86_avx512_mask_cvt_qq2ps_256(<4 x i64> %x0, <4 x float> %x1, i8 %x2) {			define <4 x float>@test_int_x86_avx512_mask_cvt_qq2ps_256(<4 x i64> %x0, <4 x float> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_qq2ps_256:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_qq2ps_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtqq2ps %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x29,0x5b,0xc8]			; CHECK-NEXT: vcvtqq2ps %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x29,0x5b,0xc8]
	; CHECK-NEXT: vcvtqq2ps %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x28,0x5b,0xc0]			; CHECK-NEXT: vcvtqq2ps %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x28,0x5b,0xc0]
	; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]			; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.256(<4 x i64> %x0, <4 x float> %x1, i8 %x2)			%res = call <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.256(<4 x i64> %x0, <4 x float> %x1, i8 %x2)
	%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.256(<4 x i64> %x0, <4 x float> %x1, i8 -1)			%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtqq2ps.256(<4 x i64> %x0, <4 x float> %x1, i8 -1)
	%res2 = fadd <4 x float> %res, %res1			%res2 = fadd <4 x float> %res, %res1
	ret <4 x float> %res2			ret <4 x float> %res2
	}			}

	declare <2 x i64> @llvm.x86.avx512.mask.cvttpd2qq.128(<2 x double>, <2 x i64>, i8)			declare <2 x i64> @llvm.x86.avx512.mask.cvttpd2qq.128(<2 x double>, <2 x i64>, i8)

	define <2 x i64>@test_int_x86_avx512_mask_cvtt_pd2qq_128(<2 x double> %x0, <2 x i64> %x1, i8 %x2) {			define <2 x i64>@test_int_x86_avx512_mask_cvtt_pd2qq_128(<2 x double> %x0, <2 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2qq_128:			; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2qq_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvttpd2qq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x7a,0xc8]			; CHECK-NEXT: vcvttpd2qq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x7a,0xc8]
	; CHECK-NEXT: vcvttpd2qq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x7a,0xc0]			; CHECK-NEXT: vcvttpd2qq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x7a,0xc0]
	; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]			; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.avx512.mask.cvttpd2qq.128(<2 x double> %x0, <2 x i64> %x1, i8 %x2)			%res = call <2 x i64> @llvm.x86.avx512.mask.cvttpd2qq.128(<2 x double> %x0, <2 x i64> %x1, i8 %x2)
	%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvttpd2qq.128(<2 x double> %x0, <2 x i64> %x1, i8 -1)			%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvttpd2qq.128(<2 x double> %x0, <2 x i64> %x1, i8 -1)
	%res2 = add <2 x i64> %res, %res1			%res2 = add <2 x i64> %res, %res1
	ret <2 x i64> %res2			ret <2 x i64> %res2
	}			}

	declare <4 x i64> @llvm.x86.avx512.mask.cvttpd2qq.256(<4 x double>, <4 x i64>, i8)			declare <4 x i64> @llvm.x86.avx512.mask.cvttpd2qq.256(<4 x double>, <4 x i64>, i8)

	define <4 x i64>@test_int_x86_avx512_mask_cvtt_pd2qq_256(<4 x double> %x0, <4 x i64> %x1, i8 %x2) {			define <4 x i64>@test_int_x86_avx512_mask_cvtt_pd2qq_256(<4 x double> %x0, <4 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2qq_256:			; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2qq_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvttpd2qq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x7a,0xc8]			; CHECK-NEXT: vcvttpd2qq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x7a,0xc8]
	; CHECK-NEXT: vcvttpd2qq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x7a,0xc0]			; CHECK-NEXT: vcvttpd2qq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x7a,0xc0]
	; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]			; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx512.mask.cvttpd2qq.256(<4 x double> %x0, <4 x i64> %x1, i8 %x2)			%res = call <4 x i64> @llvm.x86.avx512.mask.cvttpd2qq.256(<4 x double> %x0, <4 x i64> %x1, i8 %x2)
	%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvttpd2qq.256(<4 x double> %x0, <4 x i64> %x1, i8 -1)			%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvttpd2qq.256(<4 x double> %x0, <4 x i64> %x1, i8 -1)
	%res2 = add <4 x i64> %res, %res1			%res2 = add <4 x i64> %res, %res1
	ret <4 x i64> %res2			ret <4 x i64> %res2
	}			}

	declare <2 x i64> @llvm.x86.avx512.mask.cvttpd2uqq.128(<2 x double>, <2 x i64>, i8)			declare <2 x i64> @llvm.x86.avx512.mask.cvttpd2uqq.128(<2 x double>, <2 x i64>, i8)

	define <2 x i64>@test_int_x86_avx512_mask_cvtt_pd2uqq_128(<2 x double> %x0, <2 x i64> %x1, i8 %x2) {			define <2 x i64>@test_int_x86_avx512_mask_cvtt_pd2uqq_128(<2 x double> %x0, <2 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2uqq_128:			; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2uqq_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvttpd2uqq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x78,0xc8]			; CHECK-NEXT: vcvttpd2uqq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x78,0xc8]
	; CHECK-NEXT: vcvttpd2uqq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x78,0xc0]			; CHECK-NEXT: vcvttpd2uqq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x78,0xc0]
	; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]			; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.avx512.mask.cvttpd2uqq.128(<2 x double> %x0, <2 x i64> %x1, i8 %x2)			%res = call <2 x i64> @llvm.x86.avx512.mask.cvttpd2uqq.128(<2 x double> %x0, <2 x i64> %x1, i8 %x2)
	%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvttpd2uqq.128(<2 x double> %x0, <2 x i64> %x1, i8 -1)			%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvttpd2uqq.128(<2 x double> %x0, <2 x i64> %x1, i8 -1)
	%res2 = add <2 x i64> %res, %res1			%res2 = add <2 x i64> %res, %res1
	ret <2 x i64> %res2			ret <2 x i64> %res2
	}			}

	declare <4 x i64> @llvm.x86.avx512.mask.cvttpd2uqq.256(<4 x double>, <4 x i64>, i8)			declare <4 x i64> @llvm.x86.avx512.mask.cvttpd2uqq.256(<4 x double>, <4 x i64>, i8)

	define <4 x i64>@test_int_x86_avx512_mask_cvtt_pd2uqq_256(<4 x double> %x0, <4 x i64> %x1, i8 %x2) {			define <4 x i64>@test_int_x86_avx512_mask_cvtt_pd2uqq_256(<4 x double> %x0, <4 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2uqq_256:			; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2uqq_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvttpd2uqq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x78,0xc8]			; CHECK-NEXT: vcvttpd2uqq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x78,0xc8]
	; CHECK-NEXT: vcvttpd2uqq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x78,0xc0]			; CHECK-NEXT: vcvttpd2uqq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x78,0xc0]
	; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]			; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx512.mask.cvttpd2uqq.256(<4 x double> %x0, <4 x i64> %x1, i8 %x2)			%res = call <4 x i64> @llvm.x86.avx512.mask.cvttpd2uqq.256(<4 x double> %x0, <4 x i64> %x1, i8 %x2)
	%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvttpd2uqq.256(<4 x double> %x0, <4 x i64> %x1, i8 -1)			%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvttpd2uqq.256(<4 x double> %x0, <4 x i64> %x1, i8 -1)
	%res2 = add <4 x i64> %res, %res1			%res2 = add <4 x i64> %res, %res1
	ret <4 x i64> %res2			ret <4 x i64> %res2
	}			}

	declare <2 x i64> @llvm.x86.avx512.mask.cvttps2qq.128(<4 x float>, <2 x i64>, i8)			declare <2 x i64> @llvm.x86.avx512.mask.cvttps2qq.128(<4 x float>, <2 x i64>, i8)

	define <2 x i64>@test_int_x86_avx512_mask_cvtt_ps2qq_128(<4 x float> %x0, <2 x i64> %x1, i8 %x2) {			define <2 x i64>@test_int_x86_avx512_mask_cvtt_ps2qq_128(<4 x float> %x0, <2 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2qq_128:			; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2qq_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvttps2qq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x7a,0xc8]			; CHECK-NEXT: vcvttps2qq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x7a,0xc8]
	; CHECK-NEXT: vcvttps2qq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x7a,0xc0]			; CHECK-NEXT: vcvttps2qq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x7a,0xc0]
	; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]			; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.avx512.mask.cvttps2qq.128(<4 x float> %x0, <2 x i64> %x1, i8 %x2)			%res = call <2 x i64> @llvm.x86.avx512.mask.cvttps2qq.128(<4 x float> %x0, <2 x i64> %x1, i8 %x2)
	%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvttps2qq.128(<4 x float> %x0, <2 x i64> %x1, i8 -1)			%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvttps2qq.128(<4 x float> %x0, <2 x i64> %x1, i8 -1)
	%res2 = add <2 x i64> %res, %res1			%res2 = add <2 x i64> %res, %res1
	ret <2 x i64> %res2			ret <2 x i64> %res2
	}			}

	declare <4 x i64> @llvm.x86.avx512.mask.cvttps2qq.256(<4 x float>, <4 x i64>, i8)			declare <4 x i64> @llvm.x86.avx512.mask.cvttps2qq.256(<4 x float>, <4 x i64>, i8)

	define <4 x i64>@test_int_x86_avx512_mask_cvtt_ps2qq_256(<4 x float> %x0, <4 x i64> %x1, i8 %x2) {			define <4 x i64>@test_int_x86_avx512_mask_cvtt_ps2qq_256(<4 x float> %x0, <4 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2qq_256:			; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2qq_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvttps2qq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x7a,0xc8]			; CHECK-NEXT: vcvttps2qq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x7a,0xc8]
	; CHECK-NEXT: vcvttps2qq %xmm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x7a,0xc0]			; CHECK-NEXT: vcvttps2qq %xmm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x7a,0xc0]
	; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]			; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx512.mask.cvttps2qq.256(<4 x float> %x0, <4 x i64> %x1, i8 %x2)			%res = call <4 x i64> @llvm.x86.avx512.mask.cvttps2qq.256(<4 x float> %x0, <4 x i64> %x1, i8 %x2)
	%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvttps2qq.256(<4 x float> %x0, <4 x i64> %x1, i8 -1)			%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvttps2qq.256(<4 x float> %x0, <4 x i64> %x1, i8 -1)
	%res2 = add <4 x i64> %res, %res1			%res2 = add <4 x i64> %res, %res1
	ret <4 x i64> %res2			ret <4 x i64> %res2
	}			}

	declare <2 x double> @llvm.x86.avx512.mask.cvtuqq2pd.128(<2 x i64>, <2 x double>, i8)			declare <2 x double> @llvm.x86.avx512.mask.cvtuqq2pd.128(<2 x i64>, <2 x double>, i8)

	define <2 x double>@test_int_x86_avx512_mask_cvt_uqq2pd_128(<2 x i64> %x0, <2 x double> %x1, i8 %x2) {			define <2 x double>@test_int_x86_avx512_mask_cvt_uqq2pd_128(<2 x i64> %x0, <2 x double> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_uqq2pd_128:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_uqq2pd_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtuqq2pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfe,0x09,0x7a,0xc8]			; CHECK-NEXT: vcvtuqq2pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfe,0x09,0x7a,0xc8]
	; CHECK-NEXT: vcvtuqq2pd %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfe,0x08,0x7a,0xc0]			; CHECK-NEXT: vcvtuqq2pd %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfe,0x08,0x7a,0xc0]
	; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]			; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x double> @llvm.x86.avx512.mask.cvtuqq2pd.128(<2 x i64> %x0, <2 x double> %x1, i8 %x2)			%res = call <2 x double> @llvm.x86.avx512.mask.cvtuqq2pd.128(<2 x i64> %x0, <2 x double> %x1, i8 %x2)
	%res1 = call <2 x double> @llvm.x86.avx512.mask.cvtuqq2pd.128(<2 x i64> %x0, <2 x double> %x1, i8 -1)			%res1 = call <2 x double> @llvm.x86.avx512.mask.cvtuqq2pd.128(<2 x i64> %x0, <2 x double> %x1, i8 -1)
	%res2 = fadd <2 x double> %res, %res1			%res2 = fadd <2 x double> %res, %res1
	ret <2 x double> %res2			ret <2 x double> %res2
	}			}

	declare <4 x double> @llvm.x86.avx512.mask.cvtuqq2pd.256(<4 x i64>, <4 x double>, i8)			declare <4 x double> @llvm.x86.avx512.mask.cvtuqq2pd.256(<4 x i64>, <4 x double>, i8)

	define <4 x double>@test_int_x86_avx512_mask_cvt_uqq2pd_256(<4 x i64> %x0, <4 x double> %x1, i8 %x2) {			define <4 x double>@test_int_x86_avx512_mask_cvt_uqq2pd_256(<4 x i64> %x0, <4 x double> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_uqq2pd_256:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_uqq2pd_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtuqq2pd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfe,0x29,0x7a,0xc8]			; CHECK-NEXT: vcvtuqq2pd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfe,0x29,0x7a,0xc8]
	; CHECK-NEXT: vcvtuqq2pd %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfe,0x28,0x7a,0xc0]			; CHECK-NEXT: vcvtuqq2pd %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfe,0x28,0x7a,0xc0]
	; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]			; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x double> @llvm.x86.avx512.mask.cvtuqq2pd.256(<4 x i64> %x0, <4 x double> %x1, i8 %x2)			%res = call <4 x double> @llvm.x86.avx512.mask.cvtuqq2pd.256(<4 x i64> %x0, <4 x double> %x1, i8 %x2)
	%res1 = call <4 x double> @llvm.x86.avx512.mask.cvtuqq2pd.256(<4 x i64> %x0, <4 x double> %x1, i8 -1)			%res1 = call <4 x double> @llvm.x86.avx512.mask.cvtuqq2pd.256(<4 x i64> %x0, <4 x double> %x1, i8 -1)
	%res2 = fadd <4 x double> %res, %res1			%res2 = fadd <4 x double> %res, %res1
	ret <4 x double> %res2			ret <4 x double> %res2
	}			}

	declare <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.128(<2 x i64>, <4 x float>, i8)			declare <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.128(<2 x i64>, <4 x float>, i8)

	define <4 x float>@test_int_x86_avx512_mask_cvt_uqq2ps_128(<2 x i64> %x0, <4 x float> %x1, i8 %x2) {			define <4 x float>@test_int_x86_avx512_mask_cvt_uqq2ps_128(<2 x i64> %x0, <4 x float> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_uqq2ps_128:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_uqq2ps_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtuqq2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x7a,0xc8]			; CHECK-NEXT: vcvtuqq2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x7a,0xc8]
	; CHECK-NEXT: vcvtuqq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x08,0x7a,0xc0]			; CHECK-NEXT: vcvtuqq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x08,0x7a,0xc0]
	; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]			; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 %x2)			%res = call <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 %x2)
	%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 -1)			%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 -1)
	%res2 = fadd <4 x float> %res, %res1			%res2 = fadd <4 x float> %res, %res1
	ret <4 x float> %res2			ret <4 x float> %res2
	}			}

	define <4 x float>@test_int_x86_avx512_mask_cvt_uqq2ps_128_zext(<2 x i64> %x0, <4 x float> %x1, i8 %x2) {			define <4 x float>@test_int_x86_avx512_mask_cvt_uqq2ps_128_zext(<2 x i64> %x0, <4 x float> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_uqq2ps_128_zext:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_uqq2ps_128_zext:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtuqq2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x7a,0xc8]			; CHECK-NEXT: vcvtuqq2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x7a,0xc8]
	; CHECK-NEXT: vmovq %xmm1, %xmm1 ## encoding: [0x62,0xf1,0xfe,0x08,0x7e,0xc9]			; CHECK-NEXT: vmovq %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7e,0xc9]
	; CHECK-NEXT: ## xmm1 = xmm1[0],zero			; CHECK-NEXT: ## xmm1 = xmm1[0],zero
	; CHECK-NEXT: vcvtuqq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x08,0x7a,0xc0]			; CHECK-NEXT: vcvtuqq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x08,0x7a,0xc0]
	; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]			; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 %x2)			%res = call <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 %x2)
	%res1 = shufflevector <4 x float> %res, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>			%res1 = shufflevector <4 x float> %res, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
	%res2 = call <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 -1)			%res2 = call <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.128(<2 x i64> %x0, <4 x float> %x1, i8 -1)
	%res3 = shufflevector <4 x float> %res2, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>			%res3 = shufflevector <4 x float> %res2, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
	%res4 = fadd <4 x float> %res1, %res3			%res4 = fadd <4 x float> %res1, %res3
	ret <4 x float> %res4			ret <4 x float> %res4
	}			}

	declare <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.256(<4 x i64>, <4 x float>, i8)			declare <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.256(<4 x i64>, <4 x float>, i8)

	define <4 x float>@test_int_x86_avx512_mask_cvt_uqq2ps_256(<4 x i64> %x0, <4 x float> %x1, i8 %x2) {			define <4 x float>@test_int_x86_avx512_mask_cvt_uqq2ps_256(<4 x i64> %x0, <4 x float> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvt_uqq2ps_256:			; CHECK-LABEL: test_int_x86_avx512_mask_cvt_uqq2ps_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvtuqq2ps %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x29,0x7a,0xc8]			; CHECK-NEXT: vcvtuqq2ps %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x29,0x7a,0xc8]
	; CHECK-NEXT: vcvtuqq2ps %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x28,0x7a,0xc0]			; CHECK-NEXT: vcvtuqq2ps %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x28,0x7a,0xc0]
	; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]			; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.256(<4 x i64> %x0, <4 x float> %x1, i8 %x2)			%res = call <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.256(<4 x i64> %x0, <4 x float> %x1, i8 %x2)
	%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.256(<4 x i64> %x0, <4 x float> %x1, i8 -1)			%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtuqq2ps.256(<4 x i64> %x0, <4 x float> %x1, i8 -1)
	%res2 = fadd <4 x float> %res, %res1			%res2 = fadd <4 x float> %res, %res1
	ret <4 x float> %res2			ret <4 x float> %res2
	}			}

	declare <2 x i64> @llvm.x86.avx512.mask.cvttps2uqq.128(<4 x float>, <2 x i64>, i8)			declare <2 x i64> @llvm.x86.avx512.mask.cvttps2uqq.128(<4 x float>, <2 x i64>, i8)

	define <2 x i64>@test_int_x86_avx512_mask_cvtt_ps2uqq_128(<4 x float> %x0, <2 x i64> %x1, i8 %x2) {			define <2 x i64>@test_int_x86_avx512_mask_cvtt_ps2uqq_128(<4 x float> %x0, <2 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2uqq_128:			; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2uqq_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvttps2uqq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x78,0xc8]			; CHECK-NEXT: vcvttps2uqq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x78,0xc8]
	; CHECK-NEXT: vcvttps2uqq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x78,0xc0]			; CHECK-NEXT: vcvttps2uqq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x78,0xc0]
	; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]			; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.avx512.mask.cvttps2uqq.128(<4 x float> %x0, <2 x i64> %x1, i8 %x2)			%res = call <2 x i64> @llvm.x86.avx512.mask.cvttps2uqq.128(<4 x float> %x0, <2 x i64> %x1, i8 %x2)
	%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvttps2uqq.128(<4 x float> %x0, <2 x i64> %x1, i8 -1)			%res1 = call <2 x i64> @llvm.x86.avx512.mask.cvttps2uqq.128(<4 x float> %x0, <2 x i64> %x1, i8 -1)
	%res2 = add <2 x i64> %res, %res1			%res2 = add <2 x i64> %res, %res1
	ret <2 x i64> %res2			ret <2 x i64> %res2
	}			}

	declare <4 x i64> @llvm.x86.avx512.mask.cvttps2uqq.256(<4 x float>, <4 x i64>, i8)			declare <4 x i64> @llvm.x86.avx512.mask.cvttps2uqq.256(<4 x float>, <4 x i64>, i8)

	define <4 x i64>@test_int_x86_avx512_mask_cvtt_ps2uqq_256(<4 x float> %x0, <4 x i64> %x1, i8 %x2) {			define <4 x i64>@test_int_x86_avx512_mask_cvtt_ps2uqq_256(<4 x float> %x0, <4 x i64> %x1, i8 %x2) {
	; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2uqq_256:			; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2uqq_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vcvttps2uqq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x78,0xc8]			; CHECK-NEXT: vcvttps2uqq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x78,0xc8]
	; CHECK-NEXT: vcvttps2uqq %xmm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x78,0xc0]			; CHECK-NEXT: vcvttps2uqq %xmm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x78,0xc0]
	; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]			; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx512.mask.cvttps2uqq.256(<4 x float> %x0, <4 x i64> %x1, i8 %x2)			%res = call <4 x i64> @llvm.x86.avx512.mask.cvttps2uqq.256(<4 x float> %x0, <4 x i64> %x1, i8 %x2)
	%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvttps2uqq.256(<4 x float> %x0, <4 x i64> %x1, i8 -1)			%res1 = call <4 x i64> @llvm.x86.avx512.mask.cvttps2uqq.256(<4 x float> %x0, <4 x i64> %x1, i8 -1)
	%res2 = add <4 x i64> %res, %res1			%res2 = add <4 x i64> %res, %res1
	ret <4 x i64> %res2			ret <4 x i64> %res2
	}			}

	declare <2 x double> @llvm.x86.avx512.mask.reduce.pd.128(<2 x double>, i32, <2 x double>, i8)			declare <2 x double> @llvm.x86.avx512.mask.reduce.pd.128(<2 x double>, i32, <2 x double>, i8)

	define <2 x double>@test_int_x86_avx512_mask_reduce_pd_128(<2 x double> %x0, <2 x double> %x2, i8 %x3) {			define <2 x double>@test_int_x86_avx512_mask_reduce_pd_128(<2 x double> %x0, <2 x double> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_reduce_pd_128:			; CHECK-LABEL: test_int_x86_avx512_mask_reduce_pd_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vreducepd $4, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x09,0x56,0xc8,0x04]			; CHECK-NEXT: vreducepd $4, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x09,0x56,0xc8,0x04]
	; CHECK-NEXT: vreducepd $8, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0xfd,0x08,0x56,0xc0,0x08]			; CHECK-NEXT: vreducepd $8, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0xfd,0x08,0x56,0xc0,0x08]
	; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]			; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x double> @llvm.x86.avx512.mask.reduce.pd.128(<2 x double> %x0, i32 4, <2 x double> %x2, i8 %x3)			%res = call <2 x double> @llvm.x86.avx512.mask.reduce.pd.128(<2 x double> %x0, i32 4, <2 x double> %x2, i8 %x3)
	%res1 = call <2 x double> @llvm.x86.avx512.mask.reduce.pd.128(<2 x double> %x0, i32 8, <2 x double> %x2, i8 -1)			%res1 = call <2 x double> @llvm.x86.avx512.mask.reduce.pd.128(<2 x double> %x0, i32 8, <2 x double> %x2, i8 -1)
	%res2 = fadd <2 x double> %res, %res1			%res2 = fadd <2 x double> %res, %res1
	ret <2 x double> %res2			ret <2 x double> %res2
	}			}

	declare <4 x double> @llvm.x86.avx512.mask.reduce.pd.256(<4 x double>, i32, <4 x double>, i8)			declare <4 x double> @llvm.x86.avx512.mask.reduce.pd.256(<4 x double>, i32, <4 x double>, i8)

	define <4 x double>@test_int_x86_avx512_mask_reduce_pd_256(<4 x double> %x0, <4 x double> %x2, i8 %x3) {			define <4 x double>@test_int_x86_avx512_mask_reduce_pd_256(<4 x double> %x0, <4 x double> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_reduce_pd_256:			; CHECK-LABEL: test_int_x86_avx512_mask_reduce_pd_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vreducepd $4, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x56,0xc8,0x04]			; CHECK-NEXT: vreducepd $4, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x56,0xc8,0x04]
	; CHECK-NEXT: vreducepd $0, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x56,0xc0,0x00]			; CHECK-NEXT: vreducepd $0, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x56,0xc0,0x00]
	; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]			; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x double> @llvm.x86.avx512.mask.reduce.pd.256(<4 x double> %x0, i32 4, <4 x double> %x2, i8 %x3)			%res = call <4 x double> @llvm.x86.avx512.mask.reduce.pd.256(<4 x double> %x0, i32 4, <4 x double> %x2, i8 %x3)
	%res1 = call <4 x double> @llvm.x86.avx512.mask.reduce.pd.256(<4 x double> %x0, i32 0, <4 x double> %x2, i8 -1)			%res1 = call <4 x double> @llvm.x86.avx512.mask.reduce.pd.256(<4 x double> %x0, i32 0, <4 x double> %x2, i8 -1)
	%res2 = fadd <4 x double> %res, %res1			%res2 = fadd <4 x double> %res, %res1
	ret <4 x double> %res2			ret <4 x double> %res2
	}			}

	declare <4 x float> @llvm.x86.avx512.mask.reduce.ps.128(<4 x float>, i32, <4 x float>, i8)			declare <4 x float> @llvm.x86.avx512.mask.reduce.ps.128(<4 x float>, i32, <4 x float>, i8)

	define <4 x float>@test_int_x86_avx512_mask_reduce_ps_128(<4 x float> %x0, <4 x float> %x2, i8 %x3) {			define <4 x float>@test_int_x86_avx512_mask_reduce_ps_128(<4 x float> %x0, <4 x float> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_reduce_ps_128:			; CHECK-LABEL: test_int_x86_avx512_mask_reduce_ps_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vreduceps $4, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x56,0xc8,0x04]			; CHECK-NEXT: vreduceps $4, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x56,0xc8,0x04]
	; CHECK-NEXT: vreduceps $88, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x56,0xc0,0x58]			; CHECK-NEXT: vreduceps $88, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x56,0xc0,0x58]
	; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]			; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.avx512.mask.reduce.ps.128(<4 x float> %x0, i32 4, <4 x float> %x2, i8 %x3)			%res = call <4 x float> @llvm.x86.avx512.mask.reduce.ps.128(<4 x float> %x0, i32 4, <4 x float> %x2, i8 %x3)
	%res1 = call <4 x float> @llvm.x86.avx512.mask.reduce.ps.128(<4 x float> %x0, i32 88, <4 x float> %x2, i8 -1)			%res1 = call <4 x float> @llvm.x86.avx512.mask.reduce.ps.128(<4 x float> %x0, i32 88, <4 x float> %x2, i8 -1)
	%res2 = fadd <4 x float> %res, %res1			%res2 = fadd <4 x float> %res, %res1
	ret <4 x float> %res2			ret <4 x float> %res2
	}			}

	declare <8 x float> @llvm.x86.avx512.mask.reduce.ps.256(<8 x float>, i32, <8 x float>, i8)			declare <8 x float> @llvm.x86.avx512.mask.reduce.ps.256(<8 x float>, i32, <8 x float>, i8)

	define <8 x float>@test_int_x86_avx512_mask_reduce_ps_256(<8 x float> %x0, <8 x float> %x2, i8 %x3) {			define <8 x float>@test_int_x86_avx512_mask_reduce_ps_256(<8 x float> %x0, <8 x float> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_reduce_ps_256:			; CHECK-LABEL: test_int_x86_avx512_mask_reduce_ps_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vreduceps $11, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x56,0xc8,0x0b]			; CHECK-NEXT: vreduceps $11, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x56,0xc8,0x0b]
	; CHECK-NEXT: vreduceps $11, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x56,0xc0,0x0b]			; CHECK-NEXT: vreduceps $11, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x56,0xc0,0x0b]
	; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xc0]			; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <8 x float> @llvm.x86.avx512.mask.reduce.ps.256(<8 x float> %x0, i32 11, <8 x float> %x2, i8 %x3)			%res = call <8 x float> @llvm.x86.avx512.mask.reduce.ps.256(<8 x float> %x0, i32 11, <8 x float> %x2, i8 %x3)
	%res1 = call <8 x float> @llvm.x86.avx512.mask.reduce.ps.256(<8 x float> %x0, i32 11, <8 x float> %x2, i8 -1)			%res1 = call <8 x float> @llvm.x86.avx512.mask.reduce.ps.256(<8 x float> %x0, i32 11, <8 x float> %x2, i8 -1)
	%res2 = fadd <8 x float> %res, %res1			%res2 = fadd <8 x float> %res, %res1
	ret <8 x float> %res2			ret <8 x float> %res2
	}			}

	declare <2 x double> @llvm.x86.avx512.mask.range.pd.128(<2 x double>, <2 x double>, i32, <2 x double>, i8)			declare <2 x double> @llvm.x86.avx512.mask.range.pd.128(<2 x double>, <2 x double>, i32, <2 x double>, i8)

	define <2 x double>@test_int_x86_avx512_mask_range_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x3, i8 %x4) {			define <2 x double>@test_int_x86_avx512_mask_range_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x3, i8 %x4) {
	; CHECK-LABEL: test_int_x86_avx512_mask_range_pd_128:			; CHECK-LABEL: test_int_x86_avx512_mask_range_pd_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vrangepd $4, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x09,0x50,0xd1,0x04]			; CHECK-NEXT: vrangepd $4, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x09,0x50,0xd1,0x04]
	; CHECK-NEXT: vrangepd $8, %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0xfd,0x08,0x50,0xc1,0x08]			; CHECK-NEXT: vrangepd $8, %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0xfd,0x08,0x50,0xc1,0x08]
	; CHECK-NEXT: vaddpd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0x58,0xc0]			; CHECK-NEXT: vaddpd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x double> @llvm.x86.avx512.mask.range.pd.128(<2 x double> %x0, <2 x double> %x1, i32 4, <2 x double> %x3, i8 %x4)			%res = call <2 x double> @llvm.x86.avx512.mask.range.pd.128(<2 x double> %x0, <2 x double> %x1, i32 4, <2 x double> %x3, i8 %x4)
	%res1 = call <2 x double> @llvm.x86.avx512.mask.range.pd.128(<2 x double> %x0, <2 x double> %x1, i32 8, <2 x double> %x3, i8 -1)			%res1 = call <2 x double> @llvm.x86.avx512.mask.range.pd.128(<2 x double> %x0, <2 x double> %x1, i32 8, <2 x double> %x3, i8 -1)
	%res2 = fadd <2 x double> %res, %res1			%res2 = fadd <2 x double> %res, %res1
	ret <2 x double> %res2			ret <2 x double> %res2
	}			}

	declare <4 x double> @llvm.x86.avx512.mask.range.pd.256(<4 x double>, <4 x double>, i32, <4 x double>, i8)			declare <4 x double> @llvm.x86.avx512.mask.range.pd.256(<4 x double>, <4 x double>, i32, <4 x double>, i8)

	define <4 x double>@test_int_x86_avx512_mask_range_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x3, i8 %x4) {			define <4 x double>@test_int_x86_avx512_mask_range_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x3, i8 %x4) {
	; CHECK-LABEL: test_int_x86_avx512_mask_range_pd_256:			; CHECK-LABEL: test_int_x86_avx512_mask_range_pd_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vrangepd $4, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x50,0xd1,0x04]			; CHECK-NEXT: vrangepd $4, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x50,0xd1,0x04]
	; CHECK-NEXT: vrangepd $88, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x50,0xc1,0x58]			; CHECK-NEXT: vrangepd $88, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x50,0xc1,0x58]
	; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0x58,0xc0]			; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x double> @llvm.x86.avx512.mask.range.pd.256(<4 x double> %x0, <4 x double> %x1, i32 4, <4 x double> %x3, i8 %x4)			%res = call <4 x double> @llvm.x86.avx512.mask.range.pd.256(<4 x double> %x0, <4 x double> %x1, i32 4, <4 x double> %x3, i8 %x4)
	%res1 = call <4 x double> @llvm.x86.avx512.mask.range.pd.256(<4 x double> %x0, <4 x double> %x1, i32 88, <4 x double> %x3, i8 -1)			%res1 = call <4 x double> @llvm.x86.avx512.mask.range.pd.256(<4 x double> %x0, <4 x double> %x1, i32 88, <4 x double> %x3, i8 -1)
	%res2 = fadd <4 x double> %res, %res1			%res2 = fadd <4 x double> %res, %res1
	ret <4 x double> %res2			ret <4 x double> %res2
	}			}

	declare <4 x float> @llvm.x86.avx512.mask.range.ps.128(<4 x float>, <4 x float>, i32, <4 x float>, i8)			declare <4 x float> @llvm.x86.avx512.mask.range.ps.128(<4 x float>, <4 x float>, i32, <4 x float>, i8)

	define <4 x float>@test_int_x86_avx512_mask_range_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x3, i8 %x4) {			define <4 x float>@test_int_x86_avx512_mask_range_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x3, i8 %x4) {
	; CHECK-LABEL: test_int_x86_avx512_mask_range_ps_128:			; CHECK-LABEL: test_int_x86_avx512_mask_range_ps_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vrangeps $4, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x50,0xd1,0x04]			; CHECK-NEXT: vrangeps $4, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x50,0xd1,0x04]
	; CHECK-NEXT: vrangeps $88, %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x50,0xc1,0x58]			; CHECK-NEXT: vrangeps $88, %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x50,0xc1,0x58]
	; CHECK-NEXT: vaddps %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6c,0x08,0x58,0xc0]			; CHECK-NEXT: vaddps %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe8,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.avx512.mask.range.ps.128(<4 x float> %x0, <4 x float> %x1, i32 4, <4 x float> %x3, i8 %x4)			%res = call <4 x float> @llvm.x86.avx512.mask.range.ps.128(<4 x float> %x0, <4 x float> %x1, i32 4, <4 x float> %x3, i8 %x4)
	%res1 = call <4 x float> @llvm.x86.avx512.mask.range.ps.128(<4 x float> %x0, <4 x float> %x1, i32 88, <4 x float> %x3, i8 -1)			%res1 = call <4 x float> @llvm.x86.avx512.mask.range.ps.128(<4 x float> %x0, <4 x float> %x1, i32 88, <4 x float> %x3, i8 -1)
	%res2 = fadd <4 x float> %res, %res1			%res2 = fadd <4 x float> %res, %res1
	ret <4 x float> %res2			ret <4 x float> %res2
	}			}

	declare <8 x float> @llvm.x86.avx512.mask.range.ps.256(<8 x float>, <8 x float>, i32, <8 x float>, i8)			declare <8 x float> @llvm.x86.avx512.mask.range.ps.256(<8 x float>, <8 x float>, i32, <8 x float>, i8)

	define <8 x float>@test_int_x86_avx512_mask_range_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x3, i8 %x4) {			define <8 x float>@test_int_x86_avx512_mask_range_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x3, i8 %x4) {
	; CHECK-LABEL: test_int_x86_avx512_mask_range_ps_256:			; CHECK-LABEL: test_int_x86_avx512_mask_range_ps_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vrangeps $4, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x50,0xd1,0x04]			; CHECK-NEXT: vrangeps $4, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x50,0xd1,0x04]
	; CHECK-NEXT: vrangeps $88, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x50,0xc1,0x58]			; CHECK-NEXT: vrangeps $88, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x50,0xc1,0x58]
	; CHECK-NEXT: vaddps %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6c,0x28,0x58,0xc0]			; CHECK-NEXT: vaddps %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xec,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <8 x float> @llvm.x86.avx512.mask.range.ps.256(<8 x float> %x0, <8 x float> %x1, i32 4, <8 x float> %x3, i8 %x4)			%res = call <8 x float> @llvm.x86.avx512.mask.range.ps.256(<8 x float> %x0, <8 x float> %x1, i32 4, <8 x float> %x3, i8 %x4)
	%res1 = call <8 x float> @llvm.x86.avx512.mask.range.ps.256(<8 x float> %x0, <8 x float> %x1, i32 88, <8 x float> %x3, i8 -1)			%res1 = call <8 x float> @llvm.x86.avx512.mask.range.ps.256(<8 x float> %x0, <8 x float> %x1, i32 88, <8 x float> %x3, i8 -1)
	%res2 = fadd <8 x float> %res, %res1			%res2 = fadd <8 x float> %res, %res1
	ret <8 x float> %res2			ret <8 x float> %res2
	}			}

	declare <2 x double> @llvm.x86.avx512.mask.vextractf64x2.256(<4 x double>, i32, <2 x double>, i8)			declare <2 x double> @llvm.x86.avx512.mask.vextractf64x2.256(<4 x double>, i32, <2 x double>, i8)

	define <2 x double>@test_int_x86_avx512_mask_vextractf64x2_256(<4 x double> %x0, <2 x double> %x2, i8 %x3) {			define <2 x double>@test_int_x86_avx512_mask_vextractf64x2_256(<4 x double> %x0, <2 x double> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_vextractf64x2_256:			; CHECK-LABEL: test_int_x86_avx512_mask_vextractf64x2_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vextractf64x2 $1, %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x19,0xc1,0x01]			; CHECK-NEXT: vextractf64x2 $1, %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x19,0xc1,0x01]
	; CHECK-NEXT: vextractf64x2 $1, %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x19,0xc2,0x01]			; CHECK-NEXT: vextractf64x2 $1, %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x19,0xc2,0x01]
	; CHECK-NEXT: vextractf64x2 $1, %ymm0, %xmm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x19,0xc0,0x01]			; CHECK-NEXT: vextractf64x2 $1, %ymm0, %xmm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x19,0xc0,0x01]
	; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]			; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc0]
	; CHECK-NEXT: vaddpd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x58,0xc2]			; CHECK-NEXT: vaddpd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x58,0xc2]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <2 x double> @llvm.x86.avx512.mask.vextractf64x2.256(<4 x double> %x0,i32 1, <2 x double> %x2, i8 %x3)			%res = call <2 x double> @llvm.x86.avx512.mask.vextractf64x2.256(<4 x double> %x0,i32 1, <2 x double> %x2, i8 %x3)
	%res2 = call <2 x double> @llvm.x86.avx512.mask.vextractf64x2.256(<4 x double> %x0,i32 1, <2 x double> zeroinitializer, i8 %x3)			%res2 = call <2 x double> @llvm.x86.avx512.mask.vextractf64x2.256(<4 x double> %x0,i32 1, <2 x double> zeroinitializer, i8 %x3)
	%res1 = call <2 x double> @llvm.x86.avx512.mask.vextractf64x2.256(<4 x double> %x0,i32 1, <2 x double> zeroinitializer, i8 -1)			%res1 = call <2 x double> @llvm.x86.avx512.mask.vextractf64x2.256(<4 x double> %x0,i32 1, <2 x double> zeroinitializer, i8 -1)
	%res3 = fadd <2 x double> %res, %res1			%res3 = fadd <2 x double> %res, %res1
	%res4 = fadd <2 x double> %res3, %res2			%res4 = fadd <2 x double> %res3, %res2
	ret <2 x double> %res4			ret <2 x double> %res4
	}			}

	declare <4 x double> @llvm.x86.avx512.mask.insertf64x2.256(<4 x double>, <2 x double>, i32, <4 x double>, i8)			declare <4 x double> @llvm.x86.avx512.mask.insertf64x2.256(<4 x double>, <2 x double>, i32, <4 x double>, i8)

	define <4 x double>@test_int_x86_avx512_mask_insertf64x2_256(<4 x double> %x0, <2 x double> %x1, <4 x double> %x3, i8 %x4) {			define <4 x double>@test_int_x86_avx512_mask_insertf64x2_256(<4 x double> %x0, <2 x double> %x1, <4 x double> %x3, i8 %x4) {
	; CHECK-LABEL: test_int_x86_avx512_mask_insertf64x2_256:			; CHECK-LABEL: test_int_x86_avx512_mask_insertf64x2_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vinsertf64x2 $1, %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x18,0xd1,0x01]			; CHECK-NEXT: vinsertf64x2 $1, %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x18,0xd1,0x01]
	; CHECK-NEXT: vinsertf64x2 $1, %xmm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x18,0xd9,0x01]			; CHECK-NEXT: vinsertf64x2 $1, %xmm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x18,0xd9,0x01]
	; CHECK-NEXT: vinsertf64x2 $1, %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x18,0xc1,0x01]			; CHECK-NEXT: vinsertf64x2 $1, %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x18,0xc1,0x01]
	; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0x58,0xc0]			; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x58,0xc0]
	; CHECK-NEXT: vaddpd %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc0]			; CHECK-NEXT: vaddpd %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x double> @llvm.x86.avx512.mask.insertf64x2.256(<4 x double> %x0, <2 x double> %x1, i32 1, <4 x double> %x3, i8 %x4)			%res = call <4 x double> @llvm.x86.avx512.mask.insertf64x2.256(<4 x double> %x0, <2 x double> %x1, i32 1, <4 x double> %x3, i8 %x4)
	%res1 = call <4 x double> @llvm.x86.avx512.mask.insertf64x2.256(<4 x double> %x0, <2 x double> %x1, i32 1, <4 x double> %x3, i8 -1)			%res1 = call <4 x double> @llvm.x86.avx512.mask.insertf64x2.256(<4 x double> %x0, <2 x double> %x1, i32 1, <4 x double> %x3, i8 -1)
	%res2 = call <4 x double> @llvm.x86.avx512.mask.insertf64x2.256(<4 x double> %x0, <2 x double> %x1, i32 1, <4 x double> zeroinitializer, i8 %x4)			%res2 = call <4 x double> @llvm.x86.avx512.mask.insertf64x2.256(<4 x double> %x0, <2 x double> %x1, i32 1, <4 x double> zeroinitializer, i8 %x4)
	%res3 = fadd <4 x double> %res, %res1			%res3 = fadd <4 x double> %res, %res1
	%res4 = fadd <4 x double> %res2, %res3			%res4 = fadd <4 x double> %res2, %res3
	ret <4 x double> %res4			ret <4 x double> %res4
	}			}

	declare <4 x i64> @llvm.x86.avx512.mask.inserti64x2.256(<4 x i64>, <2 x i64>, i32, <4 x i64>, i8)			declare <4 x i64> @llvm.x86.avx512.mask.inserti64x2.256(<4 x i64>, <2 x i64>, i32, <4 x i64>, i8)

	define <4 x i64>@test_int_x86_avx512_mask_inserti64x2_256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x3, i8 %x4) {			define <4 x i64>@test_int_x86_avx512_mask_inserti64x2_256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x3, i8 %x4) {
	; CHECK-LABEL: test_int_x86_avx512_mask_inserti64x2_256:			; CHECK-LABEL: test_int_x86_avx512_mask_inserti64x2_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vinserti64x2 $1, %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x38,0xd1,0x01]			; CHECK-NEXT: vinserti64x2 $1, %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x38,0xd1,0x01]
	; CHECK-NEXT: vinserti64x2 $1, %xmm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x38,0xd9,0x01]			; CHECK-NEXT: vinserti64x2 $1, %xmm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x38,0xd9,0x01]
	; CHECK-NEXT: vinserti64x2 $1, %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x38,0xc1,0x01]			; CHECK-NEXT: vinserti64x2 $1, %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x38,0xc1,0x01]
	; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]			; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
	; CHECK-NEXT: vpaddq %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc3]			; CHECK-NEXT: vpaddq %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc3]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x i64> @llvm.x86.avx512.mask.inserti64x2.256(<4 x i64> %x0, <2 x i64> %x1, i32 1, <4 x i64> %x3, i8 %x4)			%res = call <4 x i64> @llvm.x86.avx512.mask.inserti64x2.256(<4 x i64> %x0, <2 x i64> %x1, i32 1, <4 x i64> %x3, i8 %x4)
	%res1 = call <4 x i64> @llvm.x86.avx512.mask.inserti64x2.256(<4 x i64> %x0, <2 x i64> %x1, i32 1, <4 x i64> %x3, i8 -1)			%res1 = call <4 x i64> @llvm.x86.avx512.mask.inserti64x2.256(<4 x i64> %x0, <2 x i64> %x1, i32 1, <4 x i64> %x3, i8 -1)
	%res2 = call <4 x i64> @llvm.x86.avx512.mask.inserti64x2.256(<4 x i64> %x0, <2 x i64> %x1, i32 1, <4 x i64> zeroinitializer, i8 %x4)			%res2 = call <4 x i64> @llvm.x86.avx512.mask.inserti64x2.256(<4 x i64> %x0, <2 x i64> %x1, i32 1, <4 x i64> zeroinitializer, i8 %x4)
	%res3 = add <4 x i64> %res, %res1			%res3 = add <4 x i64> %res, %res1
	%res4 = add <4 x i64> %res3, %res2			%res4 = add <4 x i64> %res3, %res2
	ret <4 x i64> %res4			ret <4 x i64> %res4
	}			}
	[77 lines not shown]
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vbroadcastf32x2 %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x19,0xc8]			; CHECK-NEXT: vbroadcastf32x2 %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x19,0xc8]
	; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0,1,0,1,0,1,0,1]			; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0,1,0,1,0,1,0,1]
	; CHECK-NEXT: vbroadcastf32x2 %xmm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x19,0xd0]			; CHECK-NEXT: vbroadcastf32x2 %xmm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x19,0xd0]
	; CHECK-NEXT: ## ymm2 {%k1} {z} = xmm0[0,1,0,1,0,1,0,1]			; CHECK-NEXT: ## ymm2 {%k1} {z} = xmm0[0,1,0,1,0,1,0,1]
	; CHECK-NEXT: vbroadcastf32x2 %xmm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x19,0xc0]			; CHECK-NEXT: vbroadcastf32x2 %xmm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x19,0xc0]
	; CHECK-NEXT: ## ymm0 = xmm0[0,1,0,1,0,1,0,1]			; CHECK-NEXT: ## ymm0 = xmm0[0,1,0,1,0,1,0,1]
	; CHECK-NEXT: vaddps %ymm2, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xca]			; CHECK-NEXT: vaddps %ymm2, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xca]
	; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xc0]			; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <8 x float> @llvm.x86.avx512.mask.broadcastf32x2.256(<4 x float> %x0, <8 x float> %x2, i8 %x3)			%res = call <8 x float> @llvm.x86.avx512.mask.broadcastf32x2.256(<4 x float> %x0, <8 x float> %x2, i8 %x3)
	%res1 = call <8 x float> @llvm.x86.avx512.mask.broadcastf32x2.256(<4 x float> %x0, <8 x float> zeroinitializer, i8 %x3)			%res1 = call <8 x float> @llvm.x86.avx512.mask.broadcastf32x2.256(<4 x float> %x0, <8 x float> zeroinitializer, i8 %x3)
	%res2 = call <8 x float> @llvm.x86.avx512.mask.broadcastf32x2.256(<4 x float> %x0, <8 x float> %x2, i8 -1)			%res2 = call <8 x float> @llvm.x86.avx512.mask.broadcastf32x2.256(<4 x float> %x0, <8 x float> %x2, i8 -1)
	%res3 = fadd <8 x float> %res, %res1			%res3 = fadd <8 x float> %res, %res1
	%res4 = fadd <8 x float> %res3, %res2			%res4 = fadd <8 x float> %res3, %res2
	ret <8 x float> %res4			ret <8 x float> %res4
	}			}

	declare <8 x i32> @llvm.x86.avx512.mask.broadcasti32x2.256(<4 x i32>, <8 x i32>, i8)			declare <8 x i32> @llvm.x86.avx512.mask.broadcasti32x2.256(<4 x i32>, <8 x i32>, i8)

	define <8 x i32>@test_int_x86_avx512_mask_broadcasti32x2_256(<4 x i32> %x0, <8 x i32> %x2, i8 %x3, i64 * %y_ptr) {			define <8 x i32>@test_int_x86_avx512_mask_broadcasti32x2_256(<4 x i32> %x0, <8 x i32> %x2, i8 %x3, i64 * %y_ptr) {
	; CHECK-LABEL: test_int_x86_avx512_mask_broadcasti32x2_256:			; CHECK-LABEL: test_int_x86_avx512_mask_broadcasti32x2_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vbroadcasti32x2 (%rsi), %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x59,0x0e]			; CHECK-NEXT: vbroadcasti32x2 (%rsi), %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x59,0x0e]
	; CHECK-NEXT: ## ymm1 {%k1} = mem[0,1,0,1,0,1,0,1]			; CHECK-NEXT: ## ymm1 {%k1} = mem[0,1,0,1,0,1,0,1]
	; CHECK-NEXT: vbroadcasti32x2 %xmm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x59,0xd0]			; CHECK-NEXT: vbroadcasti32x2 %xmm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x59,0xd0]
	; CHECK-NEXT: ## ymm2 {%k1} {z} = xmm0[0,1,0,1,0,1,0,1]			; CHECK-NEXT: ## ymm2 {%k1} {z} = xmm0[0,1,0,1,0,1,0,1]
	; CHECK-NEXT: vbroadcasti32x2 %xmm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x59,0xc0]			; CHECK-NEXT: vbroadcasti32x2 %xmm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x59,0xc0]
	; CHECK-NEXT: ## ymm0 = xmm0[0,1,0,1,0,1,0,1]			; CHECK-NEXT: ## ymm0 = xmm0[0,1,0,1,0,1,0,1]
	; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc0]			; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc0]
	; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]			; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%y_64 = load i64, i64 * %y_ptr			%y_64 = load i64, i64 * %y_ptr
	%y_v2i64 = insertelement <2 x i64> undef, i64 %y_64, i32 0			%y_v2i64 = insertelement <2 x i64> undef, i64 %y_64, i32 0
	%y = bitcast <2 x i64> %y_v2i64 to <4 x i32>			%y = bitcast <2 x i64> %y_v2i64 to <4 x i32>
	%res = call <8 x i32> @llvm.x86.avx512.mask.broadcasti32x2.256(<4 x i32> %y, <8 x i32> %x2, i8 %x3)			%res = call <8 x i32> @llvm.x86.avx512.mask.broadcasti32x2.256(<4 x i32> %y, <8 x i32> %x2, i8 %x3)
	%res1 = call <8 x i32> @llvm.x86.avx512.mask.broadcasti32x2.256(<4 x i32> %x0, <8 x i32> zeroinitializer, i8 %x3)			%res1 = call <8 x i32> @llvm.x86.avx512.mask.broadcasti32x2.256(<4 x i32> %x0, <8 x i32> zeroinitializer, i8 %x3)
	%res2 = call <8 x i32> @llvm.x86.avx512.mask.broadcasti32x2.256(<4 x i32> %x0, <8 x i32> %x2, i8 -1)			%res2 = call <8 x i32> @llvm.x86.avx512.mask.broadcasti32x2.256(<4 x i32> %x0, <8 x i32> %x2, i8 -1)
	%res3 = add <8 x i32> %res, %res1			%res3 = add <8 x i32> %res, %res1
	%res4 = add <8 x i32> %res3, %res2			%res4 = add <8 x i32> %res3, %res2
	ret <8 x i32> %res4			ret <8 x i32> %res4
	}			}

	declare <4 x i32> @llvm.x86.avx512.mask.broadcasti32x2.128(<4 x i32>, <4 x i32>, i8)			declare <4 x i32> @llvm.x86.avx512.mask.broadcasti32x2.128(<4 x i32>, <4 x i32>, i8)

	define <4 x i32>@test_int_x86_avx512_mask_broadcasti32x2_128(<4 x i32> %x0, <4 x i32> %x2, i8 %x3) {			define <4 x i32>@test_int_x86_avx512_mask_broadcasti32x2_128(<4 x i32> %x0, <4 x i32> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_broadcasti32x2_128:			; CHECK-LABEL: test_int_x86_avx512_mask_broadcasti32x2_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vbroadcasti32x2 %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x59,0xc8]			; CHECK-NEXT: vbroadcasti32x2 %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x59,0xc8]
	; CHECK-NEXT: vbroadcasti32x2 %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x59,0xd0]			; CHECK-NEXT: vbroadcasti32x2 %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x59,0xd0]
	; CHECK-NEXT: vbroadcasti32x2 %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x59,0xc0]			; CHECK-NEXT: vbroadcasti32x2 %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x59,0xc0]
	; CHECK-NEXT: vpaddd %xmm2, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xca]			; CHECK-NEXT: vpaddd %xmm2, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xca]
	; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]			; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.avx512.mask.broadcasti32x2.128(<4 x i32> %x0, <4 x i32> %x2, i8 %x3)			%res = call <4 x i32> @llvm.x86.avx512.mask.broadcasti32x2.128(<4 x i32> %x0, <4 x i32> %x2, i8 %x3)
	%res1 = call <4 x i32> @llvm.x86.avx512.mask.broadcasti32x2.128(<4 x i32> %x0, <4 x i32> zeroinitializer, i8 %x3)			%res1 = call <4 x i32> @llvm.x86.avx512.mask.broadcasti32x2.128(<4 x i32> %x0, <4 x i32> zeroinitializer, i8 %x3)
	%res2 = call <4 x i32> @llvm.x86.avx512.mask.broadcasti32x2.128(<4 x i32> %x0, <4 x i32> %x2, i8 -1)			%res2 = call <4 x i32> @llvm.x86.avx512.mask.broadcasti32x2.128(<4 x i32> %x0, <4 x i32> %x2, i8 -1)
	%res3 = add <4 x i32> %res, %res1			%res3 = add <4 x i32> %res, %res1
	%res4 = add <4 x i32> %res3, %res2			%res4 = add <4 x i32> %res3, %res2
	ret <4 x i32> %res4			ret <4 x i32> %res4
	}			}
	[101 lines not shown]
	; CHECK-NEXT: ## kill: %XMM0<def> %XMM0<kill> %YMM0<def>			; CHECK-NEXT: ## kill: %XMM0<def> %XMM0<kill> %YMM0<def>
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vshuff64x2 $0, %ymm0, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x23,0xd0,0x00]			; CHECK-NEXT: vshuff64x2 $0, %ymm0, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x23,0xd0,0x00]
	; CHECK-NEXT: ## ymm2 {%k1} {z} = ymm0[0,1,0,1]			; CHECK-NEXT: ## ymm2 {%k1} {z} = ymm0[0,1,0,1]
	; CHECK-NEXT: vshuff64x2 $0, %ymm0, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x23,0xc8,0x00]			; CHECK-NEXT: vshuff64x2 $0, %ymm0, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x23,0xc8,0x00]
	; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,1,0,1]			; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,1,0,1]
	; CHECK-NEXT: vshuff64x2 $0, %ymm0, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x23,0xc0,0x00]			; CHECK-NEXT: vshuff64x2 $0, %ymm0, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x23,0xc0,0x00]
	; CHECK-NEXT: ## ymm0 = ymm0[0,1,0,1]			; CHECK-NEXT: ## ymm0 = ymm0[0,1,0,1]
	; CHECK-NEXT: vaddpd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x58,0xc1]			; CHECK-NEXT: vaddpd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x58,0xc1]
	; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0x58,0xc0]			; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x58,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]

	%res1 = call <4 x double> @llvm.x86.avx512.mask.broadcastf64x2.256(<2 x double> %x0, <4 x double> %x2, i8 -1)			%res1 = call <4 x double> @llvm.x86.avx512.mask.broadcastf64x2.256(<2 x double> %x0, <4 x double> %x2, i8 -1)
	%res2 = call <4 x double> @llvm.x86.avx512.mask.broadcastf64x2.256(<2 x double> %x0, <4 x double> %x2, i8 %mask)			%res2 = call <4 x double> @llvm.x86.avx512.mask.broadcastf64x2.256(<2 x double> %x0, <4 x double> %x2, i8 %mask)
	%res3 = call <4 x double> @llvm.x86.avx512.mask.broadcastf64x2.256(<2 x double> %x0, <4 x double> zeroinitializer, i8 %mask)			%res3 = call <4 x double> @llvm.x86.avx512.mask.broadcastf64x2.256(<2 x double> %x0, <4 x double> zeroinitializer, i8 %mask)
	%res4 = fadd <4 x double> %res1, %res2			%res4 = fadd <4 x double> %res1, %res2
	%res5 = fadd <4 x double> %res3, %res4			%res5 = fadd <4 x double> %res3, %res4
	ret <4 x double> %res5			ret <4 x double> %res5
	}			}

	declare <4 x i64> @llvm.x86.avx512.mask.broadcasti64x2.256(<2 x i64>, <4 x i64>, i8)			declare <4 x i64> @llvm.x86.avx512.mask.broadcasti64x2.256(<2 x i64>, <4 x i64>, i8)

	define <4 x i64>@test_int_x86_avx512_mask_broadcasti64x2_256(<2 x i64> %x0, <4 x i64> %x2, i8 %mask) {			define <4 x i64>@test_int_x86_avx512_mask_broadcasti64x2_256(<2 x i64> %x0, <4 x i64> %x2, i8 %mask) {
	; CHECK-LABEL: test_int_x86_avx512_mask_broadcasti64x2_256:			; CHECK-LABEL: test_int_x86_avx512_mask_broadcasti64x2_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: ## kill: %XMM0<def> %XMM0<kill> %YMM0<def>			; CHECK-NEXT: ## kill: %XMM0<def> %XMM0<kill> %YMM0<def>
	; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]			; CHECK-NEXT: kmovb %edi, %k1 ## encoding: [0xc5,0xf9,0x92,0xcf]
	; CHECK-NEXT: vshufi64x2 $0, %ymm0, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x43,0xd0,0x00]			; CHECK-NEXT: vshufi64x2 $0, %ymm0, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x43,0xd0,0x00]
	; CHECK-NEXT: ## ymm2 {%k1} {z} = ymm0[0,1,0,1]			; CHECK-NEXT: ## ymm2 {%k1} {z} = ymm0[0,1,0,1]
	; CHECK-NEXT: vshufi64x2 $0, %ymm0, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x43,0xc8,0x00]			; CHECK-NEXT: vshufi64x2 $0, %ymm0, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x43,0xc8,0x00]
	; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,1,0,1]			; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,1,0,1]
	; CHECK-NEXT: vshufi64x2 $0, %ymm0, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x43,0xc0,0x00]			; CHECK-NEXT: vshufi64x2 $0, %ymm0, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x43,0xc0,0x00]
	; CHECK-NEXT: ## ymm0 = ymm0[0,1,0,1]			; CHECK-NEXT: ## ymm0 = ymm0[0,1,0,1]
	; CHECK-NEXT: vpaddq %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc1]			; CHECK-NEXT: vpaddq %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc1]
	; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]			; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]

	%res1 = call <4 x i64> @llvm.x86.avx512.mask.broadcasti64x2.256(<2 x i64> %x0, <4 x i64> %x2, i8 -1)			%res1 = call <4 x i64> @llvm.x86.avx512.mask.broadcasti64x2.256(<2 x i64> %x0, <4 x i64> %x2, i8 -1)
	%res2 = call <4 x i64> @llvm.x86.avx512.mask.broadcasti64x2.256(<2 x i64> %x0, <4 x i64> %x2, i8 %mask)			%res2 = call <4 x i64> @llvm.x86.avx512.mask.broadcasti64x2.256(<2 x i64> %x0, <4 x i64> %x2, i8 %mask)
	%res3 = call <4 x i64> @llvm.x86.avx512.mask.broadcasti64x2.256(<2 x i64> %x0, <4 x i64> zeroinitializer, i8 %mask)			%res3 = call <4 x i64> @llvm.x86.avx512.mask.broadcasti64x2.256(<2 x i64> %x0, <4 x i64> zeroinitializer, i8 %mask)
	%res4 = add <4 x i64> %res1, %res2			%res4 = add <4 x i64> %res1, %res2
	%res5 = add <4 x i64> %res3, %res4			%res5 = add <4 x i64> %res3, %res4
	ret <4 x i64> %res5			ret <4 x i64> %res5
	}			}
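
The hunks above show what the compression does to individual instructions: for example, the `vaddpd %xmm0, %xmm1, %xmm0` check line goes from a six-byte EVEX encoding to a four-byte VEX one. As a rough illustration only (it is not part of the patch), the Python sketch below counts the bytes in the `encoding: [...]` comments that `llc --show-mc-encoding` prints. The helper names `encoding_bytes` and `bytes_saved` are hypothetical and exist only for this example; the two sample lines are copied from the diff above.

```python
import re

# Matches the byte list in an llc --show-mc-encoding comment, e.g.
# "## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]".
ENCODING_RE = re.compile(r"encoding: \[([^\]]*)\]")


def encoding_bytes(check_line):
    """Return the encoded bytes listed on one CHECK line as a list of ints."""
    match = ENCODING_RE.search(check_line)
    if match is None:
        raise ValueError("no encoding comment found")
    return [int(tok, 16) for tok in match.group(1).split(",")]


def bytes_saved(evex_line, vex_line):
    """Difference in encoded length between the old and new CHECK lines."""
    return len(encoding_bytes(evex_line)) - len(encoding_bytes(vex_line))


# Left-hand (before) and right-hand (after) sides of one diff line above.
before = ("; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 "
          "## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]")
after = ("; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 "
         "## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc0]")

print(bytes_saved(before, after))  # prints 2
```

The same two-byte saving applies to the other compressed `vaddpd`, `vaddps`, `vpaddd` and `vpaddq` lines in these hunks, while the masked instructions (those using `{%k1}`) keep their EVEX encoding and are unchanged.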

llvm/trunk/test/CodeGen/X86/avx512ifmavl-intrinsics.ll

	; NOTE: Assertions have been autogenerated by update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by update_llc_test_checks.py
	; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512vl -mattr=+avx512ifma | FileCheck %s			; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512vl -mattr=+avx512ifma | FileCheck %s

	declare <2 x i64> @llvm.x86.avx512.mask.vpmadd52h.uq.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)			declare <2 x i64> @llvm.x86.avx512.mask.vpmadd52h.uq.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)

	define <2 x i64>@test_int_x86_avx512_mask_vpmadd52h_uq_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {			define <2 x i64>@test_int_x86_avx512_mask_vpmadd52h_uq_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_vpmadd52h_uq_128:			; CHECK-LABEL: test_int_x86_avx512_mask_vpmadd52h_uq_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovw %edi, %k1			; CHECK-NEXT: kmovw %edi, %k1
	; CHECK-NEXT: vmovaps %xmm0, %xmm3			; CHECK-NEXT: vmovaps %xmm0, %xmm3
	; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm3 {%k1}			; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm3 {%k1}
	; CHECK-NEXT: vmovaps %xmm0, %xmm4			; CHECK-NEXT: vmovaps %xmm0, %xmm4
	; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm4			; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm4
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm0 {%k1}			; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm0 {%k1}
	; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm2 {%k1} {z}			; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm2 {%k1} {z}
	; CHECK-NEXT: vpaddq %xmm0, %xmm3, %xmm0			; CHECK-NEXT: vpaddq %xmm0, %xmm3, %xmm0
	; CHECK-NEXT: vpaddq %xmm2, %xmm4, %xmm1			; CHECK-NEXT: vpaddq %xmm2, %xmm4, %xmm1
	; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0			; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq

	%res = call <2 x i64> @llvm.x86.avx512.mask.vpmadd52h.uq.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)			%res = call <2 x i64> @llvm.x86.avx512.mask.vpmadd52h.uq.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
	[11 lines not shown]
	define <4 x i64>@test_int_x86_avx512_mask_vpmadd52h_uq_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {			define <4 x i64>@test_int_x86_avx512_mask_vpmadd52h_uq_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_vpmadd52h_uq_256:			; CHECK-LABEL: test_int_x86_avx512_mask_vpmadd52h_uq_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovw %edi, %k1			; CHECK-NEXT: kmovw %edi, %k1
	; CHECK-NEXT: vmovaps %ymm0, %ymm3			; CHECK-NEXT: vmovaps %ymm0, %ymm3
	; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm3 {%k1}			; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm3 {%k1}
	; CHECK-NEXT: vmovaps %ymm0, %ymm4			; CHECK-NEXT: vmovaps %ymm0, %ymm4
	; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm4			; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm4
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2
	; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm0 {%k1}			; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm0 {%k1}
	; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm2 {%k1} {z}			; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm2 {%k1} {z}
	; CHECK-NEXT: vpaddq %ymm0, %ymm3, %ymm0			; CHECK-NEXT: vpaddq %ymm0, %ymm3, %ymm0
	; CHECK-NEXT: vpaddq %ymm2, %ymm4, %ymm1			; CHECK-NEXT: vpaddq %ymm2, %ymm4, %ymm1
	; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0			; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq

	%res = call <4 x i64> @llvm.x86.avx512.mask.vpmadd52h.uq.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)			%res = call <4 x i64> @llvm.x86.avx512.mask.vpmadd52h.uq.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)
	[11 lines not shown]
	define <2 x i64>@test_int_x86_avx512_maskz_vpmadd52h_uq_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {			define <2 x i64>@test_int_x86_avx512_maskz_vpmadd52h_uq_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_maskz_vpmadd52h_uq_128:			; CHECK-LABEL: test_int_x86_avx512_maskz_vpmadd52h_uq_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovw %edi, %k1			; CHECK-NEXT: kmovw %edi, %k1
	; CHECK-NEXT: vmovaps %xmm0, %xmm3			; CHECK-NEXT: vmovaps %xmm0, %xmm3
	; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm3 {%k1} {z}			; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm3 {%k1} {z}
	; CHECK-NEXT: vmovaps %xmm0, %xmm4			; CHECK-NEXT: vmovaps %xmm0, %xmm4
	; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm4			; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm4
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm0 {%k1} {z}			; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm0 {%k1} {z}
	; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm2 {%k1} {z}			; CHECK-NEXT: vpmadd52huq %xmm2, %xmm1, %xmm2 {%k1} {z}
	; CHECK-NEXT: vpaddq %xmm0, %xmm3, %xmm0			; CHECK-NEXT: vpaddq %xmm0, %xmm3, %xmm0
	; CHECK-NEXT: vpaddq %xmm2, %xmm4, %xmm1			; CHECK-NEXT: vpaddq %xmm2, %xmm4, %xmm1
	; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0			; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq

	%res = call <2 x i64> @llvm.x86.avx512.maskz.vpmadd52h.uq.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)			%res = call <2 x i64> @llvm.x86.avx512.maskz.vpmadd52h.uq.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
	[11 lines not shown]
	define <4 x i64>@test_int_x86_avx512_maskz_vpmadd52h_uq_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {			define <4 x i64>@test_int_x86_avx512_maskz_vpmadd52h_uq_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_maskz_vpmadd52h_uq_256:			; CHECK-LABEL: test_int_x86_avx512_maskz_vpmadd52h_uq_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovw %edi, %k1			; CHECK-NEXT: kmovw %edi, %k1
	; CHECK-NEXT: vmovaps %ymm0, %ymm3			; CHECK-NEXT: vmovaps %ymm0, %ymm3
	; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm3 {%k1} {z}			; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm3 {%k1} {z}
	; CHECK-NEXT: vmovaps %ymm0, %ymm4			; CHECK-NEXT: vmovaps %ymm0, %ymm4
	; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm4			; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm4
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2
	; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm0 {%k1} {z}			; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm0 {%k1} {z}
	; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm2 {%k1} {z}			; CHECK-NEXT: vpmadd52huq %ymm2, %ymm1, %ymm2 {%k1} {z}
	; CHECK-NEXT: vpaddq %ymm0, %ymm3, %ymm0			; CHECK-NEXT: vpaddq %ymm0, %ymm3, %ymm0
	; CHECK-NEXT: vpaddq %ymm2, %ymm4, %ymm1			; CHECK-NEXT: vpaddq %ymm2, %ymm4, %ymm1
	; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0			; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq

	%res = call <4 x i64> @llvm.x86.avx512.maskz.vpmadd52h.uq.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)			%res = call <4 x i64> @llvm.x86.avx512.maskz.vpmadd52h.uq.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)
	[11 lines not shown]
	define <2 x i64>@test_int_x86_avx512_mask_vpmadd52l_uq_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {			define <2 x i64>@test_int_x86_avx512_mask_vpmadd52l_uq_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_vpmadd52l_uq_128:			; CHECK-LABEL: test_int_x86_avx512_mask_vpmadd52l_uq_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovw %edi, %k1			; CHECK-NEXT: kmovw %edi, %k1
	; CHECK-NEXT: vmovaps %xmm0, %xmm3			; CHECK-NEXT: vmovaps %xmm0, %xmm3
	; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm3 {%k1}			; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm3 {%k1}
	; CHECK-NEXT: vmovaps %xmm0, %xmm4			; CHECK-NEXT: vmovaps %xmm0, %xmm4
	; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm4			; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm4
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm0 {%k1}			; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm0 {%k1}
	; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm2 {%k1} {z}			; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm2 {%k1} {z}
	; CHECK-NEXT: vpaddq %xmm0, %xmm3, %xmm0			; CHECK-NEXT: vpaddq %xmm0, %xmm3, %xmm0
	; CHECK-NEXT: vpaddq %xmm2, %xmm4, %xmm1			; CHECK-NEXT: vpaddq %xmm2, %xmm4, %xmm1
	; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0			; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq

	%res = call <2 x i64> @llvm.x86.avx512.mask.vpmadd52l.uq.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)			%res = call <2 x i64> @llvm.x86.avx512.mask.vpmadd52l.uq.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
	[11 lines not shown]
	define <4 x i64>@test_int_x86_avx512_mask_vpmadd52l_uq_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {			define <4 x i64>@test_int_x86_avx512_mask_vpmadd52l_uq_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_vpmadd52l_uq_256:			; CHECK-LABEL: test_int_x86_avx512_mask_vpmadd52l_uq_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovw %edi, %k1			; CHECK-NEXT: kmovw %edi, %k1
	; CHECK-NEXT: vmovaps %ymm0, %ymm3			; CHECK-NEXT: vmovaps %ymm0, %ymm3
	; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm3 {%k1}			; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm3 {%k1}
	; CHECK-NEXT: vmovaps %ymm0, %ymm4			; CHECK-NEXT: vmovaps %ymm0, %ymm4
	; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm4			; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm4
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2
	; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm0 {%k1}			; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm0 {%k1}
	; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm2 {%k1} {z}			; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm2 {%k1} {z}
	; CHECK-NEXT: vpaddq %ymm0, %ymm3, %ymm0			; CHECK-NEXT: vpaddq %ymm0, %ymm3, %ymm0
	; CHECK-NEXT: vpaddq %ymm2, %ymm4, %ymm1			; CHECK-NEXT: vpaddq %ymm2, %ymm4, %ymm1
	; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0			; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq

	%res = call <4 x i64> @llvm.x86.avx512.mask.vpmadd52l.uq.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)			%res = call <4 x i64> @llvm.x86.avx512.mask.vpmadd52l.uq.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)
	[11 lines not shown]
	define <2 x i64>@test_int_x86_avx512_maskz_vpmadd52l_uq_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {			define <2 x i64>@test_int_x86_avx512_maskz_vpmadd52l_uq_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_maskz_vpmadd52l_uq_128:			; CHECK-LABEL: test_int_x86_avx512_maskz_vpmadd52l_uq_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovw %edi, %k1			; CHECK-NEXT: kmovw %edi, %k1
	; CHECK-NEXT: vmovaps %xmm0, %xmm3			; CHECK-NEXT: vmovaps %xmm0, %xmm3
	; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm3 {%k1} {z}			; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm3 {%k1} {z}
	; CHECK-NEXT: vmovaps %xmm0, %xmm4			; CHECK-NEXT: vmovaps %xmm0, %xmm4
	; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm4			; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm4
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm0 {%k1} {z}			; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm0 {%k1} {z}
	; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm2 {%k1} {z}			; CHECK-NEXT: vpmadd52luq %xmm2, %xmm1, %xmm2 {%k1} {z}
	; CHECK-NEXT: vpaddq %xmm0, %xmm3, %xmm0			; CHECK-NEXT: vpaddq %xmm0, %xmm3, %xmm0
	; CHECK-NEXT: vpaddq %xmm2, %xmm4, %xmm1			; CHECK-NEXT: vpaddq %xmm2, %xmm4, %xmm1
	; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0			; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq

	%res = call <2 x i64> @llvm.x86.avx512.maskz.vpmadd52l.uq.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)			%res = call <2 x i64> @llvm.x86.avx512.maskz.vpmadd52l.uq.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
	[11 lines not shown]
	define <4 x i64>@test_int_x86_avx512_maskz_vpmadd52l_uq_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {			define <4 x i64>@test_int_x86_avx512_maskz_vpmadd52l_uq_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_maskz_vpmadd52l_uq_256:			; CHECK-LABEL: test_int_x86_avx512_maskz_vpmadd52l_uq_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovw %edi, %k1			; CHECK-NEXT: kmovw %edi, %k1
	; CHECK-NEXT: vmovaps %ymm0, %ymm3			; CHECK-NEXT: vmovaps %ymm0, %ymm3
	; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm3 {%k1} {z}			; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm3 {%k1} {z}
	; CHECK-NEXT: vmovaps %ymm0, %ymm4			; CHECK-NEXT: vmovaps %ymm0, %ymm4
	; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm4			; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm4
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2
	; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm0 {%k1} {z}			; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm0 {%k1} {z}
	; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm2 {%k1} {z}			; CHECK-NEXT: vpmadd52luq %ymm2, %ymm1, %ymm2 {%k1} {z}
	; CHECK-NEXT: vpaddq %ymm0, %ymm3, %ymm0			; CHECK-NEXT: vpaddq %ymm0, %ymm3, %ymm0
	; CHECK-NEXT: vpaddq %ymm2, %ymm4, %ymm1			; CHECK-NEXT: vpaddq %ymm2, %ymm4, %ymm1
	; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0			; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq

	%res = call <4 x i64> @llvm.x86.avx512.maskz.vpmadd52l.uq.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)			%res = call <4 x i64> @llvm.x86.avx512.maskz.vpmadd52l.uq.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)
	%res1 = call <4 x i64> @llvm.x86.avx512.maskz.vpmadd52l.uq.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)			%res1 = call <4 x i64> @llvm.x86.avx512.maskz.vpmadd52l.uq.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)
	%res2 = call <4 x i64> @llvm.x86.avx512.maskz.vpmadd52l.uq.256(<4 x i64> zeroinitializer, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)			%res2 = call <4 x i64> @llvm.x86.avx512.maskz.vpmadd52l.uq.256(<4 x i64> zeroinitializer, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)
	%res3 = call <4 x i64> @llvm.x86.avx512.maskz.vpmadd52l.uq.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)			%res3 = call <4 x i64> @llvm.x86.avx512.maskz.vpmadd52l.uq.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)
	%res4 = add <4 x i64> %res, %res1			%res4 = add <4 x i64> %res, %res1
	%res5 = add <4 x i64> %res3, %res2			%res5 = add <4 x i64> %res3, %res2
	%res6 = add <4 x i64> %res5, %res4			%res6 = add <4 x i64> %res5, %res4
	ret <4 x i64> %res6			ret <4 x i64> %res6
	}			}

llvm/trunk/test/CodeGen/X86/avx512vbmivl-intrinsics.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; NOTE: Assertions have been autogenerated by update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by update_llc_test_checks.py
	; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=+avx512vl -mattr=+avx512vbmi --show-mc-encoding| FileCheck %s			; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=+avx512vl -mattr=+avx512vbmi --show-mc-encoding| FileCheck %s
	declare <16 x i8> @llvm.x86.avx512.mask.permvar.qi.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)			declare <16 x i8> @llvm.x86.avx512.mask.permvar.qi.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

	define <16 x i8>@test_int_x86_avx512_mask_permvar_qi_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {			define <16 x i8>@test_int_x86_avx512_mask_permvar_qi_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_permvar_qi_128:			; CHECK-LABEL: test_int_x86_avx512_mask_permvar_qi_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]			; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
	; CHECK-NEXT: vpermb %xmm0, %xmm1, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x75,0x09,0x8d,0xd0]			; CHECK-NEXT: vpermb %xmm0, %xmm1, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x75,0x09,0x8d,0xd0]
	; CHECK-NEXT: vpermb %xmm0, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0x75,0x89,0x8d,0xd8]			; CHECK-NEXT: vpermb %xmm0, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0x75,0x89,0x8d,0xd8]
	; CHECK-NEXT: vpermb %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf2,0x75,0x08,0x8d,0xc0]			; CHECK-NEXT: vpermb %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf2,0x75,0x08,0x8d,0xc0]
	; CHECK-NEXT: vpaddb %xmm0, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x65,0x08,0xfc,0xc0]			; CHECK-NEXT: vpaddb %xmm0, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xfc,0xc0]
	; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc0]			; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.avx512.mask.permvar.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)			%res = call <16 x i8> @llvm.x86.avx512.mask.permvar.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)
	%res1 = call <16 x i8> @llvm.x86.avx512.mask.permvar.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> zeroinitializer, i16 %x3)			%res1 = call <16 x i8> @llvm.x86.avx512.mask.permvar.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> zeroinitializer, i16 %x3)
	%res2 = call <16 x i8> @llvm.x86.avx512.mask.permvar.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)			%res2 = call <16 x i8> @llvm.x86.avx512.mask.permvar.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)
	%res3 = add <16 x i8> %res, %res1			%res3 = add <16 x i8> %res, %res1
	%res4 = add <16 x i8> %res3, %res2			%res4 = add <16 x i8> %res3, %res2
	ret <16 x i8> %res4			ret <16 x i8> %res4
	}			}

	declare <32 x i8> @llvm.x86.avx512.mask.permvar.qi.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)			declare <32 x i8> @llvm.x86.avx512.mask.permvar.qi.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

	define <32 x i8>@test_int_x86_avx512_mask_permvar_qi_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {			define <32 x i8>@test_int_x86_avx512_mask_permvar_qi_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_permvar_qi_256:			; CHECK-LABEL: test_int_x86_avx512_mask_permvar_qi_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]			; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
	; CHECK-NEXT: vpermb %ymm0, %ymm1, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x75,0x29,0x8d,0xd0]			; CHECK-NEXT: vpermb %ymm0, %ymm1, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x75,0x29,0x8d,0xd0]
	; CHECK-NEXT: vpermb %ymm0, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x75,0xa9,0x8d,0xd8]			; CHECK-NEXT: vpermb %ymm0, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x75,0xa9,0x8d,0xd8]
	; CHECK-NEXT: vpermb %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0x75,0x28,0x8d,0xc0]			; CHECK-NEXT: vpermb %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0x75,0x28,0x8d,0xc0]
	; CHECK-NEXT: vpaddb %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x65,0x28,0xfc,0xc0]			; CHECK-NEXT: vpaddb %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xfc,0xc0]
	; CHECK-NEXT: vpaddb %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc0]			; CHECK-NEXT: vpaddb %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx512.mask.permvar.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)			%res = call <32 x i8> @llvm.x86.avx512.mask.permvar.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)
	%res1 = call <32 x i8> @llvm.x86.avx512.mask.permvar.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> zeroinitializer, i32 %x3)			%res1 = call <32 x i8> @llvm.x86.avx512.mask.permvar.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> zeroinitializer, i32 %x3)
	%res2 = call <32 x i8> @llvm.x86.avx512.mask.permvar.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)			%res2 = call <32 x i8> @llvm.x86.avx512.mask.permvar.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)
	%res3 = add <32 x i8> %res, %res1			%res3 = add <32 x i8> %res, %res1
	%res4 = add <32 x i8> %res3, %res2			%res4 = add <32 x i8> %res3, %res2
	ret <32 x i8> %res4			ret <32 x i8> %res4
	}			}
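
Reading aid, not part of the patch: the compressed `vpaddb` encodings in the right-hand column above all begin with 0xc5, the two-byte VEX prefix. The Python sketch below decodes the byte that follows 0xc5, assuming the standard two-byte VEX layout (inverted R, inverted vvvv, L, pp). `decode_vex2` is a hypothetical helper name chosen for illustration.

```python
def decode_vex2(payload: int) -> dict:
    """Decode the byte that follows 0xC5 in a two-byte VEX prefix.

    Assumed bit layout (most significant bit first): ~R, ~vvvv (4 bits), L, pp (2 bits).
    """
    return {
        # R is stored inverted; 0 here means no extension of ModRM.reg.
        "R": ((payload >> 7) & 1) ^ 1,
        # vvvv is stored inverted; it names the extra source register.
        "vvvv": (~(payload >> 3)) & 0xF,
        # L = 0 selects 128-bit (xmm), L = 1 selects 256-bit (ymm).
        "L": (payload >> 2) & 1,
        # pp = 1 stands for the 0x66 mandatory prefix.
        "pp": payload & 0b11,
    }


if __name__ == "__main__":
    # 0xe1 comes from "vpaddb %xmm0, %xmm3, %xmm0", 0xe5 from the %ymm form.
    print(decode_vex2(0xE1))
    print(decode_vex2(0xE5))
```

Under these assumptions 0xe1 decodes to vvvv = 3, L = 0 and 0xe5 to vvvv = 3, L = 1, so the 128-bit and 256-bit forms differ only in the L bit.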

	declare <16 x i8> @llvm.x86.avx512.mask.pmultishift.qb.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)			declare <16 x i8> @llvm.x86.avx512.mask.pmultishift.qb.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

	define <16 x i8>@test_int_x86_avx512_mask_pmultishift_qb_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {			define <16 x i8>@test_int_x86_avx512_mask_pmultishift_qb_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_pmultishift_qb_128:			; CHECK-LABEL: test_int_x86_avx512_mask_pmultishift_qb_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]			; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
	; CHECK-NEXT: vpmultishiftqb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x83,0xd1]			; CHECK-NEXT: vpmultishiftqb %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x83,0xd1]
	; CHECK-NEXT: vpmultishiftqb %xmm1, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x83,0xd9]			; CHECK-NEXT: vpmultishiftqb %xmm1, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x83,0xd9]
	; CHECK-NEXT: vpmultishiftqb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x83,0xc1]			; CHECK-NEXT: vpmultishiftqb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x83,0xc1]
	; CHECK-NEXT: vpaddb %xmm0, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x65,0x08,0xfc,0xc0]			; CHECK-NEXT: vpaddb %xmm0, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xfc,0xc0]
	; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfc,0xc0]			; CHECK-NEXT: vpaddb %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfc,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.avx512.mask.pmultishift.qb.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)			%res = call <16 x i8> @llvm.x86.avx512.mask.pmultishift.qb.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)
	%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmultishift.qb.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> zeroinitializer, i16 %x3)			%res1 = call <16 x i8> @llvm.x86.avx512.mask.pmultishift.qb.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> zeroinitializer, i16 %x3)
	%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmultishift.qb.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)			%res2 = call <16 x i8> @llvm.x86.avx512.mask.pmultishift.qb.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)
	%res3 = add <16 x i8> %res, %res1			%res3 = add <16 x i8> %res, %res1
	%res4 = add <16 x i8> %res3, %res2			%res4 = add <16 x i8> %res3, %res2
	ret <16 x i8> %res4			ret <16 x i8> %res4
	}			}

	declare <32 x i8> @llvm.x86.avx512.mask.pmultishift.qb.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)			declare <32 x i8> @llvm.x86.avx512.mask.pmultishift.qb.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

	define <32 x i8>@test_int_x86_avx512_mask_pmultishift_qb_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {			define <32 x i8>@test_int_x86_avx512_mask_pmultishift_qb_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_pmultishift_qb_256:			; CHECK-LABEL: test_int_x86_avx512_mask_pmultishift_qb_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]			; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
	; CHECK-NEXT: vpmultishiftqb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x83,0xd1]			; CHECK-NEXT: vpmultishiftqb %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x83,0xd1]
	; CHECK-NEXT: vpmultishiftqb %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x83,0xd9]			; CHECK-NEXT: vpmultishiftqb %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x83,0xd9]
	; CHECK-NEXT: vpmultishiftqb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x83,0xc1]			; CHECK-NEXT: vpmultishiftqb %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x83,0xc1]
	; CHECK-NEXT: vpaddb %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x65,0x28,0xfc,0xc0]			; CHECK-NEXT: vpaddb %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xfc,0xc0]
	; CHECK-NEXT: vpaddb %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfc,0xc0]			; CHECK-NEXT: vpaddb %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfc,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx512.mask.pmultishift.qb.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)			%res = call <32 x i8> @llvm.x86.avx512.mask.pmultishift.qb.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)
	%res1 = call <32 x i8> @llvm.x86.avx512.mask.pmultishift.qb.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> zeroinitializer, i32 %x3)			%res1 = call <32 x i8> @llvm.x86.avx512.mask.pmultishift.qb.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> zeroinitializer, i32 %x3)
	%res2 = call <32 x i8> @llvm.x86.avx512.mask.pmultishift.qb.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)			%res2 = call <32 x i8> @llvm.x86.avx512.mask.pmultishift.qb.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)
	%res3 = add <32 x i8> %res, %res1			%res3 = add <32 x i8> %res, %res1
	%res4 = add <32 x i8> %res3, %res2			%res4 = add <32 x i8> %res3, %res2
	ret <32 x i8> %res4			ret <32 x i8> %res4
	}			}

	declare <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)			declare <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

	define <16 x i8>@test_int_x86_avx512_mask_vpermi2var_qi_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {			define <16 x i8>@test_int_x86_avx512_mask_vpermi2var_qi_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_qi_128:			; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_qi_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]			; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
	; CHECK-NEXT: vmovdqa64 %xmm1, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xd9]			; CHECK-NEXT: vmovdqa %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xd9]
	; CHECK-NEXT: vpermi2b %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x75,0xda]			; CHECK-NEXT: vpermi2b %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x75,0xda]
	; CHECK-NEXT: vpermi2b %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0x75,0xca]			; CHECK-NEXT: vpermi2b %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0x75,0xca]
	; CHECK-NEXT: vpxord %xmm4, %xmm4, %xmm4 ## encoding: [0x62,0xf1,0x5d,0x08,0xef,0xe4]			; CHECK-NEXT: vpxor %xmm4, %xmm4, %xmm4 ## EVEX TO VEX Compression encoding: [0xc5,0xd9,0xef,0xe4]
	; CHECK-NEXT: vpermi2b %xmm2, %xmm0, %xmm4 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x75,0xe2]			; CHECK-NEXT: vpermi2b %xmm2, %xmm0, %xmm4 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x75,0xe2]
	; CHECK-NEXT: vpaddb %xmm1, %xmm4, %xmm0 ## encoding: [0x62,0xf1,0x5d,0x08,0xfc,0xc1]			; CHECK-NEXT: vpaddb %xmm1, %xmm4, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xd9,0xfc,0xc1]
	; CHECK-NEXT: vpaddb %xmm0, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x65,0x08,0xfc,0xc0]			; CHECK-NEXT: vpaddb %xmm0, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xfc,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)			%res = call <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)
	%res1 = call <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8> %x0, <16 x i8> zeroinitializer, <16 x i8> %x2, i16 %x3)			%res1 = call <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8> %x0, <16 x i8> zeroinitializer, <16 x i8> %x2, i16 %x3)
	%res2 = call <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)			%res2 = call <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)
	%res3 = add <16 x i8> %res, %res1			%res3 = add <16 x i8> %res, %res1
	%res4 = add <16 x i8> %res3, %res2			%res4 = add <16 x i8> %res3, %res2
	ret <16 x i8> %res4			ret <16 x i8> %res4
	}			}

	declare <32 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)			declare <32 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

	define <32 x i8>@test_int_x86_avx512_mask_vpermi2var_qi_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {			define <32 x i8>@test_int_x86_avx512_mask_vpermi2var_qi_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_qi_256:			; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_qi_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]			; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
	; CHECK-NEXT: vmovdqa64 %ymm1, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xd9]			; CHECK-NEXT: vmovdqa %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xd9]
	; CHECK-NEXT: vpermi2b %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x75,0xda]			; CHECK-NEXT: vpermi2b %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x75,0xda]
	; CHECK-NEXT: vpermi2b %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0x75,0xca]			; CHECK-NEXT: vpermi2b %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0x75,0xca]
	; CHECK-NEXT: vpxord %ymm4, %ymm4, %ymm4 ## encoding: [0x62,0xf1,0x5d,0x28,0xef,0xe4]			; CHECK-NEXT: vpxor %ymm4, %ymm4, %ymm4 ## EVEX TO VEX Compression encoding: [0xc5,0xdd,0xef,0xe4]
	; CHECK-NEXT: vpermi2b %ymm2, %ymm0, %ymm4 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x75,0xe2]			; CHECK-NEXT: vpermi2b %ymm2, %ymm0, %ymm4 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x75,0xe2]
	; CHECK-NEXT: vpaddb %ymm1, %ymm4, %ymm0 ## encoding: [0x62,0xf1,0x5d,0x28,0xfc,0xc1]			; CHECK-NEXT: vpaddb %ymm1, %ymm4, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xdd,0xfc,0xc1]
	; CHECK-NEXT: vpaddb %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x65,0x28,0xfc,0xc0]			; CHECK-NEXT: vpaddb %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xfc,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)			%res = call <32 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)
	%res1 = call <32 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.256(<32 x i8> %x0, <32 x i8> zeroinitializer, <32 x i8> %x2, i32 %x3)			%res1 = call <32 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.256(<32 x i8> %x0, <32 x i8> zeroinitializer, <32 x i8> %x2, i32 %x3)
	%res2 = call <32 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)			%res2 = call <32 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)
	%res3 = add <32 x i8> %res, %res1			%res3 = add <32 x i8> %res, %res1
	%res4 = add <32 x i8> %res3, %res2			%res4 = add <32 x i8> %res3, %res2
	ret <32 x i8> %res4			ret <32 x i8> %res4
	}			}

	declare <16 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)			declare <16 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.128(<16 x i8>, <16 x i8>, <16 x i8>, i16)

	define <16 x i8>@test_int_x86_avx512_mask_vpermt2var_qi_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {			define <16 x i8>@test_int_x86_avx512_mask_vpermt2var_qi_128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_vpermt2var_qi_128:			; CHECK-LABEL: test_int_x86_avx512_mask_vpermt2var_qi_128:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]			; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
	; CHECK-NEXT: vmovdqa64 %xmm1, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xd9]			; CHECK-NEXT: vmovdqa %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xd9]
	; CHECK-NEXT: vpermt2b %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x7d,0xda]			; CHECK-NEXT: vpermt2b %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x7d,0xda]
	; CHECK-NEXT: vpermt2b %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0x7d,0xca]			; CHECK-NEXT: vpermt2b %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0x7d,0xca]
	; CHECK-NEXT: vpxord %xmm4, %xmm4, %xmm4 ## encoding: [0x62,0xf1,0x5d,0x08,0xef,0xe4]			; CHECK-NEXT: vpxor %xmm4, %xmm4, %xmm4 ## EVEX TO VEX Compression encoding: [0xc5,0xd9,0xef,0xe4]
	; CHECK-NEXT: vpermt2b %xmm2, %xmm0, %xmm4 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x7d,0xe2]			; CHECK-NEXT: vpermt2b %xmm2, %xmm0, %xmm4 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x7d,0xe2]
	; CHECK-NEXT: vpaddb %xmm1, %xmm4, %xmm0 ## encoding: [0x62,0xf1,0x5d,0x08,0xfc,0xc1]			; CHECK-NEXT: vpaddb %xmm1, %xmm4, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xd9,0xfc,0xc1]
	; CHECK-NEXT: vpaddb %xmm0, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x65,0x08,0xfc,0xc0]			; CHECK-NEXT: vpaddb %xmm0, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xfc,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)			%res = call <16 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 %x3)
	%res1 = call <16 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.128(<16 x i8> %x0, <16 x i8> zeroinitializer, <16 x i8> %x2, i16 %x3)			%res1 = call <16 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.128(<16 x i8> %x0, <16 x i8> zeroinitializer, <16 x i8> %x2, i16 %x3)
	%res2 = call <16 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)			%res2 = call <16 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.128(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, i16 -1)
	%res3 = add <16 x i8> %res, %res1			%res3 = add <16 x i8> %res, %res1
	%res4 = add <16 x i8> %res3, %res2			%res4 = add <16 x i8> %res3, %res2
	ret <16 x i8> %res4			ret <16 x i8> %res4
	}			}

	declare <32 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)			declare <32 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.256(<32 x i8>, <32 x i8>, <32 x i8>, i32)

	define <32 x i8>@test_int_x86_avx512_mask_vpermt2var_qi_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {			define <32 x i8>@test_int_x86_avx512_mask_vpermt2var_qi_256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3) {
	; CHECK-LABEL: test_int_x86_avx512_mask_vpermt2var_qi_256:			; CHECK-LABEL: test_int_x86_avx512_mask_vpermt2var_qi_256:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]			; CHECK-NEXT: kmovd %edi, %k1 ## encoding: [0xc5,0xfb,0x92,0xcf]
	; CHECK-NEXT: vmovdqa64 %ymm1, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xd9]			; CHECK-NEXT: vmovdqa %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xd9]
	; CHECK-NEXT: vpermt2b %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x7d,0xda]			; CHECK-NEXT: vpermt2b %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x7d,0xda]
	; CHECK-NEXT: vpermt2b %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0x7d,0xca]			; CHECK-NEXT: vpermt2b %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0x7d,0xca]
	; CHECK-NEXT: vpxord %ymm4, %ymm4, %ymm4 ## encoding: [0x62,0xf1,0x5d,0x28,0xef,0xe4]			; CHECK-NEXT: vpxor %ymm4, %ymm4, %ymm4 ## EVEX TO VEX Compression encoding: [0xc5,0xdd,0xef,0xe4]
	; CHECK-NEXT: vpermt2b %ymm2, %ymm0, %ymm4 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x7d,0xe2]			; CHECK-NEXT: vpermt2b %ymm2, %ymm0, %ymm4 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x7d,0xe2]
	; CHECK-NEXT: vpaddb %ymm1, %ymm4, %ymm0 ## encoding: [0x62,0xf1,0x5d,0x28,0xfc,0xc1]			; CHECK-NEXT: vpaddb %ymm1, %ymm4, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xdd,0xfc,0xc1]
	; CHECK-NEXT: vpaddb %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x65,0x28,0xfc,0xc0]			; CHECK-NEXT: vpaddb %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xfc,0xc0]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%res = call <32 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)			%res = call <32 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 %x3)
	%res1 = call <32 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.256(<32 x i8> %x0, <32 x i8> zeroinitializer, <32 x i8> %x2, i32 %x3)			%res1 = call <32 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.256(<32 x i8> %x0, <32 x i8> zeroinitializer, <32 x i8> %x2, i32 %x3)
	%res2 = call <32 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)			%res2 = call <32 x i8> @llvm.x86.avx512.mask.vpermt2var.qi.256(<32 x i8> %x0, <32 x i8> %x1, <32 x i8> %x2, i32 -1)
	%res3 = add <32 x i8> %res, %res1			%res3 = add <32 x i8> %res, %res1
	%res4 = add <32 x i8> %res3, %res2			%res4 = add <32 x i8> %res3, %res2
	ret <32 x i8> %res4			ret <32 x i8> %res4
	}			}

llvm/trunk/test/CodeGen/X86/avx512vl-intrinsics-upgrade.ll

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512vl --show-mc-encoding| FileCheck %s		; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512vl --show-mc-encoding| FileCheck %s

declare <8 x i32> @llvm.x86.avx512.pbroadcastd.256(<4 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.pbroadcastd.256(<4 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_pbroadcastd_256(<4 x i32> %x0, <8 x i32> %x1, i8 %mask, i32 * %y_ptr) {		define <8 x i32>@test_int_x86_avx512_pbroadcastd_256(<4 x i32> %x0, <8 x i32> %x1, i8 %mask, i32 * %y_ptr) {
; CHECK-LABEL: test_int_x86_avx512_pbroadcastd_256:		; CHECK-LABEL: test_int_x86_avx512_pbroadcastd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpbroadcastd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x58,0xc8]		; CHECK-NEXT: vpbroadcastd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x58,0xc8]
; CHECK-NEXT: vpbroadcastd %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x58,0xc0]		; CHECK-NEXT: vpbroadcastd %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x58,0xc0]
; CHECK-NEXT: vpaddd (%rsi){1to8}, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x38,0xfe,0x0e]		; CHECK-NEXT: vpaddd (%rsi){1to8}, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x38,0xfe,0x0e]
; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc1]		; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%y_32 = load i32, i32 * %y_ptr		%y_32 = load i32, i32 * %y_ptr
%y = insertelement <4 x i32> undef, i32 %y_32, i32 0		%y = insertelement <4 x i32> undef, i32 %y_32, i32 0
%res = call <8 x i32> @llvm.x86.avx512.pbroadcastd.256(<4 x i32> %y, <8 x i32> %x1, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.pbroadcastd.256(<4 x i32> %y, <8 x i32> %x1, i8 -1)
%res1 = call <8 x i32> @llvm.x86.avx512.pbroadcastd.256(<4 x i32> %x0, <8 x i32> %x1, i8 %mask)		%res1 = call <8 x i32> @llvm.x86.avx512.pbroadcastd.256(<4 x i32> %x0, <8 x i32> %x1, i8 %mask)
%res2 = call <8 x i32> @llvm.x86.avx512.pbroadcastd.256(<4 x i32> %x0, <8 x i32> zeroinitializer, i8 %mask)		%res2 = call <8 x i32> @llvm.x86.avx512.pbroadcastd.256(<4 x i32> %x0, <8 x i32> zeroinitializer, i8 %mask)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res2, %res3		%res4 = add <8 x i32> %res2, %res3
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}
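
Reading aid, not part of the patch: in the test above, the masked `vpbroadcastd` forms and the `vpaddd (%rsi){1to8}` form keep their EVEX encoding, while the plain `vpaddd` is compressed. The fourth byte of each EVEX encoding shows why. The Python sketch below assumes the standard EVEX layout for that byte (z, L'L, b, inverted V', aaa). The helper names are hypothetical, and the check is only a necessary condition: it ignores the other prefix bytes, and it does not consult the opcode table the pass uses to decide whether a VEX equivalent exists.

```python
def decode_evex_p2(p2: int) -> dict:
    """Decode the last EVEX prefix byte (the fourth byte of the encoding).

    Assumed bit layout (most significant bit first): z, L'L (2 bits), b, ~V', aaa (3 bits).
    """
    return {
        "z": (p2 >> 7) & 1,           # zeroing-masking
        "LL": (p2 >> 5) & 0b11,       # 0 = 128-bit, 1 = 256-bit, 2 = 512-bit
        "b": (p2 >> 4) & 1,           # embedded broadcast / rounding control
        "V_hi": ((p2 >> 3) & 1) ^ 1,  # high bit of vvvv (stored inverted)
        "aaa": p2 & 0b111,            # mask register k0..k7 (0 = no masking)
    }


def prefix_allows_vex(p2: int) -> bool:
    """Necessary (not sufficient) condition for a VEX re-encoding."""
    f = decode_evex_p2(p2)
    return (
        f["z"] == 0
        and f["b"] == 0
        and f["aaa"] == 0
        and f["LL"] < 2      # no zmm operands
        and f["V_hi"] == 0   # vvvv register index below 16
    )


if __name__ == "__main__":
    # Fourth bytes taken from the CHECK lines of the test above.
    for byte in (0x29, 0xA9, 0x38, 0x28):
        print(hex(byte), decode_evex_p2(byte), prefix_allows_vex(byte))
```

With these assumptions, 0x29 and 0xa9 are rejected because of the `{%k1}` mask, 0x38 because the broadcast bit is set, and only 0x28 (the plain `vpaddd`) passes.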

declare <4 x i32> @llvm.x86.avx512.pbroadcastd.128(<4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.pbroadcastd.128(<4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_pbroadcastd_128(<4 x i32> %x0, <4 x i32> %x1, i8 %mask) {		define <4 x i32>@test_int_x86_avx512_pbroadcastd_128(<4 x i32> %x0, <4 x i32> %x1, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_pbroadcastd_128:		; CHECK-LABEL: test_int_x86_avx512_pbroadcastd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpbroadcastd %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x58,0xd0]		; CHECK-NEXT: vpbroadcastd %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x58,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpbroadcastd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x58,0xc8]		; CHECK-NEXT: vpbroadcastd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x58,0xc8]
; CHECK-NEXT: vpbroadcastd %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x58,0xc0]		; CHECK-NEXT: vpbroadcastd %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x58,0xc0]
; CHECK-NEXT: vpaddd %xmm1, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc9]		; CHECK-NEXT: vpaddd %xmm1, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc9]
; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc1]		; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.pbroadcastd.128(<4 x i32> %x0, <4 x i32> %x1, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.pbroadcastd.128(<4 x i32> %x0, <4 x i32> %x1, i8 -1)
%res1 = call <4 x i32> @llvm.x86.avx512.pbroadcastd.128(<4 x i32> %x0, <4 x i32> %x1, i8 %mask)		%res1 = call <4 x i32> @llvm.x86.avx512.pbroadcastd.128(<4 x i32> %x0, <4 x i32> %x1, i8 %mask)
%res2 = call <4 x i32> @llvm.x86.avx512.pbroadcastd.128(<4 x i32> %x0, <4 x i32> zeroinitializer, i8 %mask)		%res2 = call <4 x i32> @llvm.x86.avx512.pbroadcastd.128(<4 x i32> %x0, <4 x i32> zeroinitializer, i8 %mask)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res2, %res3		%res4 = add <4 x i32> %res2, %res3
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}
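
Reading aid, not part of the patch: the saving per instruction can be read off the `encoding: [...]` lists in the two columns. The Python sketch below counts the bytes in each list. The helper names are hypothetical, and the sample strings are copied from the CHECK lines of the test above.

```python
import re

# Matches the byte list that llc prints with --show-mc-encoding.
_ENCODING = re.compile(r"encoding: \[([^\]]*)\]")


def encoding_length(check_line: str) -> int:
    """Return the number of bytes listed in an 'encoding: [...]' comment."""
    match = _ENCODING.search(check_line)
    if match is None:
        raise ValueError("no encoding comment found")
    return len([b for b in match.group(1).split(",") if b.strip()])


def bytes_saved(before: str, after: str) -> int:
    """Size difference between the left (EVEX) and right (VEX) column."""
    return encoding_length(before) - encoding_length(after)


if __name__ == "__main__":
    # vpbroadcastd needs the three-byte VEX prefix (0xc4), so it saves one byte.
    print(bytes_saved(
        "vpbroadcastd %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x58,0xd0]",
        "vpbroadcastd %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x58,0xd0]",
    ))
    # vpaddd fits the two-byte VEX prefix (0xc5), so it saves two bytes.
    print(bytes_saved(
        "vpaddd %xmm1, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc9]",
        "vpaddd %xmm1, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc9]",
    ))
```

For these two pairs the script prints 1 and 2, which matches the prefix sizes described in the summary: a four-byte EVEX prefix versus a three-byte or two-byte VEX prefix.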

declare <4 x i64> @llvm.x86.avx512.pbroadcastq.256(<2 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.pbroadcastq.256(<2 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_pbroadcastq_256(<2 x i64> %x0, <4 x i64> %x1, i8 %mask) {		define <4 x i64>@test_int_x86_avx512_pbroadcastq_256(<2 x i64> %x0, <4 x i64> %x1, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_pbroadcastq_256:		; CHECK-LABEL: test_int_x86_avx512_pbroadcastq_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpbroadcastq %xmm0, %ymm2 ## encoding: [0x62,0xf2,0xfd,0x28,0x59,0xd0]		; CHECK-NEXT: vpbroadcastq %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x59,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpbroadcastq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x59,0xc8]		; CHECK-NEXT: vpbroadcastq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x59,0xc8]
; CHECK-NEXT: vpbroadcastq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x59,0xc0]		; CHECK-NEXT: vpbroadcastq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x59,0xc0]
; CHECK-NEXT: vpaddq %ymm1, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc9]		; CHECK-NEXT: vpaddq %ymm1, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc9]
; CHECK-NEXT: vpaddq %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc1]		; CHECK-NEXT: vpaddq %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.pbroadcastq.256(<2 x i64> %x0, <4 x i64> %x1,i8 -1)		%res = call <4 x i64> @llvm.x86.avx512.pbroadcastq.256(<2 x i64> %x0, <4 x i64> %x1,i8 -1)
%res1 = call <4 x i64> @llvm.x86.avx512.pbroadcastq.256(<2 x i64> %x0, <4 x i64> %x1,i8 %mask)		%res1 = call <4 x i64> @llvm.x86.avx512.pbroadcastq.256(<2 x i64> %x0, <4 x i64> %x1,i8 %mask)
%res2 = call <4 x i64> @llvm.x86.avx512.pbroadcastq.256(<2 x i64> %x0, <4 x i64> zeroinitializer,i8 %mask)		%res2 = call <4 x i64> @llvm.x86.avx512.pbroadcastq.256(<2 x i64> %x0, <4 x i64> zeroinitializer,i8 %mask)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res2, %res3		%res4 = add <4 x i64> %res2, %res3
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.pbroadcastq.128(<2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.pbroadcastq.128(<2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_pbroadcastq_128(<2 x i64> %x0, <2 x i64> %x1, i8 %mask) {		define <2 x i64>@test_int_x86_avx512_pbroadcastq_128(<2 x i64> %x0, <2 x i64> %x1, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_pbroadcastq_128:		; CHECK-LABEL: test_int_x86_avx512_pbroadcastq_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpbroadcastq %xmm0, %xmm2 ## encoding: [0x62,0xf2,0xfd,0x08,0x59,0xd0]		; CHECK-NEXT: vpbroadcastq %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x59,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpbroadcastq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x59,0xc8]		; CHECK-NEXT: vpbroadcastq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x59,0xc8]
; CHECK-NEXT: vpbroadcastq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x59,0xc0]		; CHECK-NEXT: vpbroadcastq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x59,0xc0]
; CHECK-NEXT: vpaddq %xmm1, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc9]		; CHECK-NEXT: vpaddq %xmm1, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc9]
; CHECK-NEXT: vpaddq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc1]		; CHECK-NEXT: vpaddq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.pbroadcastq.128(<2 x i64> %x0, <2 x i64> %x1,i8 -1)		%res = call <2 x i64> @llvm.x86.avx512.pbroadcastq.128(<2 x i64> %x0, <2 x i64> %x1,i8 -1)
%res1 = call <2 x i64> @llvm.x86.avx512.pbroadcastq.128(<2 x i64> %x0, <2 x i64> %x1,i8 %mask)		%res1 = call <2 x i64> @llvm.x86.avx512.pbroadcastq.128(<2 x i64> %x0, <2 x i64> %x1,i8 %mask)
%res2 = call <2 x i64> @llvm.x86.avx512.pbroadcastq.128(<2 x i64> %x0, <2 x i64> zeroinitializer,i8 %mask)		%res2 = call <2 x i64> @llvm.x86.avx512.pbroadcastq.128(<2 x i64> %x0, <2 x i64> zeroinitializer,i8 %mask)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res2, %res3		%res4 = add <2 x i64> %res2, %res3
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x double> @llvm.x86.avx512.mask.broadcast.sd.pd.256(<2 x double>, <4 x double>, i8) nounwind readonly		declare <4 x double> @llvm.x86.avx512.mask.broadcast.sd.pd.256(<2 x double>, <4 x double>, i8) nounwind readonly

define <4 x double> @test_x86_vbroadcast_sd_pd_256(<2 x double> %a0, <4 x double> %a1, i8 %mask ) {		define <4 x double> @test_x86_vbroadcast_sd_pd_256(<2 x double> %a0, <4 x double> %a1, i8 %mask ) {
; CHECK-LABEL: test_x86_vbroadcast_sd_pd_256:		; CHECK-LABEL: test_x86_vbroadcast_sd_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vbroadcastsd %xmm0, %ymm2 ## encoding: [0x62,0xf2,0xfd,0x28,0x19,0xd0]		; CHECK-NEXT: vbroadcastsd %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x19,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vbroadcastsd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x19,0xc8]		; CHECK-NEXT: vbroadcastsd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x19,0xc8]
; CHECK-NEXT: vbroadcastsd %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x19,0xc0]		; CHECK-NEXT: vbroadcastsd %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x19,0xc0]
; CHECK-NEXT: vaddpd %ymm1, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0xed,0x28,0x58,0xc9]		; CHECK-NEXT: vaddpd %ymm1, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x58,0xc9]
; CHECK-NEXT: vaddpd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.broadcast.sd.pd.256(<2 x double> %a0, <4 x double> zeroinitializer, i8 -1)		%res = call <4 x double> @llvm.x86.avx512.mask.broadcast.sd.pd.256(<2 x double> %a0, <4 x double> zeroinitializer, i8 -1)
%res1 = call <4 x double> @llvm.x86.avx512.mask.broadcast.sd.pd.256(<2 x double> %a0, <4 x double> %a1, i8 %mask)		%res1 = call <4 x double> @llvm.x86.avx512.mask.broadcast.sd.pd.256(<2 x double> %a0, <4 x double> %a1, i8 %mask)
%res2 = call <4 x double> @llvm.x86.avx512.mask.broadcast.sd.pd.256(<2 x double> %a0, <4 x double> zeroinitializer, i8 %mask)		%res2 = call <4 x double> @llvm.x86.avx512.mask.broadcast.sd.pd.256(<2 x double> %a0, <4 x double> zeroinitializer, i8 %mask)
%res3 = fadd <4 x double> %res, %res1		%res3 = fadd <4 x double> %res, %res1
%res4 = fadd <4 x double> %res2, %res3		%res4 = fadd <4 x double> %res2, %res3
ret <4 x double> %res4		ret <4 x double> %res4
}		}

declare <8 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.256(<4 x float>, <8 x float>, i8) nounwind readonly		declare <8 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.256(<4 x float>, <8 x float>, i8) nounwind readonly

define <8 x float> @test_x86_vbroadcast_ss_ps_256(<4 x float> %a0, <8 x float> %a1, i8 %mask ) {		define <8 x float> @test_x86_vbroadcast_ss_ps_256(<4 x float> %a0, <8 x float> %a1, i8 %mask ) {
; CHECK-LABEL: test_x86_vbroadcast_ss_ps_256:		; CHECK-LABEL: test_x86_vbroadcast_ss_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vbroadcastss %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x18,0xd0]		; CHECK-NEXT: vbroadcastss %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x18,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vbroadcastss %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x18,0xc8]		; CHECK-NEXT: vbroadcastss %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x18,0xc8]
; CHECK-NEXT: vbroadcastss %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x18,0xc0]		; CHECK-NEXT: vbroadcastss %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x18,0xc0]
; CHECK-NEXT: vaddps %ymm1, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0x6c,0x28,0x58,0xc9]		; CHECK-NEXT: vaddps %ymm1, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xec,0x58,0xc9]
; CHECK-NEXT: vaddps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.256(<4 x float> %a0, <8 x float> zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.256(<4 x float> %a0, <8 x float> zeroinitializer, i8 -1)
%res1 = call <8 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.256(<4 x float> %a0, <8 x float> %a1, i8 %mask)		%res1 = call <8 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.256(<4 x float> %a0, <8 x float> %a1, i8 %mask)
%res2 = call <8 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.256(<4 x float> %a0, <8 x float> zeroinitializer, i8 %mask)		%res2 = call <8 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.256(<4 x float> %a0, <8 x float> zeroinitializer, i8 %mask)
%res3 = fadd <8 x float> %res, %res1		%res3 = fadd <8 x float> %res, %res1
%res4 = fadd <8 x float> %res2, %res3		%res4 = fadd <8 x float> %res2, %res3
ret <8 x float> %res4		ret <8 x float> %res4
}		}

declare <4 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.128(<4 x float>, <4 x float>, i8) nounwind readonly		declare <4 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.128(<4 x float>, <4 x float>, i8) nounwind readonly

define <4 x float> @test_x86_vbroadcast_ss_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask ) {		define <4 x float> @test_x86_vbroadcast_ss_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask ) {
; CHECK-LABEL: test_x86_vbroadcast_ss_ps_128:		; CHECK-LABEL: test_x86_vbroadcast_ss_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vbroadcastss %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x18,0xd0]		; CHECK-NEXT: vbroadcastss %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x18,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vbroadcastss %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x18,0xc8]		; CHECK-NEXT: vbroadcastss %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x18,0xc8]
; CHECK-NEXT: vbroadcastss %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x18,0xc0]		; CHECK-NEXT: vbroadcastss %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x18,0xc0]
; CHECK-NEXT: vaddps %xmm1, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0x6c,0x08,0x58,0xc9]		; CHECK-NEXT: vaddps %xmm1, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe8,0x58,0xc9]
; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.128(<4 x float> %a0, <4 x float> zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.128(<4 x float> %a0, <4 x float> zeroinitializer, i8 -1)
%res1 = call <4 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.128(<4 x float> %a0, <4 x float> %a1, i8 %mask)		%res1 = call <4 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.128(<4 x float> %a0, <4 x float> %a1, i8 %mask)
%res2 = call <4 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.128(<4 x float> %a0, <4 x float> zeroinitializer, i8 %mask)		%res2 = call <4 x float> @llvm.x86.avx512.mask.broadcast.ss.ps.128(<4 x float> %a0, <4 x float> zeroinitializer, i8 %mask)
%res3 = fadd <4 x float> %res, %res1		%res3 = fadd <4 x float> %res, %res1
%res4 = fadd <4 x float> %res2, %res3		%res4 = fadd <4 x float> %res2, %res3
ret <4 x float> %res4		ret <4 x float> %res4
}		}

declare <4 x float> @llvm.x86.avx512.mask.movsldup.128(<4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.movsldup.128(<4 x float>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_movsldup_128(<4 x float> %x0, <4 x float> %x1, i8 %x2) {		define <4 x float>@test_int_x86_avx512_mask_movsldup_128(<4 x float> %x0, <4 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_movsldup_128:		; CHECK-LABEL: test_int_x86_avx512_mask_movsldup_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovsldup %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x7e,0x08,0x12,0xd0]		; CHECK-NEXT: vmovsldup %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x12,0xd0]
; CHECK-NEXT: ## xmm2 = xmm0[0,0,2,2]		; CHECK-NEXT: ## xmm2 = xmm0[0,0,2,2]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovsldup %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x12,0xc8]		; CHECK-NEXT: vmovsldup %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x12,0xc8]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0,0,2,2]		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0,0,2,2]
; CHECK-NEXT: vmovsldup %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0x89,0x12,0xc0]		; CHECK-NEXT: vmovsldup %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0x89,0x12,0xc0]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0,0,2,2]		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0,0,2,2]
; CHECK-NEXT: vaddps %xmm2, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xca]		; CHECK-NEXT: vaddps %xmm2, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.movsldup.128(<4 x float> %x0, <4 x float> %x1, i8 %x2)		%res = call <4 x float> @llvm.x86.avx512.mask.movsldup.128(<4 x float> %x0, <4 x float> %x1, i8 %x2)
%res1 = call <4 x float> @llvm.x86.avx512.mask.movsldup.128(<4 x float> %x0, <4 x float> %x1, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.movsldup.128(<4 x float> %x0, <4 x float> %x1, i8 -1)
%res2 = call <4 x float> @llvm.x86.avx512.mask.movsldup.128(<4 x float> %x0, <4 x float> zeroinitializer, i8 %x2)		%res2 = call <4 x float> @llvm.x86.avx512.mask.movsldup.128(<4 x float> %x0, <4 x float> zeroinitializer, i8 %x2)
%res3 = fadd <4 x float> %res, %res1		%res3 = fadd <4 x float> %res, %res1
%res4 = fadd <4 x float> %res2, %res3		%res4 = fadd <4 x float> %res2, %res3
ret <4 x float> %res4		ret <4 x float> %res4
}		}

declare <8 x float> @llvm.x86.avx512.mask.movsldup.256(<8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.movsldup.256(<8 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_movsldup_256(<8 x float> %x0, <8 x float> %x1, i8 %x2) {		define <8 x float>@test_int_x86_avx512_mask_movsldup_256(<8 x float> %x0, <8 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_movsldup_256:		; CHECK-LABEL: test_int_x86_avx512_mask_movsldup_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovsldup %ymm0, %ymm2 ## encoding: [0x62,0xf1,0x7e,0x28,0x12,0xd0]		; CHECK-NEXT: vmovsldup %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x12,0xd0]
; CHECK-NEXT: ## ymm2 = ymm0[0,0,2,2,4,4,6,6]		; CHECK-NEXT: ## ymm2 = ymm0[0,0,2,2,4,4,6,6]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovsldup %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x12,0xc8]		; CHECK-NEXT: vmovsldup %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x12,0xc8]
; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,0,2,2,4,4,6,6]		; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,0,2,2,4,4,6,6]
; CHECK-NEXT: vmovsldup %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0xa9,0x12,0xc0]		; CHECK-NEXT: vmovsldup %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0xa9,0x12,0xc0]
; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[0,0,2,2,4,4,6,6]		; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[0,0,2,2,4,4,6,6]
; CHECK-NEXT: vaddps %ymm2, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xca]		; CHECK-NEXT: vaddps %ymm2, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.movsldup.256(<8 x float> %x0, <8 x float> %x1, i8 %x2)		%res = call <8 x float> @llvm.x86.avx512.mask.movsldup.256(<8 x float> %x0, <8 x float> %x1, i8 %x2)
%res1 = call <8 x float> @llvm.x86.avx512.mask.movsldup.256(<8 x float> %x0, <8 x float> %x1, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.movsldup.256(<8 x float> %x0, <8 x float> %x1, i8 -1)
%res2 = call <8 x float> @llvm.x86.avx512.mask.movsldup.256(<8 x float> %x0, <8 x float> zeroinitializer, i8 %x2)		%res2 = call <8 x float> @llvm.x86.avx512.mask.movsldup.256(<8 x float> %x0, <8 x float> zeroinitializer, i8 %x2)
%res3 = fadd <8 x float> %res, %res1		%res3 = fadd <8 x float> %res, %res1
%res4 = fadd <8 x float> %res2, %res3		%res4 = fadd <8 x float> %res2, %res3
ret <8 x float> %res4		ret <8 x float> %res4
}		}

declare <4 x float> @llvm.x86.avx512.mask.movshdup.128(<4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.movshdup.128(<4 x float>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_movshdup_128(<4 x float> %x0, <4 x float> %x1, i8 %x2) {		define <4 x float>@test_int_x86_avx512_mask_movshdup_128(<4 x float> %x0, <4 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_movshdup_128:		; CHECK-LABEL: test_int_x86_avx512_mask_movshdup_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovshdup %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x7e,0x08,0x16,0xd0]		; CHECK-NEXT: vmovshdup %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x16,0xd0]
; CHECK-NEXT: ## xmm2 = xmm0[1,1,3,3]		; CHECK-NEXT: ## xmm2 = xmm0[1,1,3,3]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovshdup %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x16,0xc8]		; CHECK-NEXT: vmovshdup %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x16,0xc8]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[1,1,3,3]		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[1,1,3,3]
; CHECK-NEXT: vmovshdup %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0x89,0x16,0xc0]		; CHECK-NEXT: vmovshdup %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0x89,0x16,0xc0]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[1,1,3,3]		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[1,1,3,3]
; CHECK-NEXT: vaddps %xmm2, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xca]		; CHECK-NEXT: vaddps %xmm2, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.movshdup.128(<4 x float> %x0, <4 x float> %x1, i8 %x2)		%res = call <4 x float> @llvm.x86.avx512.mask.movshdup.128(<4 x float> %x0, <4 x float> %x1, i8 %x2)
%res1 = call <4 x float> @llvm.x86.avx512.mask.movshdup.128(<4 x float> %x0, <4 x float> %x1, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.movshdup.128(<4 x float> %x0, <4 x float> %x1, i8 -1)
%res2 = call <4 x float> @llvm.x86.avx512.mask.movshdup.128(<4 x float> %x0, <4 x float> zeroinitializer, i8 %x2)		%res2 = call <4 x float> @llvm.x86.avx512.mask.movshdup.128(<4 x float> %x0, <4 x float> zeroinitializer, i8 %x2)
%res3 = fadd <4 x float> %res, %res1		%res3 = fadd <4 x float> %res, %res1
%res4 = fadd <4 x float> %res2, %res3		%res4 = fadd <4 x float> %res2, %res3
ret <4 x float> %res4		ret <4 x float> %res4
}		}

declare <8 x float> @llvm.x86.avx512.mask.movshdup.256(<8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.movshdup.256(<8 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_movshdup_256(<8 x float> %x0, <8 x float> %x1, i8 %x2) {		define <8 x float>@test_int_x86_avx512_mask_movshdup_256(<8 x float> %x0, <8 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_movshdup_256:		; CHECK-LABEL: test_int_x86_avx512_mask_movshdup_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovshdup %ymm0, %ymm2 ## encoding: [0x62,0xf1,0x7e,0x28,0x16,0xd0]		; CHECK-NEXT: vmovshdup %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x16,0xd0]
; CHECK-NEXT: ## ymm2 = ymm0[1,1,3,3,5,5,7,7]		; CHECK-NEXT: ## ymm2 = ymm0[1,1,3,3,5,5,7,7]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovshdup %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x16,0xc8]		; CHECK-NEXT: vmovshdup %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x16,0xc8]
; CHECK-NEXT: ## ymm1 {%k1} = ymm0[1,1,3,3,5,5,7,7]		; CHECK-NEXT: ## ymm1 {%k1} = ymm0[1,1,3,3,5,5,7,7]
; CHECK-NEXT: vmovshdup %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0xa9,0x16,0xc0]		; CHECK-NEXT: vmovshdup %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0xa9,0x16,0xc0]
; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[1,1,3,3,5,5,7,7]		; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[1,1,3,3,5,5,7,7]
; CHECK-NEXT: vaddps %ymm2, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xca]		; CHECK-NEXT: vaddps %ymm2, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.movshdup.256(<8 x float> %x0, <8 x float> %x1, i8 %x2)		%res = call <8 x float> @llvm.x86.avx512.mask.movshdup.256(<8 x float> %x0, <8 x float> %x1, i8 %x2)
%res1 = call <8 x float> @llvm.x86.avx512.mask.movshdup.256(<8 x float> %x0, <8 x float> %x1, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.movshdup.256(<8 x float> %x0, <8 x float> %x1, i8 -1)
%res2 = call <8 x float> @llvm.x86.avx512.mask.movshdup.256(<8 x float> %x0, <8 x float> zeroinitializer, i8 %x2)		%res2 = call <8 x float> @llvm.x86.avx512.mask.movshdup.256(<8 x float> %x0, <8 x float> zeroinitializer, i8 %x2)
%res3 = fadd <8 x float> %res, %res1		%res3 = fadd <8 x float> %res, %res1
%res4 = fadd <8 x float> %res2, %res3		%res4 = fadd <8 x float> %res2, %res3
ret <8 x float> %res4		ret <8 x float> %res4
}		}
declare <2 x double> @llvm.x86.avx512.mask.movddup.128(<2 x double>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.movddup.128(<2 x double>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_movddup_128(<2 x double> %x0, <2 x double> %x1, i8 %x2) {		define <2 x double>@test_int_x86_avx512_mask_movddup_128(<2 x double> %x0, <2 x double> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_movddup_128:		; CHECK-LABEL: test_int_x86_avx512_mask_movddup_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovddup %xmm0, %xmm2 ## encoding: [0x62,0xf1,0xff,0x08,0x12,0xd0]		; CHECK-NEXT: vmovddup %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x12,0xd0]
; CHECK-NEXT: ## xmm2 = xmm0[0,0]		; CHECK-NEXT: ## xmm2 = xmm0[0,0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovddup %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x12,0xc8]		; CHECK-NEXT: vmovddup %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x12,0xc8]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0,0]		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0,0]
; CHECK-NEXT: vmovddup %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xff,0x89,0x12,0xc0]		; CHECK-NEXT: vmovddup %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xff,0x89,0x12,0xc0]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0,0]		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0,0]
; CHECK-NEXT: vaddpd %xmm2, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xca]		; CHECK-NEXT: vaddpd %xmm2, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.movddup.128(<2 x double> %x0, <2 x double> %x1, i8 %x2)		%res = call <2 x double> @llvm.x86.avx512.mask.movddup.128(<2 x double> %x0, <2 x double> %x1, i8 %x2)
%res1 = call <2 x double> @llvm.x86.avx512.mask.movddup.128(<2 x double> %x0, <2 x double> %x1, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.movddup.128(<2 x double> %x0, <2 x double> %x1, i8 -1)
%res2 = call <2 x double> @llvm.x86.avx512.mask.movddup.128(<2 x double> %x0, <2 x double> zeroinitializer, i8 %x2)		%res2 = call <2 x double> @llvm.x86.avx512.mask.movddup.128(<2 x double> %x0, <2 x double> zeroinitializer, i8 %x2)
%res3 = fadd <2 x double> %res, %res1		%res3 = fadd <2 x double> %res, %res1
%res4 = fadd <2 x double> %res2, %res3		%res4 = fadd <2 x double> %res2, %res3
ret <2 x double> %res4		ret <2 x double> %res4
}		}

declare <4 x double> @llvm.x86.avx512.mask.movddup.256(<4 x double>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.movddup.256(<4 x double>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_movddup_256(<4 x double> %x0, <4 x double> %x1, i8 %x2) {		define <4 x double>@test_int_x86_avx512_mask_movddup_256(<4 x double> %x0, <4 x double> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_movddup_256:		; CHECK-LABEL: test_int_x86_avx512_mask_movddup_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovddup %ymm0, %ymm2 ## encoding: [0x62,0xf1,0xff,0x28,0x12,0xd0]		; CHECK-NEXT: vmovddup %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xff,0x12,0xd0]
; CHECK-NEXT: ## ymm2 = ymm0[0,0,2,2]		; CHECK-NEXT: ## ymm2 = ymm0[0,0,2,2]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovddup %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x29,0x12,0xc8]		; CHECK-NEXT: vmovddup %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x29,0x12,0xc8]
; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,0,2,2]		; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,0,2,2]
; CHECK-NEXT: vmovddup %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xff,0xa9,0x12,0xc0]		; CHECK-NEXT: vmovddup %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xff,0xa9,0x12,0xc0]
; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[0,0,2,2]		; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[0,0,2,2]
; CHECK-NEXT: vaddpd %ymm2, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xca]		; CHECK-NEXT: vaddpd %ymm2, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.movddup.256(<4 x double> %x0, <4 x double> %x1, i8 %x2)		%res = call <4 x double> @llvm.x86.avx512.mask.movddup.256(<4 x double> %x0, <4 x double> %x1, i8 %x2)
%res1 = call <4 x double> @llvm.x86.avx512.mask.movddup.256(<4 x double> %x0, <4 x double> %x1, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.movddup.256(<4 x double> %x0, <4 x double> %x1, i8 -1)
%res2 = call <4 x double> @llvm.x86.avx512.mask.movddup.256(<4 x double> %x0, <4 x double> zeroinitializer, i8 %x2)		%res2 = call <4 x double> @llvm.x86.avx512.mask.movddup.256(<4 x double> %x0, <4 x double> zeroinitializer, i8 %x2)
%res3 = fadd <4 x double> %res, %res1		%res3 = fadd <4 x double> %res, %res1
%res4 = fadd <4 x double> %res2, %res3		%res4 = fadd <4 x double> %res2, %res3
ret <4 x double> %res4		ret <4 x double> %res4
}		}

declare <4 x double> @llvm.x86.avx512.mask.vpermil.pd.256(<4 x double>, i32, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.vpermil.pd.256(<4 x double>, i32, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_vpermil_pd_256(<4 x double> %x0, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_vpermil_pd_256(<4 x double> %x0, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermil_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermil_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpermilpd $6, %ymm0, %ymm2 ## encoding: [0x62,0xf3,0xfd,0x28,0x05,0xd0,0x06]		; CHECK-NEXT: vpermilpd $6, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x7d,0x05,0xd0,0x06]
; CHECK-NEXT: ## ymm2 = ymm0[0,1,3,2]		; CHECK-NEXT: ## ymm2 = ymm0[0,1,3,2]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermilpd $6, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x05,0xc8,0x06]		; CHECK-NEXT: vpermilpd $6, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x05,0xc8,0x06]
; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,1,3,2]		; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,1,3,2]
; CHECK-NEXT: vpermilpd $6, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x05,0xc0,0x06]		; CHECK-NEXT: vpermilpd $6, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x05,0xc0,0x06]
; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[0,1,3,2]		; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[0,1,3,2]
; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.vpermil.pd.256(<4 x double> %x0, i32 22, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.vpermil.pd.256(<4 x double> %x0, i32 22, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.vpermil.pd.256(<4 x double> %x0, i32 22, <4 x double> zeroinitializer, i8 %x3)		%res1 = call <4 x double> @llvm.x86.avx512.mask.vpermil.pd.256(<4 x double> %x0, i32 22, <4 x double> zeroinitializer, i8 %x3)
%res2 = call <4 x double> @llvm.x86.avx512.mask.vpermil.pd.256(<4 x double> %x0, i32 22, <4 x double> %x2, i8 -1)		%res2 = call <4 x double> @llvm.x86.avx512.mask.vpermil.pd.256(<4 x double> %x0, i32 22, <4 x double> %x2, i8 -1)
%res3 = fadd <4 x double> %res, %res1		%res3 = fadd <4 x double> %res, %res1
%res4 = fadd <4 x double> %res2, %res3		%res4 = fadd <4 x double> %res2, %res3
ret <4 x double> %res4		ret <4 x double> %res4
}		}

declare <2 x double> @llvm.x86.avx512.mask.vpermil.pd.128(<2 x double>, i32, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.vpermil.pd.128(<2 x double>, i32, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_vpermil_pd_128(<2 x double> %x0, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask_vpermil_pd_128(<2 x double> %x0, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermil_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermil_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpermilpd $1, %xmm0, %xmm2 ## encoding: [0x62,0xf3,0xfd,0x08,0x05,0xd0,0x01]		; CHECK-NEXT: vpermilpd $1, %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x05,0xd0,0x01]
; CHECK-NEXT: ## xmm2 = xmm0[1,0]		; CHECK-NEXT: ## xmm2 = xmm0[1,0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermilpd $1, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x09,0x05,0xc8,0x01]		; CHECK-NEXT: vpermilpd $1, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x09,0x05,0xc8,0x01]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[1,0]		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[1,0]
; CHECK-NEXT: vpermilpd $1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0x89,0x05,0xc0,0x01]		; CHECK-NEXT: vpermilpd $1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0x89,0x05,0xc0,0x01]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[1,0]		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[1,0]
; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]		; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc0]
; CHECK-NEXT: vaddpd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x58,0xc2]		; CHECK-NEXT: vaddpd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x58,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.vpermil.pd.128(<2 x double> %x0, i32 1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask.vpermil.pd.128(<2 x double> %x0, i32 1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask.vpermil.pd.128(<2 x double> %x0, i32 1, <2 x double> zeroinitializer, i8 %x3)		%res1 = call <2 x double> @llvm.x86.avx512.mask.vpermil.pd.128(<2 x double> %x0, i32 1, <2 x double> zeroinitializer, i8 %x3)
%res2 = call <2 x double> @llvm.x86.avx512.mask.vpermil.pd.128(<2 x double> %x0, i32 1, <2 x double> %x2, i8 -1)		%res2 = call <2 x double> @llvm.x86.avx512.mask.vpermil.pd.128(<2 x double> %x0, i32 1, <2 x double> %x2, i8 -1)
%res3 = fadd <2 x double> %res, %res1		%res3 = fadd <2 x double> %res, %res1
%res4 = fadd <2 x double> %res3, %res2		%res4 = fadd <2 x double> %res3, %res2
ret <2 x double> %res4		ret <2 x double> %res4
}		}

declare <8 x float> @llvm.x86.avx512.mask.vpermil.ps.256(<8 x float>, i32, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.vpermil.ps.256(<8 x float>, i32, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_vpermil_ps_256(<8 x float> %x0, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_vpermil_ps_256(<8 x float> %x0, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermil_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermil_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpermilps $22, %ymm0, %ymm2 ## encoding: [0x62,0xf3,0x7d,0x28,0x04,0xd0,0x16]		; CHECK-NEXT: vpermilps $22, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x7d,0x04,0xd0,0x16]
; CHECK-NEXT: ## ymm2 = ymm0[2,1,1,0,6,5,5,4]		; CHECK-NEXT: ## ymm2 = ymm0[2,1,1,0,6,5,5,4]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermilps $22, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x04,0xc8,0x16]		; CHECK-NEXT: vpermilps $22, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x04,0xc8,0x16]
; CHECK-NEXT: ## ymm1 {%k1} = ymm0[2,1,1,0,6,5,5,4]		; CHECK-NEXT: ## ymm1 {%k1} = ymm0[2,1,1,0,6,5,5,4]
; CHECK-NEXT: vpermilps $22, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x04,0xc0,0x16]		; CHECK-NEXT: vpermilps $22, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x04,0xc0,0x16]
; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[2,1,1,0,6,5,5,4]		; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[2,1,1,0,6,5,5,4]
; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xc0]
; CHECK-NEXT: vaddps %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x58,0xc2]		; CHECK-NEXT: vaddps %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x58,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.vpermil.ps.256(<8 x float> %x0, i32 22, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.vpermil.ps.256(<8 x float> %x0, i32 22, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.vpermil.ps.256(<8 x float> %x0, i32 22, <8 x float> zeroinitializer, i8 %x3)		%res1 = call <8 x float> @llvm.x86.avx512.mask.vpermil.ps.256(<8 x float> %x0, i32 22, <8 x float> zeroinitializer, i8 %x3)
%res2 = call <8 x float> @llvm.x86.avx512.mask.vpermil.ps.256(<8 x float> %x0, i32 22, <8 x float> %x2, i8 -1)		%res2 = call <8 x float> @llvm.x86.avx512.mask.vpermil.ps.256(<8 x float> %x0, i32 22, <8 x float> %x2, i8 -1)
%res3 = fadd <8 x float> %res, %res1		%res3 = fadd <8 x float> %res, %res1
%res4 = fadd <8 x float> %res3, %res2		%res4 = fadd <8 x float> %res3, %res2
ret <8 x float> %res4		ret <8 x float> %res4
}		}

declare <4 x float> @llvm.x86.avx512.mask.vpermil.ps.128(<4 x float>, i32, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.vpermil.ps.128(<4 x float>, i32, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_vpermil_ps_128(<4 x float> %x0, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_vpermil_ps_128(<4 x float> %x0, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermil_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermil_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpermilps $22, %xmm0, %xmm2 ## encoding: [0x62,0xf3,0x7d,0x08,0x04,0xd0,0x16]		; CHECK-NEXT: vpermilps $22, %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x04,0xd0,0x16]
; CHECK-NEXT: ## xmm2 = xmm0[2,1,1,0]		; CHECK-NEXT: ## xmm2 = xmm0[2,1,1,0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermilps $22, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x04,0xc8,0x16]		; CHECK-NEXT: vpermilps $22, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x04,0xc8,0x16]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[2,1,1,0]		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[2,1,1,0]
; CHECK-NEXT: vpermilps $22, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0x89,0x04,0xc0,0x16]		; CHECK-NEXT: vpermilps $22, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0x89,0x04,0xc0,0x16]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[2,1,1,0]		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[2,1,1,0]
; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
; CHECK-NEXT: vaddps %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6c,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe8,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vpermil.ps.128(<4 x float> %x0, i32 22, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.vpermil.ps.128(<4 x float> %x0, i32 22, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.vpermil.ps.128(<4 x float> %x0, i32 22, <4 x float> zeroinitializer, i8 %x3)		%res1 = call <4 x float> @llvm.x86.avx512.mask.vpermil.ps.128(<4 x float> %x0, i32 22, <4 x float> zeroinitializer, i8 %x3)
%res2 = call <4 x float> @llvm.x86.avx512.mask.vpermil.ps.128(<4 x float> %x0, i32 22, <4 x float> %x2, i8 -1)		%res2 = call <4 x float> @llvm.x86.avx512.mask.vpermil.ps.128(<4 x float> %x0, i32 22, <4 x float> %x2, i8 -1)
%res3 = fadd <4 x float> %res, %res1		%res3 = fadd <4 x float> %res, %res1
%res4 = fadd <4 x float> %res2, %res3		%res4 = fadd <4 x float> %res2, %res3
ret <4 x float> %res4		ret <4 x float> %res4
}		}

declare <4 x double> @llvm.x86.avx512.mask.perm.df.256(<4 x double>, i32, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.perm.df.256(<4 x double>, i32, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_perm_df_256(<4 x double> %x0, i32 %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_perm_df_256(<4 x double> %x0, i32 %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_perm_df_256:		; CHECK-LABEL: test_int_x86_avx512_mask_perm_df_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpermpd $3, %ymm0, %ymm2 ## encoding: [0x62,0xf3,0xfd,0x28,0x01,0xd0,0x03]		; CHECK-NEXT: vpermpd $3, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0xfd,0x01,0xd0,0x03]
; CHECK-NEXT: ## ymm2 = ymm0[3,0,0,0]		; CHECK-NEXT: ## ymm2 = ymm0[3,0,0,0]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpermpd $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x01,0xc8,0x03]		; CHECK-NEXT: vpermpd $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x01,0xc8,0x03]
; CHECK-NEXT: ## ymm1 {%k1} = ymm0[3,0,0,0]		; CHECK-NEXT: ## ymm1 {%k1} = ymm0[3,0,0,0]
; CHECK-NEXT: vpermpd $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x01,0xc0,0x03]		; CHECK-NEXT: vpermpd $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x01,0xc0,0x03]
; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[3,0,0,0]		; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[3,0,0,0]
; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
; CHECK-NEXT: vaddpd %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x58,0xc2]		; CHECK-NEXT: vaddpd %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x58,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.perm.df.256(<4 x double> %x0, i32 3, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.perm.df.256(<4 x double> %x0, i32 3, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.perm.df.256(<4 x double> %x0, i32 3, <4 x double> zeroinitializer, i8 %x3)		%res1 = call <4 x double> @llvm.x86.avx512.mask.perm.df.256(<4 x double> %x0, i32 3, <4 x double> zeroinitializer, i8 %x3)
%res2 = call <4 x double> @llvm.x86.avx512.mask.perm.df.256(<4 x double> %x0, i32 3, <4 x double> %x2, i8 -1)		%res2 = call <4 x double> @llvm.x86.avx512.mask.perm.df.256(<4 x double> %x0, i32 3, <4 x double> %x2, i8 -1)
%res3 = fadd <4 x double> %res, %res1		%res3 = fadd <4 x double> %res, %res1
%res4 = fadd <4 x double> %res3, %res2		%res4 = fadd <4 x double> %res3, %res2
ret <4 x double> %res4		ret <4 x double> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.perm.di.256(<4 x i64>, i32, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.perm.di.256(<4 x i64>, i32, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_perm_di_256(<4 x i64> %x0, i32 %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_perm_di_256(<4 x i64> %x0, i32 %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_perm_di_256:		; CHECK-LABEL: test_int_x86_avx512_mask_perm_di_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpermq $3, %ymm0, %ymm2 ## encoding: [0x62,0xf3,0xfd,0x28,0x00,0xd0,0x03]		; CHECK-NEXT: vpermq $3, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0xfd,0x00,0xd0,0x03]
; CHECK-NEXT: ## ymm2 = ymm0[3,0,0,0]		; CHECK-NEXT: ## ymm2 = ymm0[3,0,0,0]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpermq $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x00,0xc8,0x03]		; CHECK-NEXT: vpermq $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x00,0xc8,0x03]
; CHECK-NEXT: ## ymm1 {%k1} = ymm0[3,0,0,0]		; CHECK-NEXT: ## ymm1 {%k1} = ymm0[3,0,0,0]
; CHECK-NEXT: vpermq $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x00,0xc0,0x03]		; CHECK-NEXT: vpermq $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x00,0xc0,0x03]
; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[3,0,0,0]		; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[3,0,0,0]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc2]		; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.perm.di.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.perm.di.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.perm.di.256(<4 x i64> %x0, i32 3, <4 x i64> zeroinitializer, i8 %x3)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.perm.di.256(<4 x i64> %x0, i32 3, <4 x i64> zeroinitializer, i8 %x3)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.perm.di.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.perm.di.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare void @llvm.x86.avx512.mask.store.pd.128(i8*, <2 x double>, i8)		declare void @llvm.x86.avx512.mask.store.pd.128(i8*, <2 x double>, i8)

define void@test_int_x86_avx512_mask_store_pd_128(i8* %ptr1, i8* %ptr2, <2 x double> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_store_pd_128(i8* %ptr1, i8* %ptr2, <2 x double> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_store_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_store_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovapd %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x29,0x07]		; CHECK-NEXT: vmovapd %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x29,0x07]
; CHECK-NEXT: vmovapd %xmm0, (%rsi) ## encoding: [0x62,0xf1,0xfd,0x08,0x29,0x06]		; CHECK-NEXT: vmovapd %xmm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x29,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.store.pd.128(i8* %ptr1, <2 x double> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.store.pd.128(i8* %ptr1, <2 x double> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.store.pd.128(i8* %ptr2, <2 x double> %x1, i8 -1)		call void @llvm.x86.avx512.mask.store.pd.128(i8* %ptr2, <2 x double> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.store.pd.256(i8*, <4 x double>, i8)		declare void @llvm.x86.avx512.mask.store.pd.256(i8*, <4 x double>, i8)

define void@test_int_x86_avx512_mask_store_pd_256(i8* %ptr1, i8* %ptr2, <4 x double> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_store_pd_256(i8* %ptr1, i8* %ptr2, <4 x double> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_store_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_store_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovapd %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x29,0x07]		; CHECK-NEXT: vmovapd %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x29,0x07]
; CHECK-NEXT: vmovapd %ymm0, (%rsi) ## encoding: [0x62,0xf1,0xfd,0x28,0x29,0x06]		; CHECK-NEXT: vmovapd %ymm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x29,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.store.pd.256(i8* %ptr1, <4 x double> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.store.pd.256(i8* %ptr1, <4 x double> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.store.pd.256(i8* %ptr2, <4 x double> %x1, i8 -1)		call void @llvm.x86.avx512.mask.store.pd.256(i8* %ptr2, <4 x double> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.storeu.pd.128(i8*, <2 x double>, i8)		declare void @llvm.x86.avx512.mask.storeu.pd.128(i8*, <2 x double>, i8)

define void@test_int_x86_avx512_mask_storeu_pd_128(i8* %ptr1, i8* %ptr2, <2 x double> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_storeu_pd_128(i8* %ptr1, i8* %ptr2, <2 x double> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_storeu_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_storeu_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovupd %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x11,0x07]		; CHECK-NEXT: vmovupd %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x11,0x07]
; CHECK-NEXT: vmovupd %xmm0, (%rsi) ## encoding: [0x62,0xf1,0xfd,0x08,0x11,0x06]		; CHECK-NEXT: vmovupd %xmm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x11,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.storeu.pd.128(i8* %ptr1, <2 x double> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.storeu.pd.128(i8* %ptr1, <2 x double> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.storeu.pd.128(i8* %ptr2, <2 x double> %x1, i8 -1)		call void @llvm.x86.avx512.mask.storeu.pd.128(i8* %ptr2, <2 x double> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.storeu.pd.256(i8*, <4 x double>, i8)		declare void @llvm.x86.avx512.mask.storeu.pd.256(i8*, <4 x double>, i8)

define void@test_int_x86_avx512_mask_storeu_pd_256(i8* %ptr1, i8* %ptr2, <4 x double> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_storeu_pd_256(i8* %ptr1, i8* %ptr2, <4 x double> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_storeu_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_storeu_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovupd %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x11,0x07]		; CHECK-NEXT: vmovupd %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x11,0x07]
; CHECK-NEXT: vmovupd %ymm0, (%rsi) ## encoding: [0x62,0xf1,0xfd,0x28,0x11,0x06]		; CHECK-NEXT: vmovupd %ymm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x11,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.storeu.pd.256(i8* %ptr1, <4 x double> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.storeu.pd.256(i8* %ptr1, <4 x double> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.storeu.pd.256(i8* %ptr2, <4 x double> %x1, i8 -1)		call void @llvm.x86.avx512.mask.storeu.pd.256(i8* %ptr2, <4 x double> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.store.ps.128(i8*, <4 x float>, i8)		declare void @llvm.x86.avx512.mask.store.ps.128(i8*, <4 x float>, i8)

define void@test_int_x86_avx512_mask_store_ps_128(i8* %ptr1, i8* %ptr2, <4 x float> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_store_ps_128(i8* %ptr1, i8* %ptr2, <4 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_store_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_store_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovaps %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x29,0x07]		; CHECK-NEXT: vmovaps %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x29,0x07]
; CHECK-NEXT: vmovaps %xmm0, (%rsi) ## encoding: [0x62,0xf1,0x7c,0x08,0x29,0x06]		; CHECK-NEXT: vmovaps %xmm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x29,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.store.ps.128(i8* %ptr1, <4 x float> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.store.ps.128(i8* %ptr1, <4 x float> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.store.ps.128(i8* %ptr2, <4 x float> %x1, i8 -1)		call void @llvm.x86.avx512.mask.store.ps.128(i8* %ptr2, <4 x float> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.store.ps.256(i8*, <8 x float>, i8)		declare void @llvm.x86.avx512.mask.store.ps.256(i8*, <8 x float>, i8)

define void@test_int_x86_avx512_mask_store_ps_256(i8* %ptr1, i8* %ptr2, <8 x float> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_store_ps_256(i8* %ptr1, i8* %ptr2, <8 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_store_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_store_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovaps %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x29,0x07]		; CHECK-NEXT: vmovaps %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x29,0x07]
; CHECK-NEXT: vmovaps %ymm0, (%rsi) ## encoding: [0x62,0xf1,0x7c,0x28,0x29,0x06]		; CHECK-NEXT: vmovaps %ymm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x29,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.store.ps.256(i8* %ptr1, <8 x float> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.store.ps.256(i8* %ptr1, <8 x float> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.store.ps.256(i8* %ptr2, <8 x float> %x1, i8 -1)		call void @llvm.x86.avx512.mask.store.ps.256(i8* %ptr2, <8 x float> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.storeu.ps.128(i8*, <4 x float>, i8)		declare void @llvm.x86.avx512.mask.storeu.ps.128(i8*, <4 x float>, i8)

define void@test_int_x86_avx512_mask_storeu_ps_128(i8* %ptr1, i8* %ptr2, <4 x float> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_storeu_ps_128(i8* %ptr1, i8* %ptr2, <4 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_storeu_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_storeu_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovups %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x11,0x07]		; CHECK-NEXT: vmovups %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x11,0x07]
; CHECK-NEXT: vmovups %xmm0, (%rsi) ## encoding: [0x62,0xf1,0x7c,0x08,0x11,0x06]		; CHECK-NEXT: vmovups %xmm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x11,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.storeu.ps.128(i8* %ptr1, <4 x float> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.storeu.ps.128(i8* %ptr1, <4 x float> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.storeu.ps.128(i8* %ptr2, <4 x float> %x1, i8 -1)		call void @llvm.x86.avx512.mask.storeu.ps.128(i8* %ptr2, <4 x float> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.storeu.ps.256(i8*, <8 x float>, i8)		declare void @llvm.x86.avx512.mask.storeu.ps.256(i8*, <8 x float>, i8)

define void@test_int_x86_avx512_mask_storeu_ps_256(i8* %ptr1, i8* %ptr2, <8 x float> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_storeu_ps_256(i8* %ptr1, i8* %ptr2, <8 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_storeu_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_storeu_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovups %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x11,0x07]		; CHECK-NEXT: vmovups %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x11,0x07]
; CHECK-NEXT: vmovups %ymm0, (%rsi) ## encoding: [0x62,0xf1,0x7c,0x28,0x11,0x06]		; CHECK-NEXT: vmovups %ymm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x11,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.storeu.ps.256(i8* %ptr1, <8 x float> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.storeu.ps.256(i8* %ptr1, <8 x float> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.storeu.ps.256(i8* %ptr2, <8 x float> %x1, i8 -1)		call void @llvm.x86.avx512.mask.storeu.ps.256(i8* %ptr2, <8 x float> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.storeu.q.128(i8*, <2 x i64>, i8)		declare void @llvm.x86.avx512.mask.storeu.q.128(i8*, <2 x i64>, i8)

define void@test_int_x86_avx512_mask_storeu_q_128(i8* %ptr1, i8* %ptr2, <2 x i64> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_storeu_q_128(i8* %ptr1, i8* %ptr2, <2 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_storeu_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_storeu_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu64 %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfe,0x09,0x7f,0x07]		; CHECK-NEXT: vmovdqu64 %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfe,0x09,0x7f,0x07]
; CHECK-NEXT: vmovdqu64 %xmm0, (%rsi) ## encoding: [0x62,0xf1,0xfe,0x08,0x7f,0x06]		; CHECK-NEXT: vmovdqu %xmm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7f,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.storeu.q.128(i8* %ptr1, <2 x i64> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.storeu.q.128(i8* %ptr1, <2 x i64> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.storeu.q.128(i8* %ptr2, <2 x i64> %x1, i8 -1)		call void @llvm.x86.avx512.mask.storeu.q.128(i8* %ptr2, <2 x i64> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.storeu.q.256(i8*, <4 x i64>, i8)		declare void @llvm.x86.avx512.mask.storeu.q.256(i8*, <4 x i64>, i8)

define void@test_int_x86_avx512_mask_storeu_q_256(i8* %ptr1, i8* %ptr2, <4 x i64> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_storeu_q_256(i8* %ptr1, i8* %ptr2, <4 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_storeu_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_storeu_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu64 %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfe,0x29,0x7f,0x07]		; CHECK-NEXT: vmovdqu64 %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfe,0x29,0x7f,0x07]
; CHECK-NEXT: vmovdqu64 %ymm0, (%rsi) ## encoding: [0x62,0xf1,0xfe,0x28,0x7f,0x06]		; CHECK-NEXT: vmovdqu %ymm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x7f,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.storeu.q.256(i8* %ptr1, <4 x i64> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.storeu.q.256(i8* %ptr1, <4 x i64> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.storeu.q.256(i8* %ptr2, <4 x i64> %x1, i8 -1)		call void @llvm.x86.avx512.mask.storeu.q.256(i8* %ptr2, <4 x i64> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.storeu.d.128(i8*, <4 x i32>, i8)		declare void @llvm.x86.avx512.mask.storeu.d.128(i8*, <4 x i32>, i8)

define void@test_int_x86_avx512_mask_storeu_d_128(i8* %ptr1, i8* %ptr2, <4 x i32> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_storeu_d_128(i8* %ptr1, i8* %ptr2, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_storeu_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_storeu_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu32 %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x7f,0x07]		; CHECK-NEXT: vmovdqu32 %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x7f,0x07]
; CHECK-NEXT: vmovdqu32 %xmm0, (%rsi) ## encoding: [0x62,0xf1,0x7e,0x08,0x7f,0x06]		; CHECK-NEXT: vmovdqu %xmm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7f,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.storeu.d.128(i8* %ptr1, <4 x i32> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.storeu.d.128(i8* %ptr1, <4 x i32> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.storeu.d.128(i8* %ptr2, <4 x i32> %x1, i8 -1)		call void @llvm.x86.avx512.mask.storeu.d.128(i8* %ptr2, <4 x i32> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.storeu.d.256(i8*, <8 x i32>, i8)		declare void @llvm.x86.avx512.mask.storeu.d.256(i8*, <8 x i32>, i8)

define void@test_int_x86_avx512_mask_storeu_d_256(i8* %ptr1, i8* %ptr2, <8 x i32> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_storeu_d_256(i8* %ptr1, i8* %ptr2, <8 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_storeu_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_storeu_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu32 %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x7f,0x07]		; CHECK-NEXT: vmovdqu32 %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x7f,0x07]
; CHECK-NEXT: vmovdqu32 %ymm0, (%rsi) ## encoding: [0x62,0xf1,0x7e,0x28,0x7f,0x06]		; CHECK-NEXT: vmovdqu %ymm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x7f,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.storeu.d.256(i8* %ptr1, <8 x i32> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.storeu.d.256(i8* %ptr1, <8 x i32> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.storeu.d.256(i8* %ptr2, <8 x i32> %x1, i8 -1)		call void @llvm.x86.avx512.mask.storeu.d.256(i8* %ptr2, <8 x i32> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.store.q.128(i8*, <2 x i64>, i8)		declare void @llvm.x86.avx512.mask.store.q.128(i8*, <2 x i64>, i8)

define void@test_int_x86_avx512_mask_store_q_128(i8* %ptr1, i8* %ptr2, <2 x i64> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_store_q_128(i8* %ptr1, i8* %ptr2, <2 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_store_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_store_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqa64 %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x7f,0x07]		; CHECK-NEXT: vmovdqa64 %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x7f,0x07]
; CHECK-NEXT: vmovdqa64 %xmm0, (%rsi) ## encoding: [0x62,0xf1,0xfd,0x08,0x7f,0x06]		; CHECK-NEXT: vmovdqa %xmm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x7f,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.store.q.128(i8* %ptr1, <2 x i64> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.store.q.128(i8* %ptr1, <2 x i64> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.store.q.128(i8* %ptr2, <2 x i64> %x1, i8 -1)		call void @llvm.x86.avx512.mask.store.q.128(i8* %ptr2, <2 x i64> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.store.q.256(i8*, <4 x i64>, i8)		declare void @llvm.x86.avx512.mask.store.q.256(i8*, <4 x i64>, i8)

define void@test_int_x86_avx512_mask_store_q_256(i8* %ptr1, i8* %ptr2, <4 x i64> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_store_q_256(i8* %ptr1, i8* %ptr2, <4 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_store_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_store_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqa64 %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x7f,0x07]		; CHECK-NEXT: vmovdqa64 %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x7f,0x07]
; CHECK-NEXT: vmovdqa64 %ymm0, (%rsi) ## encoding: [0x62,0xf1,0xfd,0x28,0x7f,0x06]		; CHECK-NEXT: vmovdqa %ymm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x7f,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.store.q.256(i8* %ptr1, <4 x i64> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.store.q.256(i8* %ptr1, <4 x i64> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.store.q.256(i8* %ptr2, <4 x i64> %x1, i8 -1)		call void @llvm.x86.avx512.mask.store.q.256(i8* %ptr2, <4 x i64> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.store.d.128(i8*, <4 x i32>, i8)		declare void @llvm.x86.avx512.mask.store.d.128(i8*, <4 x i32>, i8)

define void@test_int_x86_avx512_mask_store_d_128(i8* %ptr1, i8* %ptr2, <4 x i32> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_store_d_128(i8* %ptr1, i8* %ptr2, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_store_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_store_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqa32 %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x7f,0x07]		; CHECK-NEXT: vmovdqa32 %xmm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x7f,0x07]
; CHECK-NEXT: vmovdqa32 %xmm0, (%rsi) ## encoding: [0x62,0xf1,0x7d,0x08,0x7f,0x06]		; CHECK-NEXT: vmovdqa %xmm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x7f,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.store.d.128(i8* %ptr1, <4 x i32> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.store.d.128(i8* %ptr1, <4 x i32> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.store.d.128(i8* %ptr2, <4 x i32> %x1, i8 -1)		call void @llvm.x86.avx512.mask.store.d.128(i8* %ptr2, <4 x i32> %x1, i8 -1)
ret void		ret void
}		}

declare void @llvm.x86.avx512.mask.store.d.256(i8*, <8 x i32>, i8)		declare void @llvm.x86.avx512.mask.store.d.256(i8*, <8 x i32>, i8)

define void@test_int_x86_avx512_mask_store_d_256(i8* %ptr1, i8* %ptr2, <8 x i32> %x1, i8 %x2) {		define void@test_int_x86_avx512_mask_store_d_256(i8* %ptr1, i8* %ptr2, <8 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_store_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_store_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqa32 %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x7f,0x07]		; CHECK-NEXT: vmovdqa32 %ymm0, (%rdi) {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x7f,0x07]
; CHECK-NEXT: vmovdqa32 %ymm0, (%rsi) ## encoding: [0x62,0xf1,0x7d,0x28,0x7f,0x06]		; CHECK-NEXT: vmovdqa %ymm0, (%rsi) ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x7f,0x06]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
call void @llvm.x86.avx512.mask.store.d.256(i8* %ptr1, <8 x i32> %x1, i8 %x2)		call void @llvm.x86.avx512.mask.store.d.256(i8* %ptr1, <8 x i32> %x1, i8 %x2)
call void @llvm.x86.avx512.mask.store.d.256(i8* %ptr2, <8 x i32> %x1, i8 -1)		call void @llvm.x86.avx512.mask.store.d.256(i8* %ptr2, <8 x i32> %x1, i8 -1)
ret void		ret void
}		}

define <8 x float> @test_mask_load_aligned_ps_256(<8 x float> %data, i8* %ptr, i8 %mask) {		define <8 x float> @test_mask_load_aligned_ps_256(<8 x float> %data, i8* %ptr, i8 %mask) {
; CHECK-LABEL: test_mask_load_aligned_ps_256:		; CHECK-LABEL: test_mask_load_aligned_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovaps (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0x07]		; CHECK-NEXT: vmovaps (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0x07]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vmovaps (%rdi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x28,0x07]		; CHECK-NEXT: vmovaps (%rdi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x28,0x07]
; CHECK-NEXT: vmovaps (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x28,0x0f]		; CHECK-NEXT: vmovaps (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x28,0x0f]
; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.load.ps.256(i8* %ptr, <8 x float> zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.load.ps.256(i8* %ptr, <8 x float> zeroinitializer, i8 -1)
%res1 = call <8 x float> @llvm.x86.avx512.mask.load.ps.256(i8* %ptr, <8 x float> %res, i8 %mask)		%res1 = call <8 x float> @llvm.x86.avx512.mask.load.ps.256(i8* %ptr, <8 x float> %res, i8 %mask)
%res2 = call <8 x float> @llvm.x86.avx512.mask.load.ps.256(i8* %ptr, <8 x float> zeroinitializer, i8 %mask)		%res2 = call <8 x float> @llvm.x86.avx512.mask.load.ps.256(i8* %ptr, <8 x float> zeroinitializer, i8 %mask)
%res4 = fadd <8 x float> %res2, %res1		%res4 = fadd <8 x float> %res2, %res1
ret <8 x float> %res4		ret <8 x float> %res4
}		}

declare <8 x float> @llvm.x86.avx512.mask.load.ps.256(i8*, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.load.ps.256(i8*, <8 x float>, i8)

define <8 x float> @test_mask_load_unaligned_ps_256(<8 x float> %data, i8* %ptr, i8 %mask) {		define <8 x float> @test_mask_load_unaligned_ps_256(<8 x float> %data, i8* %ptr, i8 %mask) {
; CHECK-LABEL: test_mask_load_unaligned_ps_256:		; CHECK-LABEL: test_mask_load_unaligned_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovups (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x10,0x07]		; CHECK-NEXT: vmovups (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x10,0x07]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vmovups (%rdi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x10,0x07]		; CHECK-NEXT: vmovups (%rdi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x10,0x07]
; CHECK-NEXT: vmovups (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x10,0x0f]		; CHECK-NEXT: vmovups (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x10,0x0f]
; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.loadu.ps.256(i8* %ptr, <8 x float> zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.loadu.ps.256(i8* %ptr, <8 x float> zeroinitializer, i8 -1)
%res1 = call <8 x float> @llvm.x86.avx512.mask.loadu.ps.256(i8* %ptr, <8 x float> %res, i8 %mask)		%res1 = call <8 x float> @llvm.x86.avx512.mask.loadu.ps.256(i8* %ptr, <8 x float> %res, i8 %mask)
%res2 = call <8 x float> @llvm.x86.avx512.mask.loadu.ps.256(i8* %ptr, <8 x float> zeroinitializer, i8 %mask)		%res2 = call <8 x float> @llvm.x86.avx512.mask.loadu.ps.256(i8* %ptr, <8 x float> zeroinitializer, i8 %mask)
%res4 = fadd <8 x float> %res2, %res1		%res4 = fadd <8 x float> %res2, %res1
ret <8 x float> %res4		ret <8 x float> %res4
}		}

declare <8 x float> @llvm.x86.avx512.mask.loadu.ps.256(i8*, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.loadu.ps.256(i8*, <8 x float>, i8)

define <4 x double> @test_mask_load_aligned_pd_256(<4 x double> %data, i8* %ptr, i8 %mask) {		define <4 x double> @test_mask_load_aligned_pd_256(<4 x double> %data, i8* %ptr, i8 %mask) {
; CHECK-LABEL: test_mask_load_aligned_pd_256:		; CHECK-LABEL: test_mask_load_aligned_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovapd (%rdi), %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0x07]		; CHECK-NEXT: vmovapd (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0x07]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vmovapd (%rdi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x28,0x07]		; CHECK-NEXT: vmovapd (%rdi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x28,0x07]
; CHECK-NEXT: vmovapd (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x28,0x0f]		; CHECK-NEXT: vmovapd (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x28,0x0f]
; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.load.pd.256(i8* %ptr, <4 x double> zeroinitializer, i8 -1)		%res = call <4 x double> @llvm.x86.avx512.mask.load.pd.256(i8* %ptr, <4 x double> zeroinitializer, i8 -1)
%res1 = call <4 x double> @llvm.x86.avx512.mask.load.pd.256(i8* %ptr, <4 x double> %res, i8 %mask)		%res1 = call <4 x double> @llvm.x86.avx512.mask.load.pd.256(i8* %ptr, <4 x double> %res, i8 %mask)
%res2 = call <4 x double> @llvm.x86.avx512.mask.load.pd.256(i8* %ptr, <4 x double> zeroinitializer, i8 %mask)		%res2 = call <4 x double> @llvm.x86.avx512.mask.load.pd.256(i8* %ptr, <4 x double> zeroinitializer, i8 %mask)
%res4 = fadd <4 x double> %res2, %res1		%res4 = fadd <4 x double> %res2, %res1
ret <4 x double> %res4		ret <4 x double> %res4
}		}

declare <4 x double> @llvm.x86.avx512.mask.load.pd.256(i8*, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.load.pd.256(i8*, <4 x double>, i8)

define <4 x double> @test_mask_load_unaligned_pd_256(<4 x double> %data, i8* %ptr, i8 %mask) {		define <4 x double> @test_mask_load_unaligned_pd_256(<4 x double> %data, i8* %ptr, i8 %mask) {
; CHECK-LABEL: test_mask_load_unaligned_pd_256:		; CHECK-LABEL: test_mask_load_unaligned_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovupd (%rdi), %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x10,0x07]		; CHECK-NEXT: vmovupd (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x10,0x07]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vmovupd (%rdi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x10,0x07]		; CHECK-NEXT: vmovupd (%rdi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x10,0x07]
; CHECK-NEXT: vmovupd (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x10,0x0f]		; CHECK-NEXT: vmovupd (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x10,0x0f]
; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.loadu.pd.256(i8* %ptr, <4 x double> zeroinitializer, i8 -1)		%res = call <4 x double> @llvm.x86.avx512.mask.loadu.pd.256(i8* %ptr, <4 x double> zeroinitializer, i8 -1)
%res1 = call <4 x double> @llvm.x86.avx512.mask.loadu.pd.256(i8* %ptr, <4 x double> %res, i8 %mask)		%res1 = call <4 x double> @llvm.x86.avx512.mask.loadu.pd.256(i8* %ptr, <4 x double> %res, i8 %mask)
%res2 = call <4 x double> @llvm.x86.avx512.mask.loadu.pd.256(i8* %ptr, <4 x double> zeroinitializer, i8 %mask)		%res2 = call <4 x double> @llvm.x86.avx512.mask.loadu.pd.256(i8* %ptr, <4 x double> zeroinitializer, i8 %mask)
%res4 = fadd <4 x double> %res2, %res1		%res4 = fadd <4 x double> %res2, %res1
ret <4 x double> %res4		ret <4 x double> %res4
}		}

declare <4 x double> @llvm.x86.avx512.mask.loadu.pd.256(i8*, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.loadu.pd.256(i8*, <4 x double>, i8)

define <4 x float> @test_mask_load_aligned_ps_128(<4 x float> %data, i8* %ptr, i8 %mask) {		define <4 x float> @test_mask_load_aligned_ps_128(<4 x float> %data, i8* %ptr, i8 %mask) {
; CHECK-LABEL: test_mask_load_aligned_ps_128:		; CHECK-LABEL: test_mask_load_aligned_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovaps (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0x07]		; CHECK-NEXT: vmovaps (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0x07]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vmovaps (%rdi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x28,0x07]		; CHECK-NEXT: vmovaps (%rdi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x28,0x07]
; CHECK-NEXT: vmovaps (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x28,0x0f]		; CHECK-NEXT: vmovaps (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x28,0x0f]
; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.load.ps.128(i8* %ptr, <4 x float> zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.load.ps.128(i8* %ptr, <4 x float> zeroinitializer, i8 -1)
%res1 = call <4 x float> @llvm.x86.avx512.mask.load.ps.128(i8* %ptr, <4 x float> %res, i8 %mask)		%res1 = call <4 x float> @llvm.x86.avx512.mask.load.ps.128(i8* %ptr, <4 x float> %res, i8 %mask)
%res2 = call <4 x float> @llvm.x86.avx512.mask.load.ps.128(i8* %ptr, <4 x float> zeroinitializer, i8 %mask)		%res2 = call <4 x float> @llvm.x86.avx512.mask.load.ps.128(i8* %ptr, <4 x float> zeroinitializer, i8 %mask)
%res4 = fadd <4 x float> %res2, %res1		%res4 = fadd <4 x float> %res2, %res1
ret <4 x float> %res4		ret <4 x float> %res4
}		}

declare <4 x float> @llvm.x86.avx512.mask.load.ps.128(i8*, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.load.ps.128(i8*, <4 x float>, i8)

define <4 x float> @test_mask_load_unaligned_ps_128(<4 x float> %data, i8* %ptr, i8 %mask) {		define <4 x float> @test_mask_load_unaligned_ps_128(<4 x float> %data, i8* %ptr, i8 %mask) {
; CHECK-LABEL: test_mask_load_unaligned_ps_128:		; CHECK-LABEL: test_mask_load_unaligned_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovups (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x10,0x07]		; CHECK-NEXT: vmovups (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x10,0x07]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vmovups (%rdi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x10,0x07]		; CHECK-NEXT: vmovups (%rdi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x10,0x07]
; CHECK-NEXT: vmovups (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x10,0x0f]		; CHECK-NEXT: vmovups (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x10,0x0f]
; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.loadu.ps.128(i8* %ptr, <4 x float> zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.loadu.ps.128(i8* %ptr, <4 x float> zeroinitializer, i8 -1)
%res1 = call <4 x float> @llvm.x86.avx512.mask.loadu.ps.128(i8* %ptr, <4 x float> %res, i8 %mask)		%res1 = call <4 x float> @llvm.x86.avx512.mask.loadu.ps.128(i8* %ptr, <4 x float> %res, i8 %mask)
%res2 = call <4 x float> @llvm.x86.avx512.mask.loadu.ps.128(i8* %ptr, <4 x float> zeroinitializer, i8 %mask)		%res2 = call <4 x float> @llvm.x86.avx512.mask.loadu.ps.128(i8* %ptr, <4 x float> zeroinitializer, i8 %mask)
%res4 = fadd <4 x float> %res2, %res1		%res4 = fadd <4 x float> %res2, %res1
ret <4 x float> %res4		ret <4 x float> %res4
}		}

declare <4 x float> @llvm.x86.avx512.mask.loadu.ps.128(i8*, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.loadu.ps.128(i8*, <4 x float>, i8)

define <2 x double> @test_mask_load_aligned_pd_128(<2 x double> %data, i8* %ptr, i8 %mask) {		define <2 x double> @test_mask_load_aligned_pd_128(<2 x double> %data, i8* %ptr, i8 %mask) {
; CHECK-LABEL: test_mask_load_aligned_pd_128:		; CHECK-LABEL: test_mask_load_aligned_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovapd (%rdi), %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0x07]		; CHECK-NEXT: vmovapd (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0x07]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vmovapd (%rdi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x28,0x07]		; CHECK-NEXT: vmovapd (%rdi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x28,0x07]
; CHECK-NEXT: vmovapd (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x28,0x0f]		; CHECK-NEXT: vmovapd (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x28,0x0f]
; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]		; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.load.pd.128(i8* %ptr, <2 x double> zeroinitializer, i8 -1)		%res = call <2 x double> @llvm.x86.avx512.mask.load.pd.128(i8* %ptr, <2 x double> zeroinitializer, i8 -1)
%res1 = call <2 x double> @llvm.x86.avx512.mask.load.pd.128(i8* %ptr, <2 x double> %res, i8 %mask)		%res1 = call <2 x double> @llvm.x86.avx512.mask.load.pd.128(i8* %ptr, <2 x double> %res, i8 %mask)
%res2 = call <2 x double> @llvm.x86.avx512.mask.load.pd.128(i8* %ptr, <2 x double> zeroinitializer, i8 %mask)		%res2 = call <2 x double> @llvm.x86.avx512.mask.load.pd.128(i8* %ptr, <2 x double> zeroinitializer, i8 %mask)
%res4 = fadd <2 x double> %res2, %res1		%res4 = fadd <2 x double> %res2, %res1
ret <2 x double> %res4		ret <2 x double> %res4
}		}

declare <2 x double> @llvm.x86.avx512.mask.load.pd.128(i8*, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.load.pd.128(i8*, <2 x double>, i8)

define <2 x double> @test_mask_load_unaligned_pd_128(<2 x double> %data, i8* %ptr, i8 %mask) {		define <2 x double> @test_mask_load_unaligned_pd_128(<2 x double> %data, i8* %ptr, i8 %mask) {
; CHECK-LABEL: test_mask_load_unaligned_pd_128:		; CHECK-LABEL: test_mask_load_unaligned_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovupd (%rdi), %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x10,0x07]		; CHECK-NEXT: vmovupd (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x10,0x07]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vmovupd (%rdi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x10,0x07]		; CHECK-NEXT: vmovupd (%rdi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x10,0x07]
; CHECK-NEXT: vmovupd (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x10,0x0f]		; CHECK-NEXT: vmovupd (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x10,0x0f]
; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]		; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.loadu.pd.128(i8* %ptr, <2 x double> zeroinitializer, i8 -1)		%res = call <2 x double> @llvm.x86.avx512.mask.loadu.pd.128(i8* %ptr, <2 x double> zeroinitializer, i8 -1)
%res1 = call <2 x double> @llvm.x86.avx512.mask.loadu.pd.128(i8* %ptr, <2 x double> %res, i8 %mask)		%res1 = call <2 x double> @llvm.x86.avx512.mask.loadu.pd.128(i8* %ptr, <2 x double> %res, i8 %mask)
%res2 = call <2 x double> @llvm.x86.avx512.mask.loadu.pd.128(i8* %ptr, <2 x double> zeroinitializer, i8 %mask)		%res2 = call <2 x double> @llvm.x86.avx512.mask.loadu.pd.128(i8* %ptr, <2 x double> zeroinitializer, i8 %mask)
%res4 = fadd <2 x double> %res2, %res1		%res4 = fadd <2 x double> %res2, %res1
ret <2 x double> %res4		ret <2 x double> %res4
}		}

declare <2 x double> @llvm.x86.avx512.mask.loadu.pd.128(i8*, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.loadu.pd.128(i8*, <2 x double>, i8)

declare <4 x i32> @llvm.x86.avx512.mask.loadu.d.128(i8*, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.loadu.d.128(i8*, <4 x i32>, i8)

define <4 x i32> @test_mask_load_unaligned_d_128(i8* %ptr, i8* %ptr2, <4 x i32> %data, i8 %mask) {		define <4 x i32> @test_mask_load_unaligned_d_128(i8* %ptr, i8* %ptr2, <4 x i32> %data, i8 %mask) {
; CHECK-LABEL: test_mask_load_unaligned_d_128:		; CHECK-LABEL: test_mask_load_unaligned_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqu32 (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7e,0x08,0x6f,0x07]		; CHECK-NEXT: vmovdqu (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x6f,0x07]
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu32 (%rsi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x6f,0x06]		; CHECK-NEXT: vmovdqu32 (%rsi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x6f,0x06]
; CHECK-NEXT: vmovdqu32 (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0x89,0x6f,0x0f]		; CHECK-NEXT: vmovdqu32 (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0x89,0x6f,0x0f]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.loadu.d.128(i8* %ptr, <4 x i32> zeroinitializer, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.loadu.d.128(i8* %ptr, <4 x i32> zeroinitializer, i8 -1)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.loadu.d.128(i8* %ptr2, <4 x i32> %res, i8 %mask)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.loadu.d.128(i8* %ptr2, <4 x i32> %res, i8 %mask)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.loadu.d.128(i8* %ptr, <4 x i32> zeroinitializer, i8 %mask)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.loadu.d.128(i8* %ptr, <4 x i32> zeroinitializer, i8 %mask)
%res4 = add <4 x i32> %res2, %res1		%res4 = add <4 x i32> %res2, %res1
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.loadu.d.256(i8*, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.loadu.d.256(i8*, <8 x i32>, i8)

define <8 x i32> @test_mask_load_unaligned_d_256(i8* %ptr, i8* %ptr2, <8 x i32> %data, i8 %mask) {		define <8 x i32> @test_mask_load_unaligned_d_256(i8* %ptr, i8* %ptr2, <8 x i32> %data, i8 %mask) {
; CHECK-LABEL: test_mask_load_unaligned_d_256:		; CHECK-LABEL: test_mask_load_unaligned_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqu32 (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7e,0x28,0x6f,0x07]		; CHECK-NEXT: vmovdqu (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x6f,0x07]
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu32 (%rsi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x6f,0x06]		; CHECK-NEXT: vmovdqu32 (%rsi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x6f,0x06]
; CHECK-NEXT: vmovdqu32 (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0xa9,0x6f,0x0f]		; CHECK-NEXT: vmovdqu32 (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0xa9,0x6f,0x0f]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.loadu.d.256(i8* %ptr, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.loadu.d.256(i8* %ptr, <8 x i32> zeroinitializer, i8 -1)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.loadu.d.256(i8* %ptr2, <8 x i32> %res, i8 %mask)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.loadu.d.256(i8* %ptr2, <8 x i32> %res, i8 %mask)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.loadu.d.256(i8* %ptr, <8 x i32> zeroinitializer, i8 %mask)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.loadu.d.256(i8* %ptr, <8 x i32> zeroinitializer, i8 %mask)
%res4 = add <8 x i32> %res2, %res1		%res4 = add <8 x i32> %res2, %res1
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.loadu.q.128(i8*, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.loadu.q.128(i8*, <2 x i64>, i8)

define <2 x i64> @test_mask_load_unaligned_q_128(i8* %ptr, i8* %ptr2, <2 x i64> %data, i8 %mask) {		define <2 x i64> @test_mask_load_unaligned_q_128(i8* %ptr, i8* %ptr2, <2 x i64> %data, i8 %mask) {
; CHECK-LABEL: test_mask_load_unaligned_q_128:		; CHECK-LABEL: test_mask_load_unaligned_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqu64 (%rdi), %xmm0 ## encoding: [0x62,0xf1,0xfe,0x08,0x6f,0x07]		; CHECK-NEXT: vmovdqu (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x6f,0x07]
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu64 (%rsi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0xfe,0x09,0x6f,0x06]		; CHECK-NEXT: vmovdqu64 (%rsi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0xfe,0x09,0x6f,0x06]
; CHECK-NEXT: vmovdqu64 (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfe,0x89,0x6f,0x0f]		; CHECK-NEXT: vmovdqu64 (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfe,0x89,0x6f,0x0f]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.loadu.q.128(i8* %ptr, <2 x i64> zeroinitializer, i8 -1)		%res = call <2 x i64> @llvm.x86.avx512.mask.loadu.q.128(i8* %ptr, <2 x i64> zeroinitializer, i8 -1)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.loadu.q.128(i8* %ptr2, <2 x i64> %res, i8 %mask)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.loadu.q.128(i8* %ptr2, <2 x i64> %res, i8 %mask)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.loadu.q.128(i8* %ptr, <2 x i64> zeroinitializer, i8 %mask)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.loadu.q.128(i8* %ptr, <2 x i64> zeroinitializer, i8 %mask)
%res4 = add <2 x i64> %res2, %res1		%res4 = add <2 x i64> %res2, %res1
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.loadu.q.256(i8*, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.loadu.q.256(i8*, <4 x i64>, i8)

define <4 x i64> @test_mask_load_unaligned_q_256(i8* %ptr, i8* %ptr2, <4 x i64> %data, i8 %mask) {		define <4 x i64> @test_mask_load_unaligned_q_256(i8* %ptr, i8* %ptr2, <4 x i64> %data, i8 %mask) {
; CHECK-LABEL: test_mask_load_unaligned_q_256:		; CHECK-LABEL: test_mask_load_unaligned_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqu64 (%rdi), %ymm0 ## encoding: [0x62,0xf1,0xfe,0x28,0x6f,0x07]		; CHECK-NEXT: vmovdqu (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x6f,0x07]
; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]		; CHECK-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]
; CHECK-NEXT: vmovdqu64 (%rsi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0xfe,0x29,0x6f,0x06]		; CHECK-NEXT: vmovdqu64 (%rsi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0xfe,0x29,0x6f,0x06]
; CHECK-NEXT: vmovdqu64 (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfe,0xa9,0x6f,0x0f]		; CHECK-NEXT: vmovdqu64 (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfe,0xa9,0x6f,0x0f]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.loadu.q.256(i8* %ptr, <4 x i64> zeroinitializer, i8 -1)		%res = call <4 x i64> @llvm.x86.avx512.mask.loadu.q.256(i8* %ptr, <4 x i64> zeroinitializer, i8 -1)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.loadu.q.256(i8* %ptr2, <4 x i64> %res, i8 %mask)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.loadu.q.256(i8* %ptr2, <4 x i64> %res, i8 %mask)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.loadu.q.256(i8* %ptr, <4 x i64> zeroinitializer, i8 %mask)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.loadu.q.256(i8* %ptr, <4 x i64> zeroinitializer, i8 %mask)
%res4 = add <4 x i64> %res2, %res1		%res4 = add <4 x i64> %res2, %res1
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.load.d.128(i8*, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.load.d.128(i8*, <4 x i32>, i8)

define <4 x i32> @test_mask_load_aligned_d_128(<4 x i32> %data, i8* %ptr, i8 %mask) {		define <4 x i32> @test_mask_load_aligned_d_128(<4 x i32> %data, i8* %ptr, i8 %mask) {
; CHECK-LABEL: test_mask_load_aligned_d_128:		; CHECK-LABEL: test_mask_load_aligned_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqa32 (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6f,0x07]		; CHECK-NEXT: vmovdqa (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0x07]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vmovdqa32 (%rdi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x6f,0x07]		; CHECK-NEXT: vmovdqa32 (%rdi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x6f,0x07]
; CHECK-NEXT: vmovdqa32 (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x6f,0x0f]		; CHECK-NEXT: vmovdqa32 (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x6f,0x0f]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.load.d.128(i8* %ptr, <4 x i32> zeroinitializer, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.load.d.128(i8* %ptr, <4 x i32> zeroinitializer, i8 -1)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.load.d.128(i8* %ptr, <4 x i32> %res, i8 %mask)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.load.d.128(i8* %ptr, <4 x i32> %res, i8 %mask)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.load.d.128(i8* %ptr, <4 x i32> zeroinitializer, i8 %mask)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.load.d.128(i8* %ptr, <4 x i32> zeroinitializer, i8 %mask)
%res4 = add <4 x i32> %res2, %res1		%res4 = add <4 x i32> %res2, %res1
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.load.d.256(i8*, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.load.d.256(i8*, <8 x i32>, i8)

define <8 x i32> @test_mask_load_aligned_d_256(<8 x i32> %data, i8* %ptr, i8 %mask) {		define <8 x i32> @test_mask_load_aligned_d_256(<8 x i32> %data, i8* %ptr, i8 %mask) {
; CHECK-LABEL: test_mask_load_aligned_d_256:		; CHECK-LABEL: test_mask_load_aligned_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqa32 (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x6f,0x07]		; CHECK-NEXT: vmovdqa (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0x07]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vmovdqa32 (%rdi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x6f,0x07]		; CHECK-NEXT: vmovdqa32 (%rdi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x6f,0x07]
; CHECK-NEXT: vmovdqa32 (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x6f,0x0f]		; CHECK-NEXT: vmovdqa32 (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x6f,0x0f]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.load.d.256(i8* %ptr, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.load.d.256(i8* %ptr, <8 x i32> zeroinitializer, i8 -1)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.load.d.256(i8* %ptr, <8 x i32> %res, i8 %mask)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.load.d.256(i8* %ptr, <8 x i32> %res, i8 %mask)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.load.d.256(i8* %ptr, <8 x i32> zeroinitializer, i8 %mask)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.load.d.256(i8* %ptr, <8 x i32> zeroinitializer, i8 %mask)
%res4 = add <8 x i32> %res2, %res1		%res4 = add <8 x i32> %res2, %res1
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.load.q.128(i8*, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.load.q.128(i8*, <2 x i64>, i8)

define <2 x i64> @test_mask_load_aligned_q_128(<2 x i64> %data, i8* %ptr, i8 %mask) {		define <2 x i64> @test_mask_load_aligned_q_128(<2 x i64> %data, i8* %ptr, i8 %mask) {
; CHECK-LABEL: test_mask_load_aligned_q_128:		; CHECK-LABEL: test_mask_load_aligned_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqa64 (%rdi), %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0x07]		; CHECK-NEXT: vmovdqa (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0x07]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vmovdqa64 (%rdi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x6f,0x07]		; CHECK-NEXT: vmovdqa64 (%rdi), %xmm0 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x6f,0x07]
; CHECK-NEXT: vmovdqa64 (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x6f,0x0f]		; CHECK-NEXT: vmovdqa64 (%rdi), %xmm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x6f,0x0f]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.load.q.128(i8* %ptr, <2 x i64> zeroinitializer, i8 -1)		%res = call <2 x i64> @llvm.x86.avx512.mask.load.q.128(i8* %ptr, <2 x i64> zeroinitializer, i8 -1)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.load.q.128(i8* %ptr, <2 x i64> %res, i8 %mask)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.load.q.128(i8* %ptr, <2 x i64> %res, i8 %mask)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.load.q.128(i8* %ptr, <2 x i64> zeroinitializer, i8 %mask)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.load.q.128(i8* %ptr, <2 x i64> zeroinitializer, i8 %mask)
%res4 = add <2 x i64> %res2, %res1		%res4 = add <2 x i64> %res2, %res1
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.load.q.256(i8*, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.load.q.256(i8*, <4 x i64>, i8)

define <4 x i64> @test_mask_load_aligned_q_256(<4 x i64> %data, i8* %ptr, i8 %mask) {		define <4 x i64> @test_mask_load_aligned_q_256(<4 x i64> %data, i8* %ptr, i8 %mask) {
; CHECK-LABEL: test_mask_load_aligned_q_256:		; CHECK-LABEL: test_mask_load_aligned_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqa64 (%rdi), %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0x07]		; CHECK-NEXT: vmovdqa (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0x07]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vmovdqa64 (%rdi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x6f,0x07]		; CHECK-NEXT: vmovdqa64 (%rdi), %ymm0 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x6f,0x07]
; CHECK-NEXT: vmovdqa64 (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x6f,0x0f]		; CHECK-NEXT: vmovdqa64 (%rdi), %ymm1 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x6f,0x0f]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.load.q.256(i8* %ptr, <4 x i64> zeroinitializer, i8 -1)		%res = call <4 x i64> @llvm.x86.avx512.mask.load.q.256(i8* %ptr, <4 x i64> zeroinitializer, i8 -1)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.load.q.256(i8* %ptr, <4 x i64> %res, i8 %mask)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.load.q.256(i8* %ptr, <4 x i64> %res, i8 %mask)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.load.q.256(i8* %ptr, <4 x i64> zeroinitializer, i8 %mask)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.load.q.256(i8* %ptr, <4 x i64> zeroinitializer, i8 %mask)
%res4 = add <4 x i64> %res2, %res1		%res4 = add <4 x i64> %res2, %res1
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pshuf.d.128(<4 x i32>, i32, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pshuf.d.128(<4 x i32>, i32, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pshuf_d_128(<4 x i32> %x0, i32 %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_pshuf_d_128(<4 x i32> %x0, i32 %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pshuf_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pshuf_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpshufd $3, %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x7d,0x08,0x70,0xd0,0x03]		; CHECK-NEXT: vpshufd $3, %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x70,0xd0,0x03]
; CHECK-NEXT: ## xmm2 = xmm0[3,0,0,0]		; CHECK-NEXT: ## xmm2 = xmm0[3,0,0,0]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpshufd $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x70,0xc8,0x03]		; CHECK-NEXT: vpshufd $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x70,0xc8,0x03]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[3,0,0,0]		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[3,0,0,0]
; CHECK-NEXT: vpshufd $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x70,0xc0,0x03]		; CHECK-NEXT: vpshufd $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x70,0xc0,0x03]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[3,0,0,0]		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[3,0,0,0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc2]		; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pshuf.d.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.pshuf.d.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pshuf.d.128(<4 x i32> %x0, i32 3, <4 x i32> zeroinitializer, i8 %x3)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pshuf.d.128(<4 x i32> %x0, i32 3, <4 x i32> zeroinitializer, i8 %x3)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pshuf.d.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pshuf.d.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pshuf.d.256(<8 x i32>, i32, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pshuf.d.256(<8 x i32>, i32, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_pshuf_d_256(<8 x i32> %x0, i32 %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_pshuf_d_256(<8 x i32> %x0, i32 %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pshuf_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pshuf_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpshufd $3, %ymm0, %ymm2 ## encoding: [0x62,0xf1,0x7d,0x28,0x70,0xd0,0x03]		; CHECK-NEXT: vpshufd $3, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x70,0xd0,0x03]
; CHECK-NEXT: ## ymm2 = ymm0[3,0,0,0,7,4,4,4]		; CHECK-NEXT: ## ymm2 = ymm0[3,0,0,0,7,4,4,4]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpshufd $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x70,0xc8,0x03]		; CHECK-NEXT: vpshufd $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x70,0xc8,0x03]
; CHECK-NEXT: ## ymm1 {%k1} = ymm0[3,0,0,0,7,4,4,4]		; CHECK-NEXT: ## ymm1 {%k1} = ymm0[3,0,0,0,7,4,4,4]
; CHECK-NEXT: vpshufd $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x70,0xc0,0x03]		; CHECK-NEXT: vpshufd $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x70,0xc0,0x03]
; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[3,0,0,0,7,4,4,4]		; CHECK-NEXT: ## ymm0 {%k1} {z} = ymm0[3,0,0,0,7,4,4,4]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc2]		; CHECK-NEXT: vpaddd %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pshuf.d.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.pshuf.d.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pshuf.d.256(<8 x i32> %x0, i32 3, <8 x i32> zeroinitializer, i8 %x3)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pshuf.d.256(<8 x i32> %x0, i32 3, <8 x i32> zeroinitializer, i8 %x3)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.pshuf.d.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.pshuf.d.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare i8 @llvm.x86.avx512.mask.pcmpgt.q.128(<2 x i64>, <2 x i64>, i8)		declare i8 @llvm.x86.avx512.mask.pcmpgt.q.128(<2 x i64>, <2 x i64>, i8)

declare <2 x double> @llvm.x86.avx512.mask.unpckh.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.unpckh.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_unpckh_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask_unpckh_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_unpckh_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_unpckh_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vunpckhpd %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x15,0xd9]		; CHECK-NEXT: vunpckhpd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x15,0xd9]
; CHECK-NEXT: ## xmm3 = xmm0[1],xmm1[1]		; CHECK-NEXT: ## xmm3 = xmm0[1],xmm1[1]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vunpckhpd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x15,0xd1]		; CHECK-NEXT: vunpckhpd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x15,0xd1]
; CHECK-NEXT: ## xmm2 {%k1} = xmm0[1],xmm1[1]		; CHECK-NEXT: ## xmm2 {%k1} = xmm0[1],xmm1[1]
; CHECK-NEXT: vaddpd %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0x58,0xc3]		; CHECK-NEXT: vaddpd %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.unpckh.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask.unpckh.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask.unpckh.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.unpckh.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask.unpckh.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.unpckh.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_unpckh_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_unpckh_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_unpckh_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_unpckh_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vunpckhpd %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x15,0xd9]		; CHECK-NEXT: vunpckhpd %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x15,0xd9]
; CHECK-NEXT: ## ymm3 = ymm0[1],ymm1[1],ymm0[3],ymm1[3]		; CHECK-NEXT: ## ymm3 = ymm0[1],ymm1[1],ymm0[3],ymm1[3]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vunpckhpd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x15,0xd1]		; CHECK-NEXT: vunpckhpd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x15,0xd1]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[1],ymm1[1],ymm0[3],ymm1[3]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[1],ymm1[1],ymm0[3],ymm1[3]
; CHECK-NEXT: vaddpd %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0x58,0xc3]		; CHECK-NEXT: vaddpd %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.unpckh.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.unpckh.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.unpckh.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.unpckh.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask.unpckh.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.unpckh.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_unpckh_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_unpckh_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_unpckh_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_unpckh_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vunpckhps %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x15,0xd9]		; CHECK-NEXT: vunpckhps %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x15,0xd9]
; CHECK-NEXT: ## xmm3 = xmm0[2],xmm1[2],xmm0[3],xmm1[3]		; CHECK-NEXT: ## xmm3 = xmm0[2],xmm1[2],xmm0[3],xmm1[3]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vunpckhps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x15,0xd1]		; CHECK-NEXT: vunpckhps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x15,0xd1]
; CHECK-NEXT: ## xmm2 {%k1} = xmm0[2],xmm1[2],xmm0[3],xmm1[3]		; CHECK-NEXT: ## xmm2 {%k1} = xmm0[2],xmm1[2],xmm0[3],xmm1[3]
; CHECK-NEXT: vaddps %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6c,0x08,0x58,0xc3]		; CHECK-NEXT: vaddps %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe8,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.unpckh.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.unpckh.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.unpckh.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.unpckh.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask.unpckh.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.unpckh.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_unpckh_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_unpckh_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_unpckh_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_unpckh_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vunpckhps %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x15,0xd9]		; CHECK-NEXT: vunpckhps %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x15,0xd9]
; CHECK-NEXT: ## ymm3 = ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[6],ymm1[6],ymm0[7],ymm1[7]		; CHECK-NEXT: ## ymm3 = ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[6],ymm1[6],ymm0[7],ymm1[7]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vunpckhps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x15,0xd1]		; CHECK-NEXT: vunpckhps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x15,0xd1]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[6],ymm1[6],ymm0[7],ymm1[7]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[6],ymm1[6],ymm0[7],ymm1[7]
; CHECK-NEXT: vaddps %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6c,0x28,0x58,0xc3]		; CHECK-NEXT: vaddps %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xec,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.unpckh.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.unpckh.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.unpckh.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.unpckh.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <2 x double> @llvm.x86.avx512.mask.unpckl.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.unpckl.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_unpckl_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask_unpckl_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_unpckl_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_unpckl_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vunpcklpd %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x14,0xd9]		; CHECK-NEXT: vunpcklpd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x14,0xd9]
; CHECK-NEXT: ## xmm3 = xmm0[0],xmm1[0]		; CHECK-NEXT: ## xmm3 = xmm0[0],xmm1[0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vunpcklpd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x14,0xd1]		; CHECK-NEXT: vunpcklpd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x14,0xd1]
; CHECK-NEXT: ## xmm2 {%k1} = xmm0[0],xmm1[0]		; CHECK-NEXT: ## xmm2 {%k1} = xmm0[0],xmm1[0]
; CHECK-NEXT: vaddpd %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0x58,0xc3]		; CHECK-NEXT: vaddpd %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.unpckl.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask.unpckl.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask.unpckl.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.unpckl.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask.unpckl.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.unpckl.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_unpckl_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_unpckl_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_unpckl_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_unpckl_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vunpcklpd %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x14,0xd9]		; CHECK-NEXT: vunpcklpd %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x14,0xd9]
; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[0],ymm0[2],ymm1[2]		; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[0],ymm0[2],ymm1[2]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vunpcklpd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x14,0xd1]		; CHECK-NEXT: vunpcklpd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x14,0xd1]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0],ymm1[0],ymm0[2],ymm1[2]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0],ymm1[0],ymm0[2],ymm1[2]
; CHECK-NEXT: vaddpd %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0x58,0xc3]		; CHECK-NEXT: vaddpd %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.unpckl.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.unpckl.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.unpckl.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.unpckl.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask.unpckl.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.unpckl.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_unpckl_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_unpckl_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_unpckl_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_unpckl_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vunpcklps %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x14,0xd9]		; CHECK-NEXT: vunpcklps %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x14,0xd9]
; CHECK-NEXT: ## xmm3 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]		; CHECK-NEXT: ## xmm3 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vunpcklps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x14,0xd1]		; CHECK-NEXT: vunpcklps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x14,0xd1]
; CHECK-NEXT: ## xmm2 {%k1} = xmm0[0],xmm1[0],xmm0[1],xmm1[1]		; CHECK-NEXT: ## xmm2 {%k1} = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
; CHECK-NEXT: vaddps %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6c,0x08,0x58,0xc3]		; CHECK-NEXT: vaddps %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe8,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.unpckl.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.unpckl.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.unpckl.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.unpckl.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask.unpckl.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.unpckl.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_unpckl_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_unpckl_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_unpckl_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_unpckl_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vunpcklps %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x14,0xd9]		; CHECK-NEXT: vunpcklps %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x14,0xd9]
; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]		; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vunpcklps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x14,0xd1]		; CHECK-NEXT: vunpcklps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x14,0xd1]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
; CHECK-NEXT: vaddps %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6c,0x28,0x58,0xc3]		; CHECK-NEXT: vaddps %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xec,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.unpckl.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.unpckl.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.unpckl.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.unpckl.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.punpckhd.q.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.punpckhd.q.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_punpckhd_q_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_punpckhd_q_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpckhd_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_punpckhd_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpckhdq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0x6a,0xd9]		; CHECK-NEXT: vpunpckhdq %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6a,0xd9]
; CHECK-NEXT: ## xmm3 = xmm0[2],xmm1[2],xmm0[3],xmm1[3]		; CHECK-NEXT: ## xmm3 = xmm0[2],xmm1[2],xmm0[3],xmm1[3]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpckhdq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x6a,0xd1]		; CHECK-NEXT: vpunpckhdq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x6a,0xd1]
; CHECK-NEXT: ## xmm2 {%k1} = xmm0[2],xmm1[2],xmm0[3],xmm1[3]		; CHECK-NEXT: ## xmm2 {%k1} = xmm0[2],xmm1[2],xmm0[3],xmm1[3]
; CHECK-NEXT: vpaddd %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc3]		; CHECK-NEXT: vpaddd %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.punpckhd.q.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.punpckhd.q.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.punpckhd.q.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.punpckhd.q.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}
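
The updated CHECK lines in this test replace a 4-byte EVEX prefix (0x62 plus three payload bytes) with a 2-byte VEX prefix (0xc5 plus one payload byte), leaving the opcode and ModRM bytes untouched. The sketch below is only an illustration of how the second VEX byte can be read. It is not code from the patch, and the helper name decode_vex2 is hypothetical. The byte sequences are copied from the vpunpckhdq CHECK lines above.

```python
def decode_vex2(prefix: bytes) -> dict:
    """Decode a two-byte VEX prefix (0xC5 followed by one payload byte)."""
    if len(prefix) != 2 or prefix[0] != 0xC5:
        raise ValueError("not a two-byte VEX prefix")
    p = prefix[1]
    return {
        "R": ((p >> 7) & 1) ^ 1,        # stored inverted; extends ModRM.reg
        "vvvv": (~(p >> 3)) & 0xF,      # stored inverted; first source register
        "vector_bits": 256 if (p >> 2) & 1 else 128,
        "pp": p & 0b11,                 # 0b01 stands for an implied 0x66 prefix
    }


# Same instruction before and after the change (bytes taken from the test).
evex = bytes([0x62, 0xF1, 0x7D, 0x08, 0x6A, 0xD9])
vex = bytes([0xC5, 0xF9, 0x6A, 0xD9])

fields = decode_vex2(vex[:2])
saved = len(evex) - len(vex)

print(fields)
print("bytes saved:", saved)
```

For the 0xf9 payload byte this yields vvvv = 0 (xmm0 as the first source) and a 128-bit vector length, which matches the operands printed in the CHECK line. The two encodings share the trailing opcode and ModRM bytes, so the saving comes entirely from the prefix.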

declare <4 x i32> @llvm.x86.avx512.mask.punpckld.q.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.punpckld.q.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_punpckld_q_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_punpckld_q_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpckld_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_punpckld_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpckldq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0x62,0xd9]		; CHECK-NEXT: vpunpckldq %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x62,0xd9]
; CHECK-NEXT: ## xmm3 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]		; CHECK-NEXT: ## xmm3 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpckldq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x62,0xd1]		; CHECK-NEXT: vpunpckldq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x62,0xd1]
; CHECK-NEXT: ## xmm2 {%k1} = xmm0[0],xmm1[0],xmm0[1],xmm1[1]		; CHECK-NEXT: ## xmm2 {%k1} = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
; CHECK-NEXT: vpaddd %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc3]		; CHECK-NEXT: vpaddd %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.punpckld.q.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.punpckld.q.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.punpckld.q.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.punpckld.q.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.punpckhd.q.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.punpckhd.q.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_punpckhd_q_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_punpckhd_q_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpckhd_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_punpckhd_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpckhdq %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0x6a,0xd9]		; CHECK-NEXT: vpunpckhdq %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6a,0xd9]
; CHECK-NEXT: ## ymm3 = ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[6],ymm1[6],ymm0[7],ymm1[7]		; CHECK-NEXT: ## ymm3 = ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[6],ymm1[6],ymm0[7],ymm1[7]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpckhdq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x6a,0xd1]		; CHECK-NEXT: vpunpckhdq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x6a,0xd1]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[6],ymm1[6],ymm0[7],ymm1[7]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[6],ymm1[6],ymm0[7],ymm1[7]
; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc3]		; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.punpckhd.q.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.punpckhd.q.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.punpckhd.q.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.punpckhd.q.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.punpckld.q.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.punpckld.q.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_punpckld_q_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_punpckld_q_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpckld_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_punpckld_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpckldq %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0x62,0xd9]		; CHECK-NEXT: vpunpckldq %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x62,0xd9]
; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]		; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpckldq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x62,0xd1]		; CHECK-NEXT: vpunpckldq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x62,0xd1]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc3]		; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.punpckld.q.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.punpckld.q.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.punpckld.q.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.punpckld.q.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <2 x i64> @llvm.x86.avx512.mask.punpckhqd.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.punpckhqd.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_punpckhqd_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_punpckhqd_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpckhqd_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_punpckhqd_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpckhqdq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6d,0xd9]		; CHECK-NEXT: vpunpckhqdq %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6d,0xd9]
; CHECK-NEXT: ## xmm3 = xmm0[1],xmm1[1]		; CHECK-NEXT: ## xmm3 = xmm0[1],xmm1[1]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpckhqdq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x6d,0xd1]		; CHECK-NEXT: vpunpckhqdq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x6d,0xd1]
; CHECK-NEXT: ## xmm2 = xmm0[1],xmm1[1]		; CHECK-NEXT: ## xmm2 = xmm0[1],xmm1[1]
; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc3]		; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.punpckhqd.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.punpckhqd.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.punpckhqd.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.punpckhqd.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res2 = add <2 x i64> %res, %res1		%res2 = add <2 x i64> %res, %res1
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}

declare <2 x i64> @llvm.x86.avx512.mask.punpcklqd.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.punpcklqd.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_punpcklqd_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_punpcklqd_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpcklqd_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_punpcklqd_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpcklqdq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6c,0xd9]		; CHECK-NEXT: vpunpcklqdq %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6c,0xd9]
; CHECK-NEXT: ## xmm3 = xmm0[0],xmm1[0]		; CHECK-NEXT: ## xmm3 = xmm0[0],xmm1[0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpcklqdq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x6c,0xd1]		; CHECK-NEXT: vpunpcklqdq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x6c,0xd1]
; CHECK-NEXT: ## xmm2 = xmm0[0],xmm1[0]		; CHECK-NEXT: ## xmm2 = xmm0[0],xmm1[0]
; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc3]		; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.punpcklqd.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.punpcklqd.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.punpcklqd.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.punpcklqd.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res2 = add <2 x i64> %res, %res1		%res2 = add <2 x i64> %res, %res1
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}

declare <4 x i64> @llvm.x86.avx512.mask.punpcklqd.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.punpcklqd.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_punpcklqd_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_punpcklqd_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpcklqd_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_punpcklqd_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpcklqdq %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6c,0xd9]		; CHECK-NEXT: vpunpcklqdq %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6c,0xd9]
; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[0],ymm0[2],ymm1[2]		; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[0],ymm0[2],ymm1[2]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpcklqdq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x6c,0xd1]		; CHECK-NEXT: vpunpcklqdq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x6c,0xd1]
; CHECK-NEXT: ## ymm2 = ymm0[0],ymm1[0],ymm0[2],ymm1[2]		; CHECK-NEXT: ## ymm2 = ymm0[0],ymm1[0],ymm0[2],ymm1[2]
; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc3]		; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.punpcklqd.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.punpcklqd.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.punpcklqd.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.punpcklqd.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)
%res2 = add <4 x i64> %res, %res1		%res2 = add <4 x i64> %res, %res1
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}

declare <4 x i64> @llvm.x86.avx512.mask.punpckhqd.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.punpckhqd.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_punpckhqd_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_punpckhqd_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_punpckhqd_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_punpckhqd_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpunpckhqdq %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6d,0xd9]		; CHECK-NEXT: vpunpckhqdq %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6d,0xd9]
; CHECK-NEXT: ## ymm3 = ymm0[1],ymm1[1],ymm0[3],ymm1[3]		; CHECK-NEXT: ## ymm3 = ymm0[1],ymm1[1],ymm0[3],ymm1[3]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpunpckhqdq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x6d,0xd1]		; CHECK-NEXT: vpunpckhqdq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x6d,0xd1]
; CHECK-NEXT: ## ymm2 = ymm0[1],ymm1[1],ymm0[3],ymm1[3]		; CHECK-NEXT: ## ymm2 = ymm0[1],ymm1[1],ymm0[3],ymm1[3]
; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc3]		; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.punpckhqd.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.punpckhqd.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.punpckhqd.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.punpckhqd.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)
%res2 = add <4 x i64> %res, %res1		%res2 = add <4 x i64> %res, %res1
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}

define <4 x i32> @test_mask_and_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {		define <4 x i32> @test_mask_and_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {
; CHECK-LABEL: test_mask_and_epi32_rr_128:		; CHECK-LABEL: test_mask_and_epi32_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpandd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xdb,0xc1]		; CHECK-NEXT: vpand %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xdb,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_and_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_and_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_and_epi32_rrk_128:		; CHECK-LABEL: test_mask_and_epi32_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpandd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdb,0xd1]		; CHECK-NEXT: vpandd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdb,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_and_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {		define <4 x i32> @test_mask_and_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_and_epi32_rrkz_128:		; CHECK-LABEL: test_mask_and_epi32_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpandd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xdb,0xc1]		; CHECK-NEXT: vpandd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xdb,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}
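
In the three tests above, only the unmasked form (rr) picks up the "EVEX TO VEX Compression" marker, while the {%k1} and {%k1} {z} forms keep their 0x62 prefix. The following sketch reads the EVEX payload bytes to show why, under the conditions stated in the patch summary (registers 0 - 15 only, no zmm, no mask register). It is an assumption-laden illustration working on raw bytes. The actual pass is described as using an opcode table, and the function names here are hypothetical. These checks are necessary preconditions only, since an instruction also needs a VEX counterpart in that table.

```python
def evex_fields(enc: bytes) -> dict:
    """Pull the fields relevant to VEX compression out of an EVEX encoding."""
    if len(enc) < 6 or enc[0] != 0x62:
        raise ValueError("expected an EVEX-encoded instruction")
    p0, p2, modrm = enc[1], enc[3], enc[5]
    # R' and V' are stored inverted; a decoded 1 selects registers 16-31.
    high_reg = bool(((p0 >> 4) & 1) ^ 1) or bool(((p2 >> 3) & 1) ^ 1)
    # With a register operand (mod == 3), inverted X extends ModRM.rm to 16-31.
    if modrm >> 6 == 3:
        high_reg = high_reg or bool(((p0 >> 6) & 1) ^ 1)
    return {
        "mask": p2 & 0b111,                 # aaa: 0 means no mask register
        "zeroing": bool(p2 >> 7),           # z bit
        "broadcast": bool((p2 >> 4) & 1),   # b bit, e.g. {1to4}
        "vector_length": (p2 >> 5) & 0b11,  # 0: 128-bit, 1: 256-bit, 2: 512-bit
        "high_reg": high_reg,
    }


def meets_vex_preconditions(enc: bytes) -> bool:
    """True if nothing in the EVEX prefix rules out a VEX re-encoding."""
    f = evex_fields(enc)
    return (
        f["mask"] == 0
        and not f["zeroing"]
        and not f["broadcast"]
        and f["vector_length"] in (0, 1)
        and not f["high_reg"]
    )


# Encodings copied from the old (left-hand) CHECK lines of this test file.
rr = bytes([0x62, 0xF1, 0x7D, 0x08, 0xDB, 0xC1])    # vpandd %xmm1, %xmm0, %xmm0
rrk = bytes([0x62, 0xF1, 0x7D, 0x09, 0xDB, 0xD1])   # ... %xmm2 {%k1}
rrkz = bytes([0x62, 0xF1, 0x7D, 0x89, 0xDB, 0xC1])  # ... %xmm0 {%k1} {z}
rmbk = bytes([0x62, 0xF1, 0x7D, 0x19, 0xDB, 0x0F])  # (%rdi){1to4} ... {%k1}

for name, enc in [("rr", rr), ("rrk", rrk), ("rrkz", rrkz), ("rmbk", rmbk)]:
    print(name, meets_vex_preconditions(enc))
```

Only rr passes, which lines up with the expectations in this file: the masked, zeroing, and broadcast variants stay EVEX-encoded.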

define <4 x i32> @test_mask_and_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {		define <4 x i32> @test_mask_and_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_and_epi32_rm_128:		; CHECK-LABEL: test_mask_and_epi32_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpandd (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xdb,0x07]		; CHECK-NEXT: vpand (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xdb,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_and_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_and_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_and_epi32_rmk_128:		; CHECK-LABEL: test_mask_and_epi32_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpandd (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdb,0x0f]		; CHECK-NEXT: vpandd (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdb,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_and_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {		define <4 x i32> @test_mask_and_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_and_epi32_rmkz_128:		; CHECK-LABEL: test_mask_and_epi32_rmkz_128:
; [18 lines collapsed in the diff view]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_and_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_and_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_and_epi32_rmbk_128:		; CHECK-LABEL: test_mask_and_epi32_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpandd (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0xdb,0x0f]		; CHECK-NEXT: vpandd (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0xdb,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0
%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer
%res = call <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

; [10 lines collapsed in the diff view]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pand.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <8 x i32> @test_mask_and_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {		define <8 x i32> @test_mask_and_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {
; CHECK-LABEL: test_mask_and_epi32_rr_256:		; CHECK-LABEL: test_mask_and_epi32_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpandd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xdb,0xc1]		; CHECK-NEXT: vpand %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xdb,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_and_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_and_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_and_epi32_rrk_256:		; CHECK-LABEL: test_mask_and_epi32_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpandd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdb,0xd1]		; CHECK-NEXT: vpandd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdb,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_and_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i8 %mask) {		define <8 x i32> @test_mask_and_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_and_epi32_rrkz_256:		; CHECK-LABEL: test_mask_and_epi32_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpandd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xdb,0xc1]		; CHECK-NEXT: vpandd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xdb,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_and_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {		define <8 x i32> @test_mask_and_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_and_epi32_rm_256:		; CHECK-LABEL: test_mask_and_epi32_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpandd (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xdb,0x07]		; CHECK-NEXT: vpand (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xdb,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_and_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_and_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_and_epi32_rmk_256:		; CHECK-LABEL: test_mask_and_epi32_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpandd (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdb,0x0f]		; CHECK-NEXT: vpandd (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdb,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_and_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i8 %mask) {		define <8 x i32> @test_mask_and_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_and_epi32_rmkz_256:		; CHECK-LABEL: test_mask_and_epi32_rmkz_256:
; [18 lines collapsed in the diff view]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_and_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_and_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_and_epi32_rmbk_256:		; CHECK-LABEL: test_mask_and_epi32_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpandd (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0xdb,0x0f]		; CHECK-NEXT: vpandd (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0xdb,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0
%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer		%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer
%res = call <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

; [10 lines collapsed in the diff view]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <8 x i32> %res		ret <8 x i32> %res
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pand.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <4 x i32> @test_mask_or_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {		define <4 x i32> @test_mask_or_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {
; CHECK-LABEL: test_mask_or_epi32_rr_128:		; CHECK-LABEL: test_mask_or_epi32_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpord %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xeb,0xc1]		; CHECK-NEXT: vpor %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xeb,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_or_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_or_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_or_epi32_rrk_128:		; CHECK-LABEL: test_mask_or_epi32_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpord %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xeb,0xd1]		; CHECK-NEXT: vpord %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xeb,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_or_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {		define <4 x i32> @test_mask_or_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_or_epi32_rrkz_128:		; CHECK-LABEL: test_mask_or_epi32_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpord %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xeb,0xc1]		; CHECK-NEXT: vpord %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xeb,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_or_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {		define <4 x i32> @test_mask_or_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_or_epi32_rm_128:		; CHECK-LABEL: test_mask_or_epi32_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpord (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xeb,0x07]		; CHECK-NEXT: vpor (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xeb,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_or_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_or_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_or_epi32_rmk_128:		; CHECK-LABEL: test_mask_or_epi32_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpord (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xeb,0x0f]		; CHECK-NEXT: vpord (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xeb,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_or_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {		define <4 x i32> @test_mask_or_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_or_epi32_rmkz_128:		; CHECK-LABEL: test_mask_or_epi32_rmkz_128:
[... 18 lines not shown in the archived diff view ...]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_or_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_or_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_or_epi32_rmbk_128:		; CHECK-LABEL: test_mask_or_epi32_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpord (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0xeb,0x0f]		; CHECK-NEXT: vpord (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0xeb,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0
%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer
%res = call <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

[... 10 lines not shown in the archived diff view ...]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

declare <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.por.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <8 x i32> @test_mask_or_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {		define <8 x i32> @test_mask_or_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {
; CHECK-LABEL: test_mask_or_epi32_rr_256:		; CHECK-LABEL: test_mask_or_epi32_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpord %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xeb,0xc1]		; CHECK-NEXT: vpor %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xeb,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_or_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_or_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_or_epi32_rrk_256:		; CHECK-LABEL: test_mask_or_epi32_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpord %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xeb,0xd1]		; CHECK-NEXT: vpord %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xeb,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_or_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i8 %mask) {		define <8 x i32> @test_mask_or_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_or_epi32_rrkz_256:		; CHECK-LABEL: test_mask_or_epi32_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpord %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xeb,0xc1]		; CHECK-NEXT: vpord %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xeb,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_or_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {		define <8 x i32> @test_mask_or_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_or_epi32_rm_256:		; CHECK-LABEL: test_mask_or_epi32_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpord (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xeb,0x07]		; CHECK-NEXT: vpor (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xeb,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_or_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_or_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_or_epi32_rmk_256:		; CHECK-LABEL: test_mask_or_epi32_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpord (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xeb,0x0f]		; CHECK-NEXT: vpord (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xeb,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_or_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i8 %mask) {		define <8 x i32> @test_mask_or_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_or_epi32_rmkz_256:		; CHECK-LABEL: test_mask_or_epi32_rmkz_256:
[... 18 lines not shown in the archived diff view ...]
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_or_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_or_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_or_epi32_rmbk_256:		; CHECK-LABEL: test_mask_or_epi32_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpord (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0xeb,0x0f]		; CHECK-NEXT: vpord (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0xeb,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0
%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer		%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer
%res = call <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

[... 10 lines not shown in the archived diff view ...]
ret <8 x i32> %res		ret <8 x i32> %res
}		}

declare <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.por.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <4 x i32> @test_mask_xor_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {		define <4 x i32> @test_mask_xor_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {
; CHECK-LABEL: test_mask_xor_epi32_rr_128:		; CHECK-LABEL: test_mask_xor_epi32_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpxord %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xef,0xc1]		; CHECK-NEXT: vpxor %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xef,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_xor_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_xor_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_xor_epi32_rrk_128:		; CHECK-LABEL: test_mask_xor_epi32_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpxord %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xef,0xd1]		; CHECK-NEXT: vpxord %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xef,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_xor_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {		define <4 x i32> @test_mask_xor_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_xor_epi32_rrkz_128:		; CHECK-LABEL: test_mask_xor_epi32_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpxord %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xef,0xc1]		; CHECK-NEXT: vpxord %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xef,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_xor_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {		define <4 x i32> @test_mask_xor_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_xor_epi32_rm_128:		; CHECK-LABEL: test_mask_xor_epi32_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpxord (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xef,0x07]		; CHECK-NEXT: vpxor (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xef,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_xor_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_xor_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_xor_epi32_rmk_128:		; CHECK-LABEL: test_mask_xor_epi32_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpxord (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xef,0x0f]		; CHECK-NEXT: vpxord (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xef,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_xor_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {		define <4 x i32> @test_mask_xor_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_xor_epi32_rmkz_128:		; CHECK-LABEL: test_mask_xor_epi32_rmkz_128:
[... 18 lines not shown in the archived diff view ...]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_xor_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_xor_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_xor_epi32_rmbk_128:		; CHECK-LABEL: test_mask_xor_epi32_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpxord (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0xef,0x0f]		; CHECK-NEXT: vpxord (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0xef,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0
%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer
%res = call <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

[... 10 lines not shown in the archived diff view ...]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pxor.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <8 x i32> @test_mask_xor_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {		define <8 x i32> @test_mask_xor_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {
; CHECK-LABEL: test_mask_xor_epi32_rr_256:		; CHECK-LABEL: test_mask_xor_epi32_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpxord %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xef,0xc1]		; CHECK-NEXT: vpxor %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xef,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pxor.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.pxor.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_xor_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_xor_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_xor_epi32_rrk_256:		; CHECK-LABEL: test_mask_xor_epi32_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpxord %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xef,0xd1]		; CHECK-NEXT: vpxord %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xef,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pxor.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.pxor.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_xor_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i8 %mask) {		define <8 x i32> @test_mask_xor_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_xor_epi32_rrkz_256:		; CHECK-LABEL: test_mask_xor_epi32_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpxord %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xef,0xc1]		; CHECK-NEXT: vpxord %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xef,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pxor.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.pxor.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_xor_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {		define <8 x i32> @test_mask_xor_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_xor_epi32_rm_256:		; CHECK-LABEL: test_mask_xor_epi32_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpxord (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xef,0x07]		; CHECK-NEXT: vpxor (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xef,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <8 x i32> @llvm.x86.avx512.mask.pxor.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.pxor.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_xor_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_xor_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_xor_epi32_rmk_256:		; CHECK-LABEL: test_mask_xor_epi32_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpxord (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xef,0x0f]		; CHECK-NEXT: vpxord (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xef,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <8 x i32> @llvm.x86.avx512.mask.pxor.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.pxor.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_xor_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i8 %mask) {		define <8 x i32> @test_mask_xor_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_xor_epi32_rmkz_256:		; CHECK-LABEL: test_mask_xor_epi32_rmkz_256:
[... 18 lines not shown in the archived diff view ...]
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_xor_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_xor_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_xor_epi32_rmbk_256:		; CHECK-LABEL: test_mask_xor_epi32_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpxord (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0xef,0x0f]		; CHECK-NEXT: vpxord (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0xef,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0
%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer		%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer
%res = call <8 x i32> @llvm.x86.avx512.mask.pxor.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.pxor.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

[... 21 lines not shown in the archived diff view ...]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_andnot_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_andnot_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi32_rrk_128:		; CHECK-LABEL: test_mask_andnot_epi32_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpandnd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdf,0xd1]		; CHECK-NEXT: vpandnd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdf,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pandn.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pandn.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_andnot_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {		define <4 x i32> @test_mask_andnot_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi32_rrkz_128:		; CHECK-LABEL: test_mask_andnot_epi32_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
[... 14 lines not shown in the archived diff view ...]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_andnot_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_andnot_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi32_rmk_128:		; CHECK-LABEL: test_mask_andnot_epi32_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpandnd (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdf,0x0f]		; CHECK-NEXT: vpandnd (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xdf,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <4 x i32> @llvm.x86.avx512.mask.pandn.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pandn.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_andnot_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {		define <4 x i32> @test_mask_andnot_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi32_rmkz_128:		; CHECK-LABEL: test_mask_andnot_epi32_rmkz_128:
[... 18 lines not shown in the archived diff view ...]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_andnot_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_andnot_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi32_rmbk_128:		; CHECK-LABEL: test_mask_andnot_epi32_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpandnd (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0xdf,0x0f]		; CHECK-NEXT: vpandnd (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0xdf,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0
%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer
%res = call <4 x i32> @llvm.x86.avx512.mask.pandn.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pandn.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

[... 21 unchanged lines collapsed ...]
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_andnot_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_andnot_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi32_rrk_256:		; CHECK-LABEL: test_mask_andnot_epi32_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpandnd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdf,0xd1]		; CHECK-NEXT: vpandnd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdf,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pandn.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.pandn.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_andnot_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i8 %mask) {		define <8 x i32> @test_mask_andnot_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi32_rrkz_256:		; CHECK-LABEL: test_mask_andnot_epi32_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
[... 14 unchanged lines collapsed ...]
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_andnot_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_andnot_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi32_rmk_256:		; CHECK-LABEL: test_mask_andnot_epi32_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpandnd (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdf,0x0f]		; CHECK-NEXT: vpandnd (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xdf,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <8 x i32> @llvm.x86.avx512.mask.pandn.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.pandn.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_andnot_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i8 %mask) {		define <8 x i32> @test_mask_andnot_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi32_rmkz_256:		; CHECK-LABEL: test_mask_andnot_epi32_rmkz_256:
[... 18 unchanged lines collapsed ...]
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_andnot_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_andnot_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi32_rmbk_256:		; CHECK-LABEL: test_mask_andnot_epi32_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpandnd (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0xdf,0x0f]		; CHECK-NEXT: vpandnd (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0xdf,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0
%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer		%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer
%res = call <8 x i32> @llvm.x86.avx512.mask.pandn.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.pandn.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

[... 21 unchanged lines collapsed ...]
ret <2 x i64> %res		ret <2 x i64> %res
}		}

define <2 x i64> @test_mask_andnot_epi64_rrk_128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask) {		define <2 x i64> @test_mask_andnot_epi64_rrk_128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi64_rrk_128:		; CHECK-LABEL: test_mask_andnot_epi64_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpandnq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xdf,0xd1]		; CHECK-NEXT: vpandnq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xdf,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pandn.q.128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask)		%res = call <2 x i64> @llvm.x86.avx512.mask.pandn.q.128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask)
ret <2 x i64> %res		ret <2 x i64> %res
}		}

define <2 x i64> @test_mask_andnot_epi64_rrkz_128(<2 x i64> %a, <2 x i64> %b, i8 %mask) {		define <2 x i64> @test_mask_andnot_epi64_rrkz_128(<2 x i64> %a, <2 x i64> %b, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi64_rrkz_128:		; CHECK-LABEL: test_mask_andnot_epi64_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
[... 14 unchanged lines collapsed ...]
ret <2 x i64> %res		ret <2 x i64> %res
}		}

define <2 x i64> @test_mask_andnot_epi64_rmk_128(<2 x i64> %a, <2 x i64>* %ptr_b, <2 x i64> %passThru, i8 %mask) {		define <2 x i64> @test_mask_andnot_epi64_rmk_128(<2 x i64> %a, <2 x i64>* %ptr_b, <2 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi64_rmk_128:		; CHECK-LABEL: test_mask_andnot_epi64_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpandnq (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xdf,0x0f]		; CHECK-NEXT: vpandnq (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xdf,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <2 x i64>, <2 x i64>* %ptr_b		%b = load <2 x i64>, <2 x i64>* %ptr_b
%res = call <2 x i64> @llvm.x86.avx512.mask.pandn.q.128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask)		%res = call <2 x i64> @llvm.x86.avx512.mask.pandn.q.128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask)
ret <2 x i64> %res		ret <2 x i64> %res
}		}

define <2 x i64> @test_mask_andnot_epi64_rmkz_128(<2 x i64> %a, <2 x i64>* %ptr_b, i8 %mask) {		define <2 x i64> @test_mask_andnot_epi64_rmkz_128(<2 x i64> %a, <2 x i64>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi64_rmkz_128:		; CHECK-LABEL: test_mask_andnot_epi64_rmkz_128:
[... 18 unchanged lines collapsed ...]
ret <2 x i64> %res		ret <2 x i64> %res
}		}

define <2 x i64> @test_mask_andnot_epi64_rmbk_128(<2 x i64> %a, i64* %ptr_b, <2 x i64> %passThru, i8 %mask) {		define <2 x i64> @test_mask_andnot_epi64_rmbk_128(<2 x i64> %a, i64* %ptr_b, <2 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi64_rmbk_128:		; CHECK-LABEL: test_mask_andnot_epi64_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpandnq (%rdi){1to2}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x19,0xdf,0x0f]		; CHECK-NEXT: vpandnq (%rdi){1to2}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x19,0xdf,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i64, i64* %ptr_b		%q = load i64, i64* %ptr_b
%vecinit.i = insertelement <2 x i64> undef, i64 %q, i32 0		%vecinit.i = insertelement <2 x i64> undef, i64 %q, i32 0
%b = shufflevector <2 x i64> %vecinit.i, <2 x i64> undef, <2 x i32> zeroinitializer		%b = shufflevector <2 x i64> %vecinit.i, <2 x i64> undef, <2 x i32> zeroinitializer
%res = call <2 x i64> @llvm.x86.avx512.mask.pandn.q.128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask)		%res = call <2 x i64> @llvm.x86.avx512.mask.pandn.q.128(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passThru, i8 %mask)
ret <2 x i64> %res		ret <2 x i64> %res
}		}

[... 21 unchanged lines collapsed ...]
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <4 x i64> @test_mask_andnot_epi64_rrk_256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask) {		define <4 x i64> @test_mask_andnot_epi64_rrk_256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi64_rrk_256:		; CHECK-LABEL: test_mask_andnot_epi64_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpandnq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xdf,0xd1]		; CHECK-NEXT: vpandnq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xdf,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pandn.q.256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask)		%res = call <4 x i64> @llvm.x86.avx512.mask.pandn.q.256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask)
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <4 x i64> @test_mask_andnot_epi64_rrkz_256(<4 x i64> %a, <4 x i64> %b, i8 %mask) {		define <4 x i64> @test_mask_andnot_epi64_rrkz_256(<4 x i64> %a, <4 x i64> %b, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi64_rrkz_256:		; CHECK-LABEL: test_mask_andnot_epi64_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
[... 14 unchanged lines collapsed ...]
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <4 x i64> @test_mask_andnot_epi64_rmk_256(<4 x i64> %a, <4 x i64>* %ptr_b, <4 x i64> %passThru, i8 %mask) {		define <4 x i64> @test_mask_andnot_epi64_rmk_256(<4 x i64> %a, <4 x i64>* %ptr_b, <4 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi64_rmk_256:		; CHECK-LABEL: test_mask_andnot_epi64_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpandnq (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xdf,0x0f]		; CHECK-NEXT: vpandnq (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xdf,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i64>, <4 x i64>* %ptr_b		%b = load <4 x i64>, <4 x i64>* %ptr_b
%res = call <4 x i64> @llvm.x86.avx512.mask.pandn.q.256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask)		%res = call <4 x i64> @llvm.x86.avx512.mask.pandn.q.256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask)
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <4 x i64> @test_mask_andnot_epi64_rmkz_256(<4 x i64> %a, <4 x i64>* %ptr_b, i8 %mask) {		define <4 x i64> @test_mask_andnot_epi64_rmkz_256(<4 x i64> %a, <4 x i64>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi64_rmkz_256:		; CHECK-LABEL: test_mask_andnot_epi64_rmkz_256:
[... 18 unchanged lines collapsed ...]
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <4 x i64> @test_mask_andnot_epi64_rmbk_256(<4 x i64> %a, i64* %ptr_b, <4 x i64> %passThru, i8 %mask) {		define <4 x i64> @test_mask_andnot_epi64_rmbk_256(<4 x i64> %a, i64* %ptr_b, <4 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_andnot_epi64_rmbk_256:		; CHECK-LABEL: test_mask_andnot_epi64_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpandnq (%rdi){1to4}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x39,0xdf,0x0f]		; CHECK-NEXT: vpandnq (%rdi){1to4}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x39,0xdf,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i64, i64* %ptr_b		%q = load i64, i64* %ptr_b
%vecinit.i = insertelement <4 x i64> undef, i64 %q, i32 0		%vecinit.i = insertelement <4 x i64> undef, i64 %q, i32 0
%b = shufflevector <4 x i64> %vecinit.i, <4 x i64> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x i64> %vecinit.i, <4 x i64> undef, <4 x i32> zeroinitializer
%res = call <4 x i64> @llvm.x86.avx512.mask.pandn.q.256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask)		%res = call <4 x i64> @llvm.x86.avx512.mask.pandn.q.256(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passThru, i8 %mask)
ret <4 x i64> %res		ret <4 x i64> %res
}		}

[... 10 unchanged lines collapsed ...]
ret <4 x i64> %res		ret <4 x i64> %res
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pandn.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pandn.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i32> @test_mask_add_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {		define <4 x i32> @test_mask_add_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {
; CHECK-LABEL: test_mask_add_epi32_rr_128:		; CHECK-LABEL: test_mask_add_epi32_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc1]		; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_add_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_add_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi32_rrk_128:		; CHECK-LABEL: test_mask_add_epi32_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xfe,0xd1]		; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xfe,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_add_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {		define <4 x i32> @test_mask_add_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi32_rrkz_128:		; CHECK-LABEL: test_mask_add_epi32_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xfe,0xc1]		; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_add_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {		define <4 x i32> @test_mask_add_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_add_epi32_rm_128:		; CHECK-LABEL: test_mask_add_epi32_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddd (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0x07]		; CHECK-NEXT: vpaddd (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_add_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_add_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi32_rmk_128:		; CHECK-LABEL: test_mask_add_epi32_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddd (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xfe,0x0f]		; CHECK-NEXT: vpaddd (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xfe,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_add_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {		define <4 x i32> @test_mask_add_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi32_rmkz_128:		; CHECK-LABEL: test_mask_add_epi32_rmkz_128:
[... 18 unchanged lines collapsed ...]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_add_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_add_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi32_rmbk_128:		; CHECK-LABEL: test_mask_add_epi32_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddd (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0xfe,0x0f]		; CHECK-NEXT: vpaddd (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0xfe,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0
%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer
%res = call <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

[... 10 unchanged lines collapsed ...]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

declare <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.padd.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32> @test_mask_sub_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {		define <4 x i32> @test_mask_sub_epi32_rr_128(<4 x i32> %a, <4 x i32> %b) {
; CHECK-LABEL: test_mask_sub_epi32_rr_128:		; CHECK-LABEL: test_mask_sub_epi32_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfa,0xc1]		; CHECK-NEXT: vpsubd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfa,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_sub_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_sub_epi32_rrk_128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi32_rrk_128:		; CHECK-LABEL: test_mask_sub_epi32_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xfa,0xd1]		; CHECK-NEXT: vpsubd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xfa,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_sub_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {		define <4 x i32> @test_mask_sub_epi32_rrkz_128(<4 x i32> %a, <4 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi32_rrkz_128:		; CHECK-LABEL: test_mask_sub_epi32_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xfa,0xc1]		; CHECK-NEXT: vpsubd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xfa,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_sub_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {		define <4 x i32> @test_mask_sub_epi32_rm_128(<4 x i32> %a, <4 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_sub_epi32_rm_128:		; CHECK-LABEL: test_mask_sub_epi32_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubd (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfa,0x07]		; CHECK-NEXT: vpsubd (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfa,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> zeroinitializer, i8 -1)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_sub_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_sub_epi32_rmk_128(<4 x i32> %a, <4 x i32>* %ptr_b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi32_rmk_128:		; CHECK-LABEL: test_mask_sub_epi32_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubd (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xfa,0x0f]		; CHECK-NEXT: vpsubd (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xfa,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <4 x i32>, <4 x i32>* %ptr_b		%b = load <4 x i32>, <4 x i32>* %ptr_b
%res = call <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_sub_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {		define <4 x i32> @test_mask_sub_epi32_rmkz_128(<4 x i32> %a, <4 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi32_rmkz_128:		; CHECK-LABEL: test_mask_sub_epi32_rmkz_128:
[... 18 unchanged lines collapsed ...]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @test_mask_sub_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <4 x i32> %passThru, i8 %mask) {		define <4 x i32> @test_mask_sub_epi32_rmbk_128(<4 x i32> %a, i32* %ptr_b, <4 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi32_rmbk_128:		; CHECK-LABEL: test_mask_sub_epi32_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubd (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0xfa,0x0f]		; CHECK-NEXT: vpsubd (%rdi){1to4}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x19,0xfa,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <4 x i32> undef, i32 %q, i32 0
%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer		%b = shufflevector <4 x i32> %vecinit.i, <4 x i32> undef, <4 x i32> zeroinitializer
%res = call <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passThru, i8 %mask)
ret <4 x i32> %res		ret <4 x i32> %res
}		}

[... 10 unchanged lines collapsed ...]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

declare <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.psub.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <8 x i32> @test_mask_sub_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {		define <8 x i32> @test_mask_sub_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {
; CHECK-LABEL: test_mask_sub_epi32_rr_256:		; CHECK-LABEL: test_mask_sub_epi32_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfa,0xc1]		; CHECK-NEXT: vpsubd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfa,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_sub_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_sub_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi32_rrk_256:		; CHECK-LABEL: test_mask_sub_epi32_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xfa,0xd1]		; CHECK-NEXT: vpsubd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xfa,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_sub_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i8 %mask) {		define <8 x i32> @test_mask_sub_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi32_rrkz_256:		; CHECK-LABEL: test_mask_sub_epi32_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsubd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xfa,0xc1]		; CHECK-NEXT: vpsubd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xfa,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_sub_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {		define <8 x i32> @test_mask_sub_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_sub_epi32_rm_256:		; CHECK-LABEL: test_mask_sub_epi32_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsubd (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfa,0x07]		; CHECK-NEXT: vpsubd (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfa,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_sub_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_sub_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi32_rmk_256:		; CHECK-LABEL: test_mask_sub_epi32_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubd (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xfa,0x0f]		; CHECK-NEXT: vpsubd (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xfa,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_sub_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i8 %mask) {		define <8 x i32> @test_mask_sub_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi32_rmkz_256:		; CHECK-LABEL: test_mask_sub_epi32_rmkz_256:
; [18 lines collapsed in the diff view]		; [18 lines collapsed in the diff view]
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_sub_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_sub_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_sub_epi32_rmbk_256:		; CHECK-LABEL: test_mask_sub_epi32_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsubd (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0xfa,0x0f]		; CHECK-NEXT: vpsubd (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0xfa,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0
%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer		%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer
%res = call <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

; [10 lines collapsed in the diff view]		; [10 lines collapsed in the diff view]
ret <8 x i32> %res		ret <8 x i32> %res
}		}

declare <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.psub.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32> @test_mask_add_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {		define <8 x i32> @test_mask_add_epi32_rr_256(<8 x i32> %a, <8 x i32> %b) {
; CHECK-LABEL: test_mask_add_epi32_rr_256:		; CHECK-LABEL: test_mask_add_epi32_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc1]		; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.padd.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.padd.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_add_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_add_epi32_rrk_256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi32_rrk_256:		; CHECK-LABEL: test_mask_add_epi32_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xfe,0xd1]		; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xfe,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.padd.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.padd.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_add_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i8 %mask) {		define <8 x i32> @test_mask_add_epi32_rrkz_256(<8 x i32> %a, <8 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi32_rrkz_256:		; CHECK-LABEL: test_mask_add_epi32_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xfe,0xc1]		; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.padd.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.padd.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_add_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {		define <8 x i32> @test_mask_add_epi32_rm_256(<8 x i32> %a, <8 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_add_epi32_rm_256:		; CHECK-LABEL: test_mask_add_epi32_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpaddd (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0x07]		; CHECK-NEXT: vpaddd (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <8 x i32> @llvm.x86.avx512.mask.padd.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.padd.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> zeroinitializer, i8 -1)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_add_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_add_epi32_rmk_256(<8 x i32> %a, <8 x i32>* %ptr_b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi32_rmk_256:		; CHECK-LABEL: test_mask_add_epi32_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddd (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xfe,0x0f]		; CHECK-NEXT: vpaddd (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xfe,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load <8 x i32>, <8 x i32>* %ptr_b		%b = load <8 x i32>, <8 x i32>* %ptr_b
%res = call <8 x i32> @llvm.x86.avx512.mask.padd.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.padd.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_add_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i8 %mask) {		define <8 x i32> @test_mask_add_epi32_rmkz_256(<8 x i32> %a, <8 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi32_rmkz_256:		; CHECK-LABEL: test_mask_add_epi32_rmkz_256:
; [18 lines collapsed in the diff view]		; [18 lines collapsed in the diff view]
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @test_mask_add_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <8 x i32> %passThru, i8 %mask) {		define <8 x i32> @test_mask_add_epi32_rmbk_256(<8 x i32> %a, i32* %ptr_b, <8 x i32> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_add_epi32_rmbk_256:		; CHECK-LABEL: test_mask_add_epi32_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpaddd (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0xfe,0x0f]		; CHECK-NEXT: vpaddd (%rdi){1to8}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x39,0xfe,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i32, i32* %ptr_b		%q = load i32, i32* %ptr_b
%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0		%vecinit.i = insertelement <8 x i32> undef, i32 %q, i32 0
%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer		%b = shufflevector <8 x i32> %vecinit.i, <8 x i32> undef, <8 x i32> zeroinitializer
%res = call <8 x i32> @llvm.x86.avx512.mask.padd.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)		%res = call <8 x i32> @llvm.x86.avx512.mask.padd.d.256(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passThru, i8 %mask)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

; [22 lines collapsed in the diff view]		; [22 lines collapsed in the diff view]
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mm512_mask_add_ps_256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask) {		define <8 x float> @test_mm512_mask_add_ps_256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask) {
; CHECK-LABEL: test_mm512_mask_add_ps_256:		; CHECK-LABEL: test_mm512_mask_add_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vaddps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x58,0xd1]		; CHECK-NEXT: vaddps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x58,0xd1]
; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc2]		; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.add.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.add.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mm512_add_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {		define <8 x float> @test_mm512_add_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_add_ps_256:		; CHECK-LABEL: test_mm512_add_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vaddps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.add.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.add.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}
declare <8 x float> @llvm.x86.avx512.mask.add.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.add.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <4 x float> @test_mm512_maskz_add_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_mm512_maskz_add_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_maskz_add_ps_128:		; CHECK-LABEL: test_mm512_maskz_add_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.add.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.add.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mm512_mask_add_ps_128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask) {		define <4 x float> @test_mm512_mask_add_ps_128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask) {
; CHECK-LABEL: test_mm512_mask_add_ps_128:		; CHECK-LABEL: test_mm512_mask_add_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x58,0xd1]		; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x58,0xd1]
; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc2]		; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.add.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.add.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mm512_add_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_mm512_add_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_add_ps_128:		; CHECK-LABEL: test_mm512_add_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.add.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.add.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}
declare <4 x float> @llvm.x86.avx512.mask.add.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.add.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <8 x float> @test_mm512_maskz_sub_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {		define <8 x float> @test_mm512_maskz_sub_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_maskz_sub_ps_256:		; CHECK-LABEL: test_mm512_maskz_sub_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vsubps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x5c,0xc1]		; CHECK-NEXT: vsubps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x5c,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.sub.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.sub.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mm512_mask_sub_ps_256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask) {		define <8 x float> @test_mm512_mask_sub_ps_256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask) {
; CHECK-LABEL: test_mm512_mask_sub_ps_256:		; CHECK-LABEL: test_mm512_mask_sub_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vsubps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x5c,0xd1]		; CHECK-NEXT: vsubps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x5c,0xd1]
; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc2]		; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.sub.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.sub.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mm512_sub_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {		define <8 x float> @test_mm512_sub_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_sub_ps_256:		; CHECK-LABEL: test_mm512_sub_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vsubps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x5c,0xc1]		; CHECK-NEXT: vsubps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x5c,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.sub.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.sub.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}
declare <8 x float> @llvm.x86.avx512.mask.sub.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.sub.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <4 x float> @test_mm512_maskz_sub_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_mm512_maskz_sub_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_maskz_sub_ps_128:		; CHECK-LABEL: test_mm512_maskz_sub_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vsubps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x5c,0xc1]		; CHECK-NEXT: vsubps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x5c,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.sub.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.sub.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mm512_mask_sub_ps_128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask) {		define <4 x float> @test_mm512_mask_sub_ps_128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask) {
; CHECK-LABEL: test_mm512_mask_sub_ps_128:		; CHECK-LABEL: test_mm512_mask_sub_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vsubps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x5c,0xd1]		; CHECK-NEXT: vsubps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x5c,0xd1]
; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc2]		; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.sub.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.sub.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mm512_sub_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_mm512_sub_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_sub_ps_128:		; CHECK-LABEL: test_mm512_sub_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vsubps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x5c,0xc1]		; CHECK-NEXT: vsubps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5c,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.sub.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.sub.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}
declare <4 x float> @llvm.x86.avx512.mask.sub.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.sub.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <8 x float> @test_mm512_maskz_mul_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {		define <8 x float> @test_mm512_maskz_mul_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_maskz_mul_ps_256:		; CHECK-LABEL: test_mm512_maskz_mul_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmulps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x59,0xc1]		; CHECK-NEXT: vmulps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x59,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.mul.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.mul.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mm512_mask_mul_ps_256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask) {		define <8 x float> @test_mm512_mask_mul_ps_256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask) {
; CHECK-LABEL: test_mm512_mask_mul_ps_256:		; CHECK-LABEL: test_mm512_mask_mul_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmulps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x59,0xd1]		; CHECK-NEXT: vmulps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x59,0xd1]
; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc2]		; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.mul.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.mul.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mm512_mul_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {		define <8 x float> @test_mm512_mul_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_mul_ps_256:		; CHECK-LABEL: test_mm512_mul_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmulps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x59,0xc1]		; CHECK-NEXT: vmulps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x59,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.mul.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.mul.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}
declare <8 x float> @llvm.x86.avx512.mask.mul.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.mul.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <4 x float> @test_mm512_maskz_mul_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_mm512_maskz_mul_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_maskz_mul_ps_128:		; CHECK-LABEL: test_mm512_maskz_mul_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmulps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x59,0xc1]		; CHECK-NEXT: vmulps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x59,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.mul.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.mul.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mm512_mask_mul_ps_128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask) {		define <4 x float> @test_mm512_mask_mul_ps_128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask) {
; CHECK-LABEL: test_mm512_mask_mul_ps_128:		; CHECK-LABEL: test_mm512_mask_mul_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmulps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x59,0xd1]		; CHECK-NEXT: vmulps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x59,0xd1]
; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc2]		; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.mul.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.mul.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mm512_mul_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_mm512_mul_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_mul_ps_128:		; CHECK-LABEL: test_mm512_mul_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmulps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x59,0xc1]		; CHECK-NEXT: vmulps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x59,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.mul.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.mul.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}
declare <4 x float> @llvm.x86.avx512.mask.mul.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.mul.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <8 x float> @test_mm512_maskz_div_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {		define <8 x float> @test_mm512_maskz_div_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_maskz_div_ps_256:		; CHECK-LABEL: test_mm512_maskz_div_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vdivps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x5e,0xc1]		; CHECK-NEXT: vdivps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x5e,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.div.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.div.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mm512_mask_div_ps_256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask) {		define <8 x float> @test_mm512_mask_div_ps_256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask) {
; CHECK-LABEL: test_mm512_mask_div_ps_256:		; CHECK-LABEL: test_mm512_mask_div_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vdivps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x5e,0xd1]		; CHECK-NEXT: vdivps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x5e,0xd1]
; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc2]		; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.div.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.div.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mm512_div_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {		define <8 x float> @test_mm512_div_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_div_ps_256:		; CHECK-LABEL: test_mm512_div_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vdivps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x5e,0xc1]		; CHECK-NEXT: vdivps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x5e,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.div.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.div.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}
declare <8 x float> @llvm.x86.avx512.mask.div.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.div.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <4 x float> @test_mm512_maskz_div_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_mm512_maskz_div_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_maskz_div_ps_128:		; CHECK-LABEL: test_mm512_maskz_div_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vdivps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x5e,0xc1]		; CHECK-NEXT: vdivps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x5e,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.div.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.div.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mm512_mask_div_ps_128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask) {		define <4 x float> @test_mm512_mask_div_ps_128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask) {
; CHECK-LABEL: test_mm512_mask_div_ps_128:		; CHECK-LABEL: test_mm512_mask_div_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vdivps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x5e,0xd1]		; CHECK-NEXT: vdivps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x5e,0xd1]
; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc2]		; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.div.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.div.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mm512_div_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_mm512_div_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_div_ps_128:		; CHECK-LABEL: test_mm512_div_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vdivps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x5e,0xc1]		; CHECK-NEXT: vdivps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5e,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.div.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.div.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}
declare <4 x float> @llvm.x86.avx512.mask.div.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.div.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

declare <2 x double> @llvm.x86.avx512.mask.shuf.pd.128(<2 x double>, <2 x double>, i32, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.shuf.pd.128(<2 x double>, <2 x double>, i32, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_shuf_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x3, i8 %x4) {		define <2 x double>@test_int_x86_avx512_mask_shuf_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_shuf_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_shuf_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vshufpd $1, %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0xc6,0xd9,0x01]		; CHECK-NEXT: vshufpd $1, %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xc6,0xd9,0x01]
; CHECK-NEXT: ## xmm3 = xmm0[1],xmm1[0]		; CHECK-NEXT: ## xmm3 = xmm0[1],xmm1[0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vshufpd $1, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xc6,0xd1,0x01]		; CHECK-NEXT: vshufpd $1, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xc6,0xd1,0x01]
; CHECK-NEXT: ## xmm2 {%k1} = xmm0[1],xmm1[0]		; CHECK-NEXT: ## xmm2 {%k1} = xmm0[1],xmm1[0]
; CHECK-NEXT: vshufpd $1, %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0xc6,0xc1,0x01]		; CHECK-NEXT: vshufpd $1, %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0xc6,0xc1,0x01]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[1],xmm1[0]		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[1],xmm1[0]
; CHECK-NEXT: vaddpd %xmm3, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0xed,0x08,0x58,0xcb]		; CHECK-NEXT: vaddpd %xmm3, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x58,0xcb]
; CHECK-NEXT: vaddpd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.shuf.pd.128(<2 x double> %x0, <2 x double> %x1, i32 1, <2 x double> %x3, i8 %x4)		%res = call <2 x double> @llvm.x86.avx512.mask.shuf.pd.128(<2 x double> %x0, <2 x double> %x1, i32 1, <2 x double> %x3, i8 %x4)
%res1 = call <2 x double> @llvm.x86.avx512.mask.shuf.pd.128(<2 x double> %x0, <2 x double> %x1, i32 1, <2 x double> %x3, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.shuf.pd.128(<2 x double> %x0, <2 x double> %x1, i32 1, <2 x double> %x3, i8 -1)
%res2 = call <2 x double> @llvm.x86.avx512.mask.shuf.pd.128(<2 x double> %x0, <2 x double> %x1, i32 1, <2 x double> zeroinitializer, i8 %x4)		%res2 = call <2 x double> @llvm.x86.avx512.mask.shuf.pd.128(<2 x double> %x0, <2 x double> %x1, i32 1, <2 x double> zeroinitializer, i8 %x4)
%res3 = fadd <2 x double> %res, %res1		%res3 = fadd <2 x double> %res, %res1
%res4 = fadd <2 x double> %res2, %res3		%res4 = fadd <2 x double> %res2, %res3
ret <2 x double> %res4		ret <2 x double> %res4
}		}

declare <4 x double> @llvm.x86.avx512.mask.shuf.pd.256(<4 x double>, <4 x double>, i32, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.shuf.pd.256(<4 x double>, <4 x double>, i32, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_shuf_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x3, i8 %x4) {		define <4 x double>@test_int_x86_avx512_mask_shuf_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_shuf_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_shuf_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vshufpd $6, %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0xc6,0xd9,0x06]		; CHECK-NEXT: vshufpd $6, %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xc6,0xd9,0x06]
; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[1],ymm0[3],ymm1[2]		; CHECK-NEXT: ## ymm3 = ymm0[0],ymm1[1],ymm0[3],ymm1[2]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vshufpd $6, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xc6,0xd1,0x06]		; CHECK-NEXT: vshufpd $6, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xc6,0xd1,0x06]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0],ymm1[1],ymm0[3],ymm1[2]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0],ymm1[1],ymm0[3],ymm1[2]
; CHECK-NEXT: vaddpd %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0x58,0xc3]		; CHECK-NEXT: vaddpd %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.shuf.pd.256(<4 x double> %x0, <4 x double> %x1, i32 6, <4 x double> %x3, i8 %x4)		%res = call <4 x double> @llvm.x86.avx512.mask.shuf.pd.256(<4 x double> %x0, <4 x double> %x1, i32 6, <4 x double> %x3, i8 %x4)
%res1 = call <4 x double> @llvm.x86.avx512.mask.shuf.pd.256(<4 x double> %x0, <4 x double> %x1, i32 6, <4 x double> %x3, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.shuf.pd.256(<4 x double> %x0, <4 x double> %x1, i32 6, <4 x double> %x3, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask.shuf.ps.128(<4 x float>, <4 x float>, i32, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.shuf.ps.128(<4 x float>, <4 x float>, i32, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_shuf_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x3, i8 %x4) {		define <4 x float>@test_int_x86_avx512_mask_shuf_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_shuf_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_shuf_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vshufps $22, %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0xc6,0xd9,0x16]		; CHECK-NEXT: vshufps $22, %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0xc6,0xd9,0x16]
; CHECK-NEXT: ## xmm3 = xmm0[2,1],xmm1[1,0]		; CHECK-NEXT: ## xmm3 = xmm0[2,1],xmm1[1,0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vshufps $22, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0xc6,0xd1,0x16]		; CHECK-NEXT: vshufps $22, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0xc6,0xd1,0x16]
; CHECK-NEXT: ## xmm2 {%k1} = xmm0[2,1],xmm1[1,0]		; CHECK-NEXT: ## xmm2 {%k1} = xmm0[2,1],xmm1[1,0]
; CHECK-NEXT: vaddps %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6c,0x08,0x58,0xc3]		; CHECK-NEXT: vaddps %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe8,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.shuf.ps.128(<4 x float> %x0, <4 x float> %x1, i32 22, <4 x float> %x3, i8 %x4)		%res = call <4 x float> @llvm.x86.avx512.mask.shuf.ps.128(<4 x float> %x0, <4 x float> %x1, i32 22, <4 x float> %x3, i8 %x4)
%res1 = call <4 x float> @llvm.x86.avx512.mask.shuf.ps.128(<4 x float> %x0, <4 x float> %x1, i32 22, <4 x float> %x3, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.shuf.ps.128(<4 x float> %x0, <4 x float> %x1, i32 22, <4 x float> %x3, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask.shuf.ps.256(<8 x float>, <8 x float>, i32, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.shuf.ps.256(<8 x float>, <8 x float>, i32, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_shuf_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x3, i8 %x4) {		define <8 x float>@test_int_x86_avx512_mask_shuf_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_shuf_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_shuf_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vshufps $22, %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0xc6,0xd9,0x16]		; CHECK-NEXT: vshufps $22, %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0xc6,0xd9,0x16]
; CHECK-NEXT: ## ymm3 = ymm0[2,1],ymm1[1,0],ymm0[6,5],ymm1[5,4]		; CHECK-NEXT: ## ymm3 = ymm0[2,1],ymm1[1,0],ymm0[6,5],ymm1[5,4]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vshufps $22, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0xc6,0xd1,0x16]		; CHECK-NEXT: vshufps $22, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0xc6,0xd1,0x16]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[2,1],ymm1[1,0],ymm0[6,5],ymm1[5,4]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[2,1],ymm1[1,0],ymm0[6,5],ymm1[5,4]
; CHECK-NEXT: vaddps %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6c,0x28,0x58,0xc3]		; CHECK-NEXT: vaddps %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xec,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.shuf.ps.256(<8 x float> %x0, <8 x float> %x1, i32 22, <8 x float> %x3, i8 %x4)		%res = call <8 x float> @llvm.x86.avx512.mask.shuf.ps.256(<8 x float> %x0, <8 x float> %x1, i32 22, <8 x float> %x3, i8 %x4)
%res1 = call <8 x float> @llvm.x86.avx512.mask.shuf.ps.256(<8 x float> %x0, <8 x float> %x1, i32 22, <8 x float> %x3, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.shuf.ps.256(<8 x float> %x0, <8 x float> %x1, i32 22, <8 x float> %x3, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pmaxs.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pmaxs.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pmaxs_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %mask) {		define <4 x i32>@test_int_x86_avx512_mask_pmaxs_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxsd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x3d,0xd1]		; CHECK-NEXT: vpmaxsd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x3d,0xd1]
; CHECK-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x3d,0xc1]		; CHECK-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x3d,0xc1]
; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pmaxs.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2 ,i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pmaxs.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2 ,i8 %mask)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmaxs.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %mask)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmaxs.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %mask)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pmaxs.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pmaxs.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_pmaxs_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_pmaxs_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmaxsd %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0x7d,0x28,0x3d,0xd9]		; CHECK-NEXT: vpmaxsd %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x3d,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxsd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x3d,0xd1]		; CHECK-NEXT: vpmaxsd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x3d,0xd1]
; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc3]		; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pmaxs.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.pmaxs.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmaxs.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmaxs.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pmaxs.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pmaxs.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pmaxs_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_pmaxs_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmaxsq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x3d,0xd9]		; CHECK-NEXT: vpmaxsq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x3d,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxsq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x3d,0xd1]		; CHECK-NEXT: vpmaxsq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x3d,0xd1]
; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc3]		; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pmaxs.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.pmaxs.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmaxs.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmaxs.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res2 = add <2 x i64> %res, %res1		%res2 = add <2 x i64> %res, %res1
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pmaxs.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pmaxs.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pmaxs_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask) {		define <4 x i64>@test_int_x86_avx512_mask_pmaxs_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxs_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxsq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x3d,0xd1]		; CHECK-NEXT: vpmaxsq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x3d,0xd1]
; CHECK-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x3d,0xc1]		; CHECK-NEXT: vpmaxsq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x3d,0xc1]
; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pmaxs.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask)		%res = call <4 x i64> @llvm.x86.avx512.mask.pmaxs.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmaxs.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %mask)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmaxs.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %mask)
%res2 = add <4 x i64> %res, %res1		%res2 = add <4 x i64> %res, %res1
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pmaxu.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pmaxu.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pmaxu_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2,i8 %mask) {		define <4 x i32>@test_int_x86_avx512_mask_pmaxu_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2,i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxud %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x3f,0xd1]		; CHECK-NEXT: vpmaxud %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x3f,0xd1]
; CHECK-NEXT: vpmaxud %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x3f,0xc1]		; CHECK-NEXT: vpmaxud %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x3f,0xc1]
; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pmaxu.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pmaxu.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %mask)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmaxu.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %mask)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmaxu.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %mask)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pmaxu.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pmaxu.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_pmaxu_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_pmaxu_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmaxud %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0x7d,0x28,0x3f,0xd9]		; CHECK-NEXT: vpmaxud %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x3f,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxud %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x3f,0xd1]		; CHECK-NEXT: vpmaxud %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x3f,0xd1]
; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc3]		; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pmaxu.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.pmaxu.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmaxu.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmaxu.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pmaxu.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pmaxu.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pmaxu_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_pmaxu_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmaxuq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x3f,0xd9]		; CHECK-NEXT: vpmaxuq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x3f,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxuq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x3f,0xd1]		; CHECK-NEXT: vpmaxuq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x3f,0xd1]
; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc3]		; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pmaxu.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.pmaxu.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmaxu.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmaxu.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res2 = add <2 x i64> %res, %res1		%res2 = add <2 x i64> %res, %res1
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pmaxu.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pmaxu.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pmaxu_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask) {		define <4 x i64>@test_int_x86_avx512_mask_pmaxu_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmaxu_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmaxuq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x3f,0xd1]		; CHECK-NEXT: vpmaxuq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x3f,0xd1]
; CHECK-NEXT: vpmaxuq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x3f,0xc1]		; CHECK-NEXT: vpmaxuq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x3f,0xc1]
; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pmaxu.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask)		%res = call <4 x i64> @llvm.x86.avx512.mask.pmaxu.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmaxu.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %mask)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmaxu.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %mask)
%res2 = add <4 x i64> %res, %res1		%res2 = add <4 x i64> %res, %res1
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pmins.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pmins.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pmins_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %mask) {		define <4 x i32>@test_int_x86_avx512_mask_pmins_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmins_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmins_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminsd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x39,0xd1]		; CHECK-NEXT: vpminsd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x39,0xd1]
; CHECK-NEXT: vpminsd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x39,0xc1]		; CHECK-NEXT: vpminsd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x39,0xc1]
; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pmins.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pmins.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %mask)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmins.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %mask)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmins.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %mask)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pmins.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pmins.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_pmins_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_pmins_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmins_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmins_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpminsd %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0x7d,0x28,0x39,0xd9]		; CHECK-NEXT: vpminsd %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x39,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminsd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x39,0xd1]		; CHECK-NEXT: vpminsd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x39,0xd1]
; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc3]		; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pmins.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.pmins.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmins.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmins.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pmins.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pmins.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pmins_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_pmins_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmins_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmins_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpminsq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x39,0xd9]		; CHECK-NEXT: vpminsq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x39,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminsq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x39,0xd1]		; CHECK-NEXT: vpminsq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x39,0xd1]
; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc3]		; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pmins.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.pmins.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmins.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmins.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res2 = add <2 x i64> %res, %res1		%res2 = add <2 x i64> %res, %res1
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pmins.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pmins.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pmins_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask) {		define <4 x i64>@test_int_x86_avx512_mask_pmins_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmins_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmins_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminsq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x39,0xd1]		; CHECK-NEXT: vpminsq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x39,0xd1]
; CHECK-NEXT: vpminsq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x39,0xc1]		; CHECK-NEXT: vpminsq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x39,0xc1]
; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pmins.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask)		%res = call <4 x i64> @llvm.x86.avx512.mask.pmins.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmins.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %mask)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmins.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %mask)
%res2 = add <4 x i64> %res, %res1		%res2 = add <4 x i64> %res, %res1
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pminu.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pminu.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pminu_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %mask) {		define <4 x i32>@test_int_x86_avx512_mask_pminu_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pminu_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pminu_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminud %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x3b,0xd1]		; CHECK-NEXT: vpminud %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x3b,0xd1]
; CHECK-NEXT: vpminud %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x3b,0xc1]		; CHECK-NEXT: vpminud %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x3b,0xc1]
; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pminu.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %mask)		%res = call <4 x i32> @llvm.x86.avx512.mask.pminu.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %mask)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pminu.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %mask)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pminu.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %mask)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pminu.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pminu.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_pminu_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_pminu_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pminu_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pminu_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpminud %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0x7d,0x28,0x3b,0xd9]		; CHECK-NEXT: vpminud %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x3b,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminud %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x3b,0xd1]		; CHECK-NEXT: vpminud %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x3b,0xd1]
; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc3]		; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pminu.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.pminu.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pminu.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pminu.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pminu.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pminu.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pminu_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_pminu_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pminu_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pminu_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpminuq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x3b,0xd9]		; CHECK-NEXT: vpminuq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x3b,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminuq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x3b,0xd1]		; CHECK-NEXT: vpminuq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x3b,0xd1]
; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc3]		; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pminu.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.pminu.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pminu.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pminu.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res2 = add <2 x i64> %res, %res1		%res2 = add <2 x i64> %res, %res1
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pminu.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pminu.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pminu_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask) {		define <4 x i64>@test_int_x86_avx512_mask_pminu_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pminu_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pminu_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpminuq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x3b,0xd1]		; CHECK-NEXT: vpminuq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x3b,0xd1]
; CHECK-NEXT: vpminuq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x3b,0xc1]		; CHECK-NEXT: vpminuq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x3b,0xc1]
; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pminu.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask)		%res = call <4 x i64> @llvm.x86.avx512.mask.pminu.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %mask)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pminu.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %mask)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pminu.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %mask)
%res2 = add <4 x i64> %res, %res1		%res2 = add <4 x i64> %res, %res1
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}

declare <2 x i64> @llvm.x86.avx512.mask.psrl.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.psrl.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_psrl_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_psrl_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrl_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psrl_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0xd3,0xd9]		; CHECK-NEXT: vpsrlq %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd3,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrlq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xd3,0xd1]		; CHECK-NEXT: vpsrlq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xd3,0xd1]
; CHECK-NEXT: vpsrlq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0xd3,0xc1]		; CHECK-NEXT: vpsrlq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0xd3,0xc1]
; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xcb]		; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xcb]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.psrl.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.psrl.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.psrl.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.psrl.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.psrl.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.psrl.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.psrl.q.256(<4 x i64>, <2 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.psrl.q.256(<4 x i64>, <2 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_psrl_q_256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_psrl_q_256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrl_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psrl_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlq %xmm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0xd3,0xd9]		; CHECK-NEXT: vpsrlq %xmm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd3,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrlq %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xd3,0xd1]		; CHECK-NEXT: vpsrlq %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xd3,0xd1]
; CHECK-NEXT: vpsrlq %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0xd3,0xc1]		; CHECK-NEXT: vpsrlq %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0xd3,0xc1]
; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xcb]		; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xcb]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.psrl.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.psrl.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.psrl.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 -1)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.psrl.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 -1)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.psrl.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.psrl.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.psrl.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.psrl.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_psrl_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_psrl_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrl_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psrl_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrld %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0xd2,0xd9]		; CHECK-NEXT: vpsrld %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd2,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrld %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd2,0xd1]		; CHECK-NEXT: vpsrld %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xd2,0xd1]
; CHECK-NEXT: vpsrld %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd2,0xc1]		; CHECK-NEXT: vpsrld %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xd2,0xc1]
; CHECK-NEXT: vpaddd %xmm3, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xcb]		; CHECK-NEXT: vpaddd %xmm3, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xcb]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.psrl.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.psrl.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.psrl.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.psrl.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.psrl.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.psrl.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.psrl.d.256(<8 x i32>, <4 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.psrl.d.256(<8 x i32>, <4 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_psrl_d_256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_psrl_d_256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrl_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psrl_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrld %xmm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0xd2,0xd9]		; CHECK-NEXT: vpsrld %xmm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd2,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrld %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd2,0xd1]		; CHECK-NEXT: vpsrld %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xd2,0xd1]
; CHECK-NEXT: vpsrld %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xd2,0xc1]		; CHECK-NEXT: vpsrld %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xd2,0xc1]
; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xcb]		; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xcb]
; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc1]		; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.psrl.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.psrl.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.psrl.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.psrl.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 -1)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.psrl.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.psrl.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res2, %res3		%res4 = add <8 x i32> %res2, %res3
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.psra.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.psra.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_psra_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_psra_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psra_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psra_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrad %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0xe2,0xd9]		; CHECK-NEXT: vpsrad %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe2,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrad %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe2,0xd1]		; CHECK-NEXT: vpsrad %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xe2,0xd1]
; CHECK-NEXT: vpsrad %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xe2,0xc1]		; CHECK-NEXT: vpsrad %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xe2,0xc1]
; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc0]
; CHECK-NEXT: vpaddd %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc3]		; CHECK-NEXT: vpaddd %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.psra.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.psra.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.psra.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.psra.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.psra.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.psra.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.psra.d.256(<8 x i32>, <4 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.psra.d.256(<8 x i32>, <4 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_psra_d_256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_psra_d_256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psra_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psra_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrad %xmm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0xe2,0xd9]		; CHECK-NEXT: vpsrad %xmm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe2,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrad %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe2,0xd1]		; CHECK-NEXT: vpsrad %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xe2,0xd1]
; CHECK-NEXT: vpsrad %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xe2,0xc1]		; CHECK-NEXT: vpsrad %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xe2,0xc1]
; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc3]		; CHECK-NEXT: vpaddd %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.psra.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.psra.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.psra.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.psra.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.psra.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.psra.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.psll.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.psll.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_psll_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_psll_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psll_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psll_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpslld %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7d,0x08,0xf2,0xd9]		; CHECK-NEXT: vpslld %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf2,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpslld %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xf2,0xd1]		; CHECK-NEXT: vpslld %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0xf2,0xd1]
; CHECK-NEXT: vpslld %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xf2,0xc1]		; CHECK-NEXT: vpslld %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0xf2,0xc1]
; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc0]
; CHECK-NEXT: vpaddd %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc3]		; CHECK-NEXT: vpaddd %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.psll.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.psll.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.psll.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.psll.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.psll.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.psll.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.psll.d.256(<8 x i32>, <4 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.psll.d.256(<8 x i32>, <4 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_psll_d_256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_psll_d_256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psll_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psll_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpslld %xmm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7d,0x28,0xf2,0xd9]		; CHECK-NEXT: vpslld %xmm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf2,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpslld %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xf2,0xd1]		; CHECK-NEXT: vpslld %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0xf2,0xd1]
; CHECK-NEXT: vpslld %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xf2,0xc1]		; CHECK-NEXT: vpslld %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0xf2,0xc1]
; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc3]		; CHECK-NEXT: vpaddd %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.psll.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.psll.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.psll.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.psll.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.psll.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.psll.d.256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x2, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.psll.q.256(<4 x i64>, <2 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.psll.q.256(<4 x i64>, <2 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_psll_q_256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_psll_q_256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psll_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psll_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsllq %xmm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0xf3,0xd9]		; CHECK-NEXT: vpsllq %xmm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf3,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsllq %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xf3,0xd1]		; CHECK-NEXT: vpsllq %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xf3,0xd1]
; CHECK-NEXT: vpsllq %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0xf3,0xc1]		; CHECK-NEXT: vpsllq %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0xf3,0xc1]
; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc3]		; CHECK-NEXT: vpaddq %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.psll.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.psll.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.psll.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.psll.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.psll.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.psll.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.psrl.qi.128(<2 x i64>, i32, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.psrl.qi.128(<2 x i64>, i32, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_psrl_qi_128(<2 x i64> %x0, i32 %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_psrl_qi_128(<2 x i64> %x0, i32 %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrl_qi_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psrl_qi_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlq $3, %xmm0, %xmm2 ## encoding: [0x62,0xf1,0xed,0x08,0x73,0xd0,0x03]		; CHECK-NEXT: vpsrlq $3, %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x73,0xd0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsrlq $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x09,0x73,0xd0,0x03]		; CHECK-NEXT: vpsrlq $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x09,0x73,0xd0,0x03]
; CHECK-NEXT: vpsrlq $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x73,0xd0,0x03]		; CHECK-NEXT: vpsrlq $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x73,0xd0,0x03]
; CHECK-NEXT: vpaddq %xmm2, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xca]		; CHECK-NEXT: vpaddq %xmm2, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xca]
; CHECK-NEXT: vpaddq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc1]		; CHECK-NEXT: vpaddq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.psrl.qi.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.psrl.qi.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.psrl.qi.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 -1)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.psrl.qi.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 -1)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.psrl.qi.128(<2 x i64> %x0, i32 3, <2 x i64> zeroinitializer, i8 %x3)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.psrl.qi.128(<2 x i64> %x0, i32 3, <2 x i64> zeroinitializer, i8 %x3)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res2, %res3		%res4 = add <2 x i64> %res2, %res3
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.psrl.qi.256(<4 x i64>, i32, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.psrl.qi.256(<4 x i64>, i32, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_psrl_qi_256(<4 x i64> %x0, i32 %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_psrl_qi_256(<4 x i64> %x0, i32 %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrl_qi_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psrl_qi_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlq $3, %ymm0, %ymm2 ## encoding: [0x62,0xf1,0xed,0x28,0x73,0xd0,0x03]		; CHECK-NEXT: vpsrlq $3, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x73,0xd0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsrlq $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x29,0x73,0xd0,0x03]		; CHECK-NEXT: vpsrlq $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x29,0x73,0xd0,0x03]
; CHECK-NEXT: vpsrlq $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x73,0xd0,0x03]		; CHECK-NEXT: vpsrlq $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x73,0xd0,0x03]
; CHECK-NEXT: vpaddq %ymm2, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xca]		; CHECK-NEXT: vpaddq %ymm2, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xca]
; CHECK-NEXT: vpaddq %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc1]		; CHECK-NEXT: vpaddq %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.psrl.qi.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.psrl.qi.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.psrl.qi.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 -1)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.psrl.qi.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 -1)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.psrl.qi.256(<4 x i64> %x0, i32 3, <4 x i64> zeroinitializer, i8 %x3)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.psrl.qi.256(<4 x i64> %x0, i32 3, <4 x i64> zeroinitializer, i8 %x3)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res2, %res3		%res4 = add <4 x i64> %res2, %res3
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.psrl.di.128(<4 x i32>, i32, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.psrl.di.128(<4 x i32>, i32, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_psrl_di_128(<4 x i32> %x0, i32 %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_psrl_di_128(<4 x i32> %x0, i32 %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrl_di_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psrl_di_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrld $3, %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0x72,0xd0,0x03]		; CHECK-NEXT: vpsrld $3, %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x72,0xd0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsrld $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x72,0xd0,0x03]		; CHECK-NEXT: vpsrld $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x72,0xd0,0x03]
; CHECK-NEXT: vpsrld $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x72,0xd0,0x03]		; CHECK-NEXT: vpsrld $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x72,0xd0,0x03]
; CHECK-NEXT: vpaddd %xmm2, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xca]		; CHECK-NEXT: vpaddd %xmm2, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xca]
; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc1]		; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.psrl.di.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.psrl.di.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.psrl.di.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.psrl.di.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 -1)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.psrl.di.128(<4 x i32> %x0, i32 3, <4 x i32> zeroinitializer, i8 %x3)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.psrl.di.128(<4 x i32> %x0, i32 3, <4 x i32> zeroinitializer, i8 %x3)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res2, %res3		%res4 = add <4 x i32> %res2, %res3
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.psrl.di.256(<8 x i32>, i32, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.psrl.di.256(<8 x i32>, i32, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_psrl_di_256(<8 x i32> %x0, i32 %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_psrl_di_256(<8 x i32> %x0, i32 %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrl_di_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psrl_di_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrld $3, %ymm0, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0x72,0xd0,0x03]		; CHECK-NEXT: vpsrld $3, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x72,0xd0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsrld $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x72,0xd0,0x03]		; CHECK-NEXT: vpsrld $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x72,0xd0,0x03]
; CHECK-NEXT: vpsrld $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x72,0xd0,0x03]		; CHECK-NEXT: vpsrld $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x72,0xd0,0x03]
; CHECK-NEXT: vpaddd %ymm2, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xca]		; CHECK-NEXT: vpaddd %ymm2, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xca]
; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc1]		; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.psrl.di.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.psrl.di.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.psrl.di.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.psrl.di.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 -1)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.psrl.di.256(<8 x i32> %x0, i32 3, <8 x i32> zeroinitializer, i8 %x3)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.psrl.di.256(<8 x i32> %x0, i32 3, <8 x i32> zeroinitializer, i8 %x3)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res2, %res3		%res4 = add <8 x i32> %res2, %res3
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.psll.di.128(<4 x i32>, i32, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.psll.di.128(<4 x i32>, i32, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_psll_di_128(<4 x i32> %x0, i32 %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_psll_di_128(<4 x i32> %x0, i32 %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psll_di_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psll_di_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpslld $3, %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0x72,0xf0,0x03]		; CHECK-NEXT: vpslld $3, %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x72,0xf0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpslld $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x72,0xf0,0x03]		; CHECK-NEXT: vpslld $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x72,0xf0,0x03]
; CHECK-NEXT: vpslld $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x72,0xf0,0x03]		; CHECK-NEXT: vpslld $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x72,0xf0,0x03]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc2]		; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.psll.di.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.psll.di.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.psll.di.128(<4 x i32> %x0, i32 3, <4 x i32> zeroinitializer, i8 %x3)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.psll.di.128(<4 x i32> %x0, i32 3, <4 x i32> zeroinitializer, i8 %x3)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.psll.di.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.psll.di.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.psll.di.256(<8 x i32>, i32, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.psll.di.256(<8 x i32>, i32, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_psll_di_256(<8 x i32> %x0, i32 %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_psll_di_256(<8 x i32> %x0, i32 %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psll_di_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psll_di_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpslld $3, %ymm0, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0x72,0xf0,0x03]		; CHECK-NEXT: vpslld $3, %ymm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x72,0xf0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpslld $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x72,0xf0,0x03]		; CHECK-NEXT: vpslld $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x72,0xf0,0x03]
; CHECK-NEXT: vpslld $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x72,0xf0,0x03]		; CHECK-NEXT: vpslld $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x72,0xf0,0x03]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc2]		; CHECK-NEXT: vpaddd %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.psll.di.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.psll.di.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.psll.di.256(<8 x i32> %x0, i32 3, <8 x i32> zeroinitializer, i8 %x3)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.psll.di.256(<8 x i32> %x0, i32 3, <8 x i32> zeroinitializer, i8 %x3)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.psll.di.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.psll.di.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.psrlv2.di(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.psrlv2.di(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_psrlv2_di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_psrlv2_di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrlv2_di:		; CHECK-LABEL: test_int_x86_avx512_mask_psrlv2_di:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlvq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x45,0xd9]		; CHECK-NEXT: vpsrlvq %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0x45,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrlvq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x45,0xd1]		; CHECK-NEXT: vpsrlvq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x45,0xd1]
; CHECK-NEXT: vpsrlvq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x45,0xc1]		; CHECK-NEXT: vpsrlvq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x45,0xc1]
; CHECK-NEXT: vpaddq %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc0]
; CHECK-NEXT: vpaddq %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc3]		; CHECK-NEXT: vpaddq %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.psrlv2.di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.psrlv2.di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.psrlv2.di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.psrlv2.di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.psrlv2.di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.psrlv2.di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.psrlv4.di(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.psrlv4.di(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_psrlv4_di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_psrlv4_di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrlv4_di:		; CHECK-LABEL: test_int_x86_avx512_mask_psrlv4_di:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlvq %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0xfd,0x28,0x45,0xd9]		; CHECK-NEXT: vpsrlvq %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0x45,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrlvq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x45,0xd1]		; CHECK-NEXT: vpsrlvq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x45,0xd1]
; CHECK-NEXT: vpsrlvq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x45,0xc1]		; CHECK-NEXT: vpsrlvq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x45,0xc1]
; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc3]		; CHECK-NEXT: vpaddq %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.psrlv4.di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.psrlv4.di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.psrlv4.di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.psrlv4.di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.psrlv4.di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.psrlv4.di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.psrlv4.si(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.psrlv4.si(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_psrlv4_si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_psrlv4_si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrlv4_si:		; CHECK-LABEL: test_int_x86_avx512_mask_psrlv4_si:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlvd %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0x7d,0x08,0x45,0xd9]		; CHECK-NEXT: vpsrlvd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x45,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrlvd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x45,0xd1]		; CHECK-NEXT: vpsrlvd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x45,0xd1]
; CHECK-NEXT: vpsrlvd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x45,0xc1]		; CHECK-NEXT: vpsrlvd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x45,0xc1]
; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc0]
; CHECK-NEXT: vpaddd %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc3]		; CHECK-NEXT: vpaddd %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.psrlv4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.psrlv4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.psrlv4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.psrlv4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.psrlv4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.psrlv4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.psrlv8.si(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.psrlv8.si(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_psrlv8_si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_psrlv8_si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrlv8_si:		; CHECK-LABEL: test_int_x86_avx512_mask_psrlv8_si:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsrlvd %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0x7d,0x28,0x45,0xd9]		; CHECK-NEXT: vpsrlvd %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x45,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsrlvd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x45,0xd1]		; CHECK-NEXT: vpsrlvd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x45,0xd1]
; CHECK-NEXT: vpsrlvd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x45,0xc1]		; CHECK-NEXT: vpsrlvd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x45,0xc1]
; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc3]		; CHECK-NEXT: vpaddd %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.psrlv8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.psrlv8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.psrlv8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.psrlv8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.psrlv8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.psrlv8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.psrav4.si(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.psrav4.si(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_psrav4_si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_psrav4_si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrav4_si:		; CHECK-LABEL: test_int_x86_avx512_mask_psrav4_si:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsravd %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0x7d,0x08,0x46,0xd9]		; CHECK-NEXT: vpsravd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x46,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsravd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x46,0xd1]		; CHECK-NEXT: vpsravd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x46,0xd1]
; CHECK-NEXT: vpsravd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x46,0xc1]		; CHECK-NEXT: vpsravd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x46,0xc1]
; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc0]
; CHECK-NEXT: vpaddd %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc3]		; CHECK-NEXT: vpaddd %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.psrav4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.psrav4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.psrav4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.psrav4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.psrav4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.psrav4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.psrav8.si(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.psrav8.si(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_psrav8_si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_psrav8_si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrav8_si:		; CHECK-LABEL: test_int_x86_avx512_mask_psrav8_si:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsravd %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0x7d,0x28,0x46,0xd9]		; CHECK-NEXT: vpsravd %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x46,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsravd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x46,0xd1]		; CHECK-NEXT: vpsravd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x46,0xd1]
; CHECK-NEXT: vpsravd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x46,0xc1]		; CHECK-NEXT: vpsravd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x46,0xc1]
; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc3]		; CHECK-NEXT: vpaddd %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.psrav8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.psrav8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.psrav8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.psrav8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.psrav8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.psrav8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

define <8 x i32>@test_int_x86_avx512_mask_psrav8_si_const() {		define <8 x i32>@test_int_x86_avx512_mask_psrav8_si_const() {
; CHECK-LABEL: test_int_x86_avx512_mask_psrav8_si_const:		; CHECK-LABEL: test_int_x86_avx512_mask_psrav8_si_const:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqa32 {{.*#+}} ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]		; CHECK-NEXT: vmovdqa {{.*}}(%rip), %ymm0 ## EVEX TO VEX Compression ymm0 = [2,9,4294967284,23,4294967270,37,4294967256,51]
; CHECK-NEXT: ## encoding: [0x62,0xf1,0x7d,0x28,0x6f,0x05,A,A,A,A]		; CHECK-NEXT: ## encoding: [0xc5,0xfd,0x6f,0x05,A,A,A,A]
; CHECK-NEXT: ## fixup A - offset: 6, value: LCPI276_0-4, kind: reloc_riprel_4byte		; CHECK-NEXT: ## fixup A - offset: 4, value: LCPI276_0-4, kind: reloc_riprel_4byte
; CHECK-NEXT: vpsravd {{.*}}(%rip), %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x46,0x05,A,A,A,A]		; CHECK-NEXT: vpsravd {{.*}}(%rip), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x46,0x05,A,A,A,A]
; CHECK-NEXT: ## fixup A - offset: 6, value: LCPI276_1-4, kind: reloc_riprel_4byte		; CHECK-NEXT: ## fixup A - offset: 5, value: LCPI276_1-4, kind: reloc_riprel_4byte
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.psrav8.si(<8 x i32> <i32 2, i32 9, i32 -12, i32 23, i32 -26, i32 37, i32 -40, i32 51>, <8 x i32> <i32 1, i32 18, i32 35, i32 52, i32 69, i32 15, i32 32, i32 49>, <8 x i32> zeroinitializer, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.psrav8.si(<8 x i32> <i32 2, i32 9, i32 -12, i32 23, i32 -26, i32 37, i32 -40, i32 51>, <8 x i32> <i32 1, i32 18, i32 35, i32 52, i32 69, i32 15, i32 32, i32 49>, <8 x i32> zeroinitializer, i8 -1)
ret <8 x i32> %res		ret <8 x i32> %res
}		}

declare <2 x i64> @llvm.x86.avx512.mask.psllv2.di(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.psllv2.di(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_psllv2_di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_psllv2_di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psllv2_di:		; CHECK-LABEL: test_int_x86_avx512_mask_psllv2_di:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsllvq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x47,0xd9]		; CHECK-NEXT: vpsllvq %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0x47,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsllvq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x47,0xd1]		; CHECK-NEXT: vpsllvq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x47,0xd1]
; CHECK-NEXT: vpsllvq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x47,0xc1]		; CHECK-NEXT: vpsllvq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x47,0xc1]
; CHECK-NEXT: vpaddq %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc0]
; CHECK-NEXT: vpaddq %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc3]		; CHECK-NEXT: vpaddq %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.psllv2.di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.psllv2.di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.psllv2.di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.psllv2.di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.psllv2.di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.psllv2.di(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.psllv4.di(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.psllv4.di(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_psllv4_di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_psllv4_di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psllv4_di:		; CHECK-LABEL: test_int_x86_avx512_mask_psllv4_di:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsllvq %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0xfd,0x28,0x47,0xd9]		; CHECK-NEXT: vpsllvq %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xfd,0x47,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsllvq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x47,0xd1]		; CHECK-NEXT: vpsllvq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x47,0xd1]
; CHECK-NEXT: vpsllvq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x47,0xc1]		; CHECK-NEXT: vpsllvq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x47,0xc1]
; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc3]		; CHECK-NEXT: vpaddq %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.psllv4.di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.psllv4.di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.psllv4.di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.psllv4.di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.psllv4.di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.psllv4.di(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.psllv4.si(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.psllv4.si(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_psllv4_si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_psllv4_si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psllv4_si:		; CHECK-LABEL: test_int_x86_avx512_mask_psllv4_si:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsllvd %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0x7d,0x08,0x47,0xd9]		; CHECK-NEXT: vpsllvd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x47,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsllvd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x47,0xd1]		; CHECK-NEXT: vpsllvd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x47,0xd1]
; CHECK-NEXT: vpsllvd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x47,0xc1]		; CHECK-NEXT: vpsllvd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x47,0xc1]
; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc0]
; CHECK-NEXT: vpaddd %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc3]		; CHECK-NEXT: vpaddd %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.psllv4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.psllv4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.psllv4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.psllv4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.psllv4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.psllv4.si(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.psllv8.si(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.psllv8.si(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_psllv8_si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_psllv8_si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psllv8_si:		; CHECK-LABEL: test_int_x86_avx512_mask_psllv8_si:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsllvd %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0x7d,0x28,0x47,0xd9]		; CHECK-NEXT: vpsllvd %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x47,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsllvd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x47,0xd1]		; CHECK-NEXT: vpsllvd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x47,0xd1]
; CHECK-NEXT: vpsllvd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x47,0xc1]		; CHECK-NEXT: vpsllvd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x47,0xc1]
; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc3]		; CHECK-NEXT: vpaddd %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.psllv8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.psllv8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.psllv8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.psllv8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.psllv8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.psllv8.si(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pmovzxb.d.128(<16 x i8>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pmovzxb.d.128(<16 x i8>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pmovzxb_d_128(<16 x i8> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_pmovzxb_d_128(<16 x i8> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxb_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxb_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovzxbd %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x31,0xd0]		; CHECK-NEXT: vpmovzxbd %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x31,0xd0]
; CHECK-NEXT: ## xmm2 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero		; CHECK-NEXT: ## xmm2 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovzxbd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x31,0xc8]		; CHECK-NEXT: vpmovzxbd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x31,0xc8]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero
; CHECK-NEXT: vpmovzxbd %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x31,0xc0]		; CHECK-NEXT: vpmovzxbd %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x31,0xc0]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc2]		; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pmovzxb.d.128(<16 x i8> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.pmovzxb.d.128(<16 x i8> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovzxb.d.128(<16 x i8> %x0, <4 x i32> zeroinitializer, i8 %x2)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovzxb.d.128(<16 x i8> %x0, <4 x i32> zeroinitializer, i8 %x2)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovzxb.d.128(<16 x i8> %x0, <4 x i32> %x1, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovzxb.d.128(<16 x i8> %x0, <4 x i32> %x1, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pmovzxb.d.256(<16 x i8>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pmovzxb.d.256(<16 x i8>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_pmovzxb_d_256(<16 x i8> %x0, <8 x i32> %x1, i8 %x2) {		define <8 x i32>@test_int_x86_avx512_mask_pmovzxb_d_256(<16 x i8> %x0, <8 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxb_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxb_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovzxbd %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x31,0xd0]		; CHECK-NEXT: vpmovzxbd %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x31,0xd0]
; CHECK-NEXT: ## ymm2 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero		; CHECK-NEXT: ## ymm2 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovzxbd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x31,0xc8]		; CHECK-NEXT: vpmovzxbd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x31,0xc8]
; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero		; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero
; CHECK-NEXT: vpmovzxbd %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x31,0xc0]		; CHECK-NEXT: vpmovzxbd %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x31,0xc0]
; CHECK-NEXT: ## ymm0 {%k1} {z} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero		; CHECK-NEXT: ## ymm0 {%k1} {z} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc2]		; CHECK-NEXT: vpaddd %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pmovzxb.d.256(<16 x i8> %x0, <8 x i32> %x1, i8 %x2)		%res = call <8 x i32> @llvm.x86.avx512.mask.pmovzxb.d.256(<16 x i8> %x0, <8 x i32> %x1, i8 %x2)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmovzxb.d.256(<16 x i8> %x0, <8 x i32> zeroinitializer, i8 %x2)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmovzxb.d.256(<16 x i8> %x0, <8 x i32> zeroinitializer, i8 %x2)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.pmovzxb.d.256(<16 x i8> %x0, <8 x i32> %x1, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.pmovzxb.d.256(<16 x i8> %x0, <8 x i32> %x1, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pmovzxb.q.128(<16 x i8>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pmovzxb.q.128(<16 x i8>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pmovzxb_q_128(<16 x i8> %x0, <2 x i64> %x1, i8 %x2) {		define <2 x i64>@test_int_x86_avx512_mask_pmovzxb_q_128(<16 x i8> %x0, <2 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxb_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxb_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovzxbq %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x32,0xd0]		; CHECK-NEXT: vpmovzxbq %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x32,0xd0]
; CHECK-NEXT: ## xmm2 = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero		; CHECK-NEXT: ## xmm2 = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovzxbq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x32,0xc8]		; CHECK-NEXT: vpmovzxbq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x32,0xc8]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero
; CHECK-NEXT: vpmovzxbq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x32,0xc0]		; CHECK-NEXT: vpmovzxbq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x32,0xc0]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc2]		; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pmovzxb.q.128(<16 x i8> %x0, <2 x i64> %x1, i8 %x2)		%res = call <2 x i64> @llvm.x86.avx512.mask.pmovzxb.q.128(<16 x i8> %x0, <2 x i64> %x1, i8 %x2)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmovzxb.q.128(<16 x i8> %x0, <2 x i64> zeroinitializer, i8 %x2)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmovzxb.q.128(<16 x i8> %x0, <2 x i64> zeroinitializer, i8 %x2)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.pmovzxb.q.128(<16 x i8> %x0, <2 x i64> %x1, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.pmovzxb.q.128(<16 x i8> %x0, <2 x i64> %x1, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pmovzxb.q.256(<16 x i8>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pmovzxb.q.256(<16 x i8>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pmovzxb_q_256(<16 x i8> %x0, <4 x i64> %x1, i8 %x2) {		define <4 x i64>@test_int_x86_avx512_mask_pmovzxb_q_256(<16 x i8> %x0, <4 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxb_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxb_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovzxbq %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x32,0xd0]		; CHECK-NEXT: vpmovzxbq %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x32,0xd0]
; CHECK-NEXT: ## ymm2 = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero		; CHECK-NEXT: ## ymm2 = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovzxbq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x32,0xc8]		; CHECK-NEXT: vpmovzxbq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x32,0xc8]
; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero		; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero
; CHECK-NEXT: vpmovzxbq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x32,0xc0]		; CHECK-NEXT: vpmovzxbq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x32,0xc0]
; CHECK-NEXT: ## ymm0 {%k1} {z} = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero		; CHECK-NEXT: ## ymm0 {%k1} {z} = xmm0[0],zero,zero,zero,zero,zero,zero,zero,xmm0[1],zero,zero,zero,zero,zero,zero,zero,xmm0[2],zero,zero,zero,zero,zero,zero,zero,xmm0[3],zero,zero,zero,zero,zero,zero,zero
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc2]		; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pmovzxb.q.256(<16 x i8> %x0, <4 x i64> %x1, i8 %x2)		%res = call <4 x i64> @llvm.x86.avx512.mask.pmovzxb.q.256(<16 x i8> %x0, <4 x i64> %x1, i8 %x2)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmovzxb.q.256(<16 x i8> %x0, <4 x i64> zeroinitializer, i8 %x2)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmovzxb.q.256(<16 x i8> %x0, <4 x i64> zeroinitializer, i8 %x2)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.pmovzxb.q.256(<16 x i8> %x0, <4 x i64> %x1, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.pmovzxb.q.256(<16 x i8> %x0, <4 x i64> %x1, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pmovzxd.q.128(<4 x i32>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pmovzxd.q.128(<4 x i32>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pmovzxd_q_128(<4 x i32> %x0, <2 x i64> %x1, i8 %x2) {		define <2 x i64>@test_int_x86_avx512_mask_pmovzxd_q_128(<4 x i32> %x0, <2 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxd_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxd_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovzxdq %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x35,0xd0]		; CHECK-NEXT: vpmovzxdq %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x35,0xd0]
; CHECK-NEXT: ## xmm2 = xmm0[0],zero,xmm0[1],zero		; CHECK-NEXT: ## xmm2 = xmm0[0],zero,xmm0[1],zero
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovzxdq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x35,0xc8]		; CHECK-NEXT: vpmovzxdq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x35,0xc8]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0],zero,xmm0[1],zero		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0],zero,xmm0[1],zero
; CHECK-NEXT: vpmovzxdq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x35,0xc0]		; CHECK-NEXT: vpmovzxdq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x35,0xc0]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0],zero,xmm0[1],zero		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0],zero,xmm0[1],zero
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc2]		; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pmovzxd.q.128(<4 x i32> %x0, <2 x i64> %x1, i8 %x2)		%res = call <2 x i64> @llvm.x86.avx512.mask.pmovzxd.q.128(<4 x i32> %x0, <2 x i64> %x1, i8 %x2)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmovzxd.q.128(<4 x i32> %x0, <2 x i64> zeroinitializer, i8 %x2)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmovzxd.q.128(<4 x i32> %x0, <2 x i64> zeroinitializer, i8 %x2)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.pmovzxd.q.128(<4 x i32> %x0, <2 x i64> %x1, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.pmovzxd.q.128(<4 x i32> %x0, <2 x i64> %x1, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pmovzxd.q.256(<4 x i32>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pmovzxd.q.256(<4 x i32>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pmovzxd_q_256(<4 x i32> %x0, <4 x i64> %x1, i8 %x2) {		define <4 x i64>@test_int_x86_avx512_mask_pmovzxd_q_256(<4 x i32> %x0, <4 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxd_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxd_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovzxdq %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x35,0xd0]		; CHECK-NEXT: vpmovzxdq %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x35,0xd0]
; CHECK-NEXT: ## ymm2 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero		; CHECK-NEXT: ## ymm2 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovzxdq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x35,0xc8]		; CHECK-NEXT: vpmovzxdq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x35,0xc8]
; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero		; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
; CHECK-NEXT: vpmovzxdq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x35,0xc0]		; CHECK-NEXT: vpmovzxdq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x35,0xc0]
; CHECK-NEXT: ## ymm0 {%k1} {z} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero		; CHECK-NEXT: ## ymm0 {%k1} {z} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc2]		; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pmovzxd.q.256(<4 x i32> %x0, <4 x i64> %x1, i8 %x2)		%res = call <4 x i64> @llvm.x86.avx512.mask.pmovzxd.q.256(<4 x i32> %x0, <4 x i64> %x1, i8 %x2)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmovzxd.q.256(<4 x i32> %x0, <4 x i64> zeroinitializer, i8 %x2)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmovzxd.q.256(<4 x i32> %x0, <4 x i64> zeroinitializer, i8 %x2)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.pmovzxd.q.256(<4 x i32> %x0, <4 x i64> %x1, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.pmovzxd.q.256(<4 x i32> %x0, <4 x i64> %x1, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pmovzxw.d.128(<8 x i16>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pmovzxw.d.128(<8 x i16>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pmovzxw_d_128(<8 x i16> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_pmovzxw_d_128(<8 x i16> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxw_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxw_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovzxwd %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x33,0xd0]		; CHECK-NEXT: vpmovzxwd %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x33,0xd0]
; CHECK-NEXT: ## xmm2 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero		; CHECK-NEXT: ## xmm2 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovzxwd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x33,0xc8]		; CHECK-NEXT: vpmovzxwd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x33,0xc8]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
; CHECK-NEXT: vpmovzxwd %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x33,0xc0]		; CHECK-NEXT: vpmovzxwd %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x33,0xc0]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc2]		; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pmovzxw.d.128(<8 x i16> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.pmovzxw.d.128(<8 x i16> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovzxw.d.128(<8 x i16> %x0, <4 x i32> zeroinitializer, i8 %x2)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovzxw.d.128(<8 x i16> %x0, <4 x i32> zeroinitializer, i8 %x2)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovzxw.d.128(<8 x i16> %x0, <4 x i32> %x1, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovzxw.d.128(<8 x i16> %x0, <4 x i32> %x1, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pmovzxw.d.256(<8 x i16>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pmovzxw.d.256(<8 x i16>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_pmovzxw_d_256(<8 x i16> %x0, <8 x i32> %x1, i8 %x2) {		define <8 x i32>@test_int_x86_avx512_mask_pmovzxw_d_256(<8 x i16> %x0, <8 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxw_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxw_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovzxwd %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x33,0xd0]		; CHECK-NEXT: vpmovzxwd %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x33,0xd0]
; CHECK-NEXT: ## ymm2 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero		; CHECK-NEXT: ## ymm2 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovzxwd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x33,0xc8]		; CHECK-NEXT: vpmovzxwd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x33,0xc8]
; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero		; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero
; CHECK-NEXT: vpmovzxwd %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x33,0xc0]		; CHECK-NEXT: vpmovzxwd %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x33,0xc0]
; CHECK-NEXT: ## ymm0 {%k1} {z} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero		; CHECK-NEXT: ## ymm0 {%k1} {z} = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc2]		; CHECK-NEXT: vpaddd %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pmovzxw.d.256(<8 x i16> %x0, <8 x i32> %x1, i8 %x2)		%res = call <8 x i32> @llvm.x86.avx512.mask.pmovzxw.d.256(<8 x i16> %x0, <8 x i32> %x1, i8 %x2)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmovzxw.d.256(<8 x i16> %x0, <8 x i32> zeroinitializer, i8 %x2)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmovzxw.d.256(<8 x i16> %x0, <8 x i32> zeroinitializer, i8 %x2)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.pmovzxw.d.256(<8 x i16> %x0, <8 x i32> %x1, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.pmovzxw.d.256(<8 x i16> %x0, <8 x i32> %x1, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pmovzxw.q.128(<8 x i16>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pmovzxw.q.128(<8 x i16>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pmovzxw_q_128(<8 x i16> %x0, <2 x i64> %x1, i8 %x2) {		define <2 x i64>@test_int_x86_avx512_mask_pmovzxw_q_128(<8 x i16> %x0, <2 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxw_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxw_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovzxwq %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x34,0xd0]		; CHECK-NEXT: vpmovzxwq %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x34,0xd0]
; CHECK-NEXT: ## xmm2 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero		; CHECK-NEXT: ## xmm2 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovzxwq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x34,0xc8]		; CHECK-NEXT: vpmovzxwq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x34,0xc8]
; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero		; CHECK-NEXT: ## xmm1 {%k1} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero
; CHECK-NEXT: vpmovzxwq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x34,0xc0]		; CHECK-NEXT: vpmovzxwq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x34,0xc0]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc2]		; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pmovzxw.q.128(<8 x i16> %x0, <2 x i64> %x1, i8 %x2)		%res = call <2 x i64> @llvm.x86.avx512.mask.pmovzxw.q.128(<8 x i16> %x0, <2 x i64> %x1, i8 %x2)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmovzxw.q.128(<8 x i16> %x0, <2 x i64> zeroinitializer, i8 %x2)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmovzxw.q.128(<8 x i16> %x0, <2 x i64> zeroinitializer, i8 %x2)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.pmovzxw.q.128(<8 x i16> %x0, <2 x i64> %x1, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.pmovzxw.q.128(<8 x i16> %x0, <2 x i64> %x1, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pmovzxw.q.256(<8 x i16>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pmovzxw.q.256(<8 x i16>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pmovzxw_q_256(<8 x i16> %x0, <4 x i64> %x1, i8 %x2) {		define <4 x i64>@test_int_x86_avx512_mask_pmovzxw_q_256(<8 x i16> %x0, <4 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxw_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovzxw_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovzxwq %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x34,0xd0]		; CHECK-NEXT: vpmovzxwq %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x34,0xd0]
; CHECK-NEXT: ## ymm2 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero		; CHECK-NEXT: ## ymm2 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovzxwq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x34,0xc8]		; CHECK-NEXT: vpmovzxwq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x34,0xc8]
; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero		; CHECK-NEXT: ## ymm1 {%k1} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero
; CHECK-NEXT: vpmovzxwq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x34,0xc0]		; CHECK-NEXT: vpmovzxwq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x34,0xc0]
; CHECK-NEXT: ## ymm0 {%k1} {z} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero		; CHECK-NEXT: ## ymm0 {%k1} {z} = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc2]		; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pmovzxw.q.256(<8 x i16> %x0, <4 x i64> %x1, i8 %x2)		%res = call <4 x i64> @llvm.x86.avx512.mask.pmovzxw.q.256(<8 x i16> %x0, <4 x i64> %x1, i8 %x2)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmovzxw.q.256(<8 x i16> %x0, <4 x i64> zeroinitializer, i8 %x2)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmovzxw.q.256(<8 x i16> %x0, <4 x i64> zeroinitializer, i8 %x2)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.pmovzxw.q.256(<8 x i16> %x0, <4 x i64> %x1, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.pmovzxw.q.256(<8 x i16> %x0, <4 x i64> %x1, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pmovsxb.d.128(<16 x i8>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pmovsxb.d.128(<16 x i8>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pmovsxb_d_128(<16 x i8> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_pmovsxb_d_128(<16 x i8> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxb_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxb_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovsxbd %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x21,0xd0]		; CHECK-NEXT: vpmovsxbd %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x21,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsxbd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x21,0xc8]		; CHECK-NEXT: vpmovsxbd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x21,0xc8]
; CHECK-NEXT: vpmovsxbd %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x21,0xc0]		; CHECK-NEXT: vpmovsxbd %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x21,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc2]		; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pmovsxb.d.128(<16 x i8> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.pmovsxb.d.128(<16 x i8> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovsxb.d.128(<16 x i8> %x0, <4 x i32> zeroinitializer, i8 %x2)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovsxb.d.128(<16 x i8> %x0, <4 x i32> zeroinitializer, i8 %x2)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovsxb.d.128(<16 x i8> %x0, <4 x i32> %x1, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovsxb.d.128(<16 x i8> %x0, <4 x i32> %x1, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pmovsxb.d.256(<16 x i8>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pmovsxb.d.256(<16 x i8>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_pmovsxb_d_256(<16 x i8> %x0, <8 x i32> %x1, i8 %x2) {		define <8 x i32>@test_int_x86_avx512_mask_pmovsxb_d_256(<16 x i8> %x0, <8 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxb_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxb_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovsxbd %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x21,0xd0]		; CHECK-NEXT: vpmovsxbd %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x21,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsxbd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x21,0xc8]		; CHECK-NEXT: vpmovsxbd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x21,0xc8]
; CHECK-NEXT: vpmovsxbd %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x21,0xc0]		; CHECK-NEXT: vpmovsxbd %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x21,0xc0]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc2]		; CHECK-NEXT: vpaddd %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pmovsxb.d.256(<16 x i8> %x0, <8 x i32> %x1, i8 %x2)		%res = call <8 x i32> @llvm.x86.avx512.mask.pmovsxb.d.256(<16 x i8> %x0, <8 x i32> %x1, i8 %x2)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmovsxb.d.256(<16 x i8> %x0, <8 x i32> zeroinitializer, i8 %x2)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmovsxb.d.256(<16 x i8> %x0, <8 x i32> zeroinitializer, i8 %x2)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.pmovsxb.d.256(<16 x i8> %x0, <8 x i32> %x1, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.pmovsxb.d.256(<16 x i8> %x0, <8 x i32> %x1, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pmovsxb.q.128(<16 x i8>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pmovsxb.q.128(<16 x i8>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pmovsxb_q_128(<16 x i8> %x0, <2 x i64> %x1, i8 %x2) {		define <2 x i64>@test_int_x86_avx512_mask_pmovsxb_q_128(<16 x i8> %x0, <2 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxb_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxb_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovsxbq %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x22,0xd0]		; CHECK-NEXT: vpmovsxbq %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x22,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsxbq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x22,0xc8]		; CHECK-NEXT: vpmovsxbq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x22,0xc8]
; CHECK-NEXT: vpmovsxbq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x22,0xc0]		; CHECK-NEXT: vpmovsxbq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x22,0xc0]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc2]		; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pmovsxb.q.128(<16 x i8> %x0, <2 x i64> %x1, i8 %x2)		%res = call <2 x i64> @llvm.x86.avx512.mask.pmovsxb.q.128(<16 x i8> %x0, <2 x i64> %x1, i8 %x2)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmovsxb.q.128(<16 x i8> %x0, <2 x i64> zeroinitializer, i8 %x2)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmovsxb.q.128(<16 x i8> %x0, <2 x i64> zeroinitializer, i8 %x2)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.pmovsxb.q.128(<16 x i8> %x0, <2 x i64> %x1, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.pmovsxb.q.128(<16 x i8> %x0, <2 x i64> %x1, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pmovsxb.q.256(<16 x i8>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pmovsxb.q.256(<16 x i8>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pmovsxb_q_256(<16 x i8> %x0, <4 x i64> %x1, i8 %x2) {		define <4 x i64>@test_int_x86_avx512_mask_pmovsxb_q_256(<16 x i8> %x0, <4 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxb_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxb_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovsxbq %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x22,0xd0]		; CHECK-NEXT: vpmovsxbq %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x22,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsxbq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x22,0xc8]		; CHECK-NEXT: vpmovsxbq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x22,0xc8]
; CHECK-NEXT: vpmovsxbq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x22,0xc0]		; CHECK-NEXT: vpmovsxbq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x22,0xc0]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc2]		; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pmovsxb.q.256(<16 x i8> %x0, <4 x i64> %x1, i8 %x2)		%res = call <4 x i64> @llvm.x86.avx512.mask.pmovsxb.q.256(<16 x i8> %x0, <4 x i64> %x1, i8 %x2)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmovsxb.q.256(<16 x i8> %x0, <4 x i64> zeroinitializer, i8 %x2)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmovsxb.q.256(<16 x i8> %x0, <4 x i64> zeroinitializer, i8 %x2)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.pmovsxb.q.256(<16 x i8> %x0, <4 x i64> %x1, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.pmovsxb.q.256(<16 x i8> %x0, <4 x i64> %x1, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pmovsxw.d.128(<8 x i16>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pmovsxw.d.128(<8 x i16>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pmovsxw_d_128(<8 x i16> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_pmovsxw_d_128(<8 x i16> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxw_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxw_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovsxwd %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x23,0xd0]		; CHECK-NEXT: vpmovsxwd %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x23,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsxwd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x23,0xc8]		; CHECK-NEXT: vpmovsxwd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x23,0xc8]
; CHECK-NEXT: vpmovsxwd %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x23,0xc0]		; CHECK-NEXT: vpmovsxwd %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x23,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc2]		; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pmovsxw.d.128(<8 x i16> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.pmovsxw.d.128(<8 x i16> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovsxw.d.128(<8 x i16> %x0, <4 x i32> zeroinitializer, i8 %x2)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovsxw.d.128(<8 x i16> %x0, <4 x i32> zeroinitializer, i8 %x2)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovsxw.d.128(<8 x i16> %x0, <4 x i32> %x1, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovsxw.d.128(<8 x i16> %x0, <4 x i32> %x1, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pmovsxw.d.256(<8 x i16>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pmovsxw.d.256(<8 x i16>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_pmovsxw_d_256(<8 x i16> %x0, <8 x i32> %x1, i8 %x2) {		define <8 x i32>@test_int_x86_avx512_mask_pmovsxw_d_256(<8 x i16> %x0, <8 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxw_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxw_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovsxwd %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x23,0xd0]		; CHECK-NEXT: vpmovsxwd %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x23,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsxwd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x23,0xc8]		; CHECK-NEXT: vpmovsxwd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x23,0xc8]
; CHECK-NEXT: vpmovsxwd %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x23,0xc0]		; CHECK-NEXT: vpmovsxwd %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x23,0xc0]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc2]		; CHECK-NEXT: vpaddd %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pmovsxw.d.256(<8 x i16> %x0, <8 x i32> %x1, i8 %x2)		%res = call <8 x i32> @llvm.x86.avx512.mask.pmovsxw.d.256(<8 x i16> %x0, <8 x i32> %x1, i8 %x2)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmovsxw.d.256(<8 x i16> %x0, <8 x i32> zeroinitializer, i8 %x2)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pmovsxw.d.256(<8 x i16> %x0, <8 x i32> zeroinitializer, i8 %x2)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.pmovsxw.d.256(<8 x i16> %x0, <8 x i32> %x1, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.pmovsxw.d.256(<8 x i16> %x0, <8 x i32> %x1, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pmovsxw.q.128(<8 x i16>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pmovsxw.q.128(<8 x i16>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pmovsxw_q_128(<8 x i16> %x0, <2 x i64> %x1, i8 %x2) {		define <2 x i64>@test_int_x86_avx512_mask_pmovsxw_q_128(<8 x i16> %x0, <2 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxw_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxw_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovsxwq %xmm0, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x24,0xd0]		; CHECK-NEXT: vpmovsxwq %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x24,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsxwq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x24,0xc8]		; CHECK-NEXT: vpmovsxwq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x24,0xc8]
; CHECK-NEXT: vpmovsxwq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x24,0xc0]		; CHECK-NEXT: vpmovsxwq %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x24,0xc0]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc2]		; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pmovsxw.q.128(<8 x i16> %x0, <2 x i64> %x1, i8 %x2)		%res = call <2 x i64> @llvm.x86.avx512.mask.pmovsxw.q.128(<8 x i16> %x0, <2 x i64> %x1, i8 %x2)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmovsxw.q.128(<8 x i16> %x0, <2 x i64> zeroinitializer, i8 %x2)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pmovsxw.q.128(<8 x i16> %x0, <2 x i64> zeroinitializer, i8 %x2)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.pmovsxw.q.128(<8 x i16> %x0, <2 x i64> %x1, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.pmovsxw.q.128(<8 x i16> %x0, <2 x i64> %x1, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pmovsxw.q.256(<8 x i16>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pmovsxw.q.256(<8 x i16>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pmovsxw_q_256(<8 x i16> %x0, <4 x i64> %x1, i8 %x2) {		define <4 x i64>@test_int_x86_avx512_mask_pmovsxw_q_256(<8 x i16> %x0, <4 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxw_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovsxw_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmovsxwq %xmm0, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x24,0xd0]		; CHECK-NEXT: vpmovsxwq %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x24,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsxwq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x24,0xc8]		; CHECK-NEXT: vpmovsxwq %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x24,0xc8]
; CHECK-NEXT: vpmovsxwq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x24,0xc0]		; CHECK-NEXT: vpmovsxwq %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x24,0xc0]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc2]		; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pmovsxw.q.256(<8 x i16> %x0, <4 x i64> %x1, i8 %x2)		%res = call <4 x i64> @llvm.x86.avx512.mask.pmovsxw.q.256(<8 x i16> %x0, <4 x i64> %x1, i8 %x2)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmovsxw.q.256(<8 x i16> %x0, <4 x i64> zeroinitializer, i8 %x2)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pmovsxw.q.256(<8 x i16> %x0, <4 x i64> zeroinitializer, i8 %x2)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.pmovsxw.q.256(<8 x i16> %x0, <4 x i64> %x1, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.pmovsxw.q.256(<8 x i16> %x0, <4 x i64> %x1, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.psra.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.psra.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_psra_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_psra_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psra_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psra_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsraq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0xe2,0xd9]		; CHECK-NEXT: vpsraq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0xe2,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsraq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xe2,0xd1]		; CHECK-NEXT: vpsraq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xe2,0xd1]
; CHECK-NEXT: vpsraq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0xe2,0xc1]		; CHECK-NEXT: vpsraq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0xe2,0xc1]
; CHECK-NEXT: vpaddq %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc0]
; CHECK-NEXT: vpaddq %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc3]		; CHECK-NEXT: vpaddq %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.psra.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.psra.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.psra.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.psra.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.psra.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.psra.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.psra.q.256(<4 x i64>, <2 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.psra.q.256(<4 x i64>, <2 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_psra_q_256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_psra_q_256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psra_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psra_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsraq %xmm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0xe2,0xd9]		; CHECK-NEXT: vpsraq %xmm1, %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0xe2,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsraq %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xe2,0xd1]		; CHECK-NEXT: vpsraq %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xe2,0xd1]
; CHECK-NEXT: vpsraq %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0xe2,0xc1]		; CHECK-NEXT: vpsraq %xmm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0xe2,0xc1]
; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc3]		; CHECK-NEXT: vpaddq %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.psra.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.psra.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.psra.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.psra.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.psra.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.psra.q.256(<4 x i64> %x0, <2 x i64> %x1, <4 x i64> %x2, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.psra.qi.128(<2 x i64>, i32, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.psra.qi.128(<2 x i64>, i32, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_psra_qi_128(<2 x i64> %x0, i32 %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_psra_qi_128(<2 x i64> %x0, i32 %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psra_qi_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psra_qi_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsraq $3, %xmm0, %xmm2 ## encoding: [0x62,0xf1,0xed,0x08,0x72,0xe0,0x03]		; CHECK-NEXT: vpsraq $3, %xmm0, %xmm2 ## encoding: [0x62,0xf1,0xed,0x08,0x72,0xe0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsraq $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x09,0x72,0xe0,0x03]		; CHECK-NEXT: vpsraq $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x09,0x72,0xe0,0x03]
; CHECK-NEXT: vpsraq $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x72,0xe0,0x03]		; CHECK-NEXT: vpsraq $3, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x72,0xe0,0x03]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc2]		; CHECK-NEXT: vpaddq %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.psra.qi.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.psra.qi.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.psra.qi.128(<2 x i64> %x0, i32 3, <2 x i64> zeroinitializer, i8 %x3)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.psra.qi.128(<2 x i64> %x0, i32 3, <2 x i64> zeroinitializer, i8 %x3)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.psra.qi.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.psra.qi.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.psra.qi.256(<4 x i64>, i32, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.psra.qi.256(<4 x i64>, i32, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_psra_qi_256(<4 x i64> %x0, i32 %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_psra_qi_256(<4 x i64> %x0, i32 %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psra_qi_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psra_qi_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsraq $3, %ymm0, %ymm2 ## encoding: [0x62,0xf1,0xed,0x28,0x72,0xe0,0x03]		; CHECK-NEXT: vpsraq $3, %ymm0, %ymm2 ## encoding: [0x62,0xf1,0xed,0x28,0x72,0xe0,0x03]
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpsraq $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x29,0x72,0xe0,0x03]		; CHECK-NEXT: vpsraq $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x29,0x72,0xe0,0x03]
; CHECK-NEXT: vpsraq $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x72,0xe0,0x03]		; CHECK-NEXT: vpsraq $3, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x72,0xe0,0x03]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc2]		; CHECK-NEXT: vpaddq %ymm2, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.psra.qi.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.psra.qi.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.psra.qi.256(<4 x i64> %x0, i32 3, <4 x i64> zeroinitializer, i8 %x3)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.psra.qi.256(<4 x i64> %x0, i32 3, <4 x i64> zeroinitializer, i8 %x3)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.psra.qi.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.psra.qi.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.psrav.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.psrav.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_psrav_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_psrav_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrav_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_psrav_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsravq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x46,0xd9]		; CHECK-NEXT: vpsravq %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x46,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsravq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x46,0xd1]		; CHECK-NEXT: vpsravq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x46,0xd1]
; CHECK-NEXT: vpsravq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x46,0xc1]		; CHECK-NEXT: vpsravq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x46,0xc1]
; CHECK-NEXT: vpaddq %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc0]
; CHECK-NEXT: vpaddq %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd4,0xc3]		; CHECK-NEXT: vpaddq %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.psrav.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.psrav.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.psrav.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.psrav.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.psrav.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.psrav.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

define <2 x i64>@test_int_x86_avx512_mask_psrav_q_128_const(i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_psrav_q_128_const(i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrav_q_128_const:		; CHECK-LABEL: test_int_x86_avx512_mask_psrav_q_128_const:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmovdqa64 {{.*#+}} xmm0 = [2,18446744073709551607]		; CHECK-NEXT: vmovdqa {{.*}}(%rip), %xmm0 ## EVEX TO VEX Compression xmm0 = [2,18446744073709551607]
; CHECK-NEXT: ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0x05,A,A,A,A]		; CHECK-NEXT: ## encoding: [0xc5,0xf9,0x6f,0x05,A,A,A,A]
; CHECK-NEXT: ## fixup A - offset: 6, value: LCPI304_0-4, kind: reloc_riprel_4byte		; CHECK-NEXT: ## fixup A - offset: 4, value: LCPI304_0-4, kind: reloc_riprel_4byte
; CHECK-NEXT: vpsravq {{.*}}(%rip), %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x46,0x05,A,A,A,A]		; CHECK-NEXT: vpsravq {{.*}}(%rip), %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x46,0x05,A,A,A,A]
; CHECK-NEXT: ## fixup A - offset: 6, value: LCPI304_1-4, kind: reloc_riprel_4byte		; CHECK-NEXT: ## fixup A - offset: 6, value: LCPI304_1-4, kind: reloc_riprel_4byte
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.psrav.q.128(<2 x i64> <i64 2, i64 -9>, <2 x i64> <i64 1, i64 90>, <2 x i64> zeroinitializer, i8 -1)		%res = call <2 x i64> @llvm.x86.avx512.mask.psrav.q.128(<2 x i64> <i64 2, i64 -9>, <2 x i64> <i64 1, i64 90>, <2 x i64> zeroinitializer, i8 -1)
ret <2 x i64> %res		ret <2 x i64> %res
}		}

declare <4 x i64> @llvm.x86.avx512.mask.psrav.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.psrav.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_psrav_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_psrav_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_psrav_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_psrav_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsravq %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0xfd,0x28,0x46,0xd9]		; CHECK-NEXT: vpsravq %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0xfd,0x28,0x46,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsravq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x46,0xd1]		; CHECK-NEXT: vpsravq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x46,0xd1]
; CHECK-NEXT: vpsravq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x46,0xc1]		; CHECK-NEXT: vpsravq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x46,0xc1]
; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xd4,0xc3]		; CHECK-NEXT: vpaddq %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.psrav.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.psrav.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.psrav.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.psrav.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.psrav.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.psrav.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <2 x double> @llvm.x86.avx512.mask.cvtdq2pd.128(<4 x i32>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.cvtdq2pd.128(<4 x i32>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_cvt_dq2pd_128(<4 x i32> %x0, <2 x double> %x1, i8 %x2) {		define <2 x double>@test_int_x86_avx512_mask_cvt_dq2pd_128(<4 x i32> %x0, <2 x double> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_dq2pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_dq2pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vcvtdq2pd %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x7e,0x08,0xe6,0xd0]		; CHECK-NEXT: vcvtdq2pd %xmm0, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0xe6,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtdq2pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0xe6,0xc8]		; CHECK-NEXT: vcvtdq2pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0xe6,0xc8]
; CHECK-NEXT: vaddpd %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc2]		; CHECK-NEXT: vaddpd %xmm2, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.cvtdq2pd.128(<4 x i32> %x0, <2 x double> %x1, i8 %x2)		%res = call <2 x double> @llvm.x86.avx512.mask.cvtdq2pd.128(<4 x i32> %x0, <2 x double> %x1, i8 %x2)
%res1 = call <2 x double> @llvm.x86.avx512.mask.cvtdq2pd.128(<4 x i32> %x0, <2 x double> %x1, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.cvtdq2pd.128(<4 x i32> %x0, <2 x double> %x1, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask.cvtdq2pd.256(<4 x i32>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.cvtdq2pd.256(<4 x i32>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_cvt_dq2pd_256(<4 x i32> %x0, <4 x double> %x1, i8 %x2) {		define <4 x double>@test_int_x86_avx512_mask_cvt_dq2pd_256(<4 x i32> %x0, <4 x double> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_dq2pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_dq2pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vcvtdq2pd %xmm0, %ymm2 ## encoding: [0x62,0xf1,0x7e,0x28,0xe6,0xd0]		; CHECK-NEXT: vcvtdq2pd %xmm0, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0xe6,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtdq2pd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0xe6,0xc8]		; CHECK-NEXT: vcvtdq2pd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0xe6,0xc8]
; CHECK-NEXT: vaddpd %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc2]		; CHECK-NEXT: vaddpd %ymm2, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.cvtdq2pd.256(<4 x i32> %x0, <4 x double> %x1, i8 %x2)		%res = call <4 x double> @llvm.x86.avx512.mask.cvtdq2pd.256(<4 x i32> %x0, <4 x double> %x1, i8 %x2)
%res1 = call <4 x double> @llvm.x86.avx512.mask.cvtdq2pd.256(<4 x i32> %x0, <4 x double> %x1, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.cvtdq2pd.256(<4 x i32> %x0, <4 x double> %x1, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <2 x double> @llvm.x86.avx512.mask.cvtudq2pd.128(<4 x i32>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.cvtudq2pd.128(<4 x i32>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_cvt_udq2pd_128(<4 x i32> %x0, <2 x double> %x1, i8 %x2) {		define <2 x double>@test_int_x86_avx512_mask_cvt_udq2pd_128(<4 x i32> %x0, <2 x double> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_udq2pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_udq2pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vcvtudq2pd %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x7e,0x08,0x7a,0xd0]		; CHECK-NEXT: vcvtudq2pd %xmm0, %xmm2 ## encoding: [0x62,0xf1,0x7e,0x08,0x7a,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtudq2pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x7a,0xc8]		; CHECK-NEXT: vcvtudq2pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x7a,0xc8]
; CHECK-NEXT: vaddpd %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc2]		; CHECK-NEXT: vaddpd %xmm2, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.cvtudq2pd.128(<4 x i32> %x0, <2 x double> %x1, i8 %x2)		%res = call <2 x double> @llvm.x86.avx512.mask.cvtudq2pd.128(<4 x i32> %x0, <2 x double> %x1, i8 %x2)
%res1 = call <2 x double> @llvm.x86.avx512.mask.cvtudq2pd.128(<4 x i32> %x0, <2 x double> %x1, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.cvtudq2pd.128(<4 x i32> %x0, <2 x double> %x1, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask.cvtudq2pd.256(<4 x i32>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.cvtudq2pd.256(<4 x i32>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_cvt_udq2pd_256(<4 x i32> %x0, <4 x double> %x1, i8 %x2) {		define <4 x double>@test_int_x86_avx512_mask_cvt_udq2pd_256(<4 x i32> %x0, <4 x double> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_udq2pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_udq2pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vcvtudq2pd %xmm0, %ymm2 ## encoding: [0x62,0xf1,0x7e,0x28,0x7a,0xd0]		; CHECK-NEXT: vcvtudq2pd %xmm0, %ymm2 ## encoding: [0x62,0xf1,0x7e,0x28,0x7a,0xd0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtudq2pd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x7a,0xc8]		; CHECK-NEXT: vcvtudq2pd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x7a,0xc8]
; CHECK-NEXT: vaddpd %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc2]		; CHECK-NEXT: vaddpd %ymm2, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.cvtudq2pd.256(<4 x i32> %x0, <4 x double> %x1, i8 %x2)		%res = call <4 x double> @llvm.x86.avx512.mask.cvtudq2pd.256(<4 x i32> %x0, <4 x double> %x1, i8 %x2)
%res1 = call <4 x double> @llvm.x86.avx512.mask.cvtudq2pd.256(<4 x i32> %x0, <4 x double> %x1, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.cvtudq2pd.256(<4 x i32> %x0, <4 x double> %x1, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.valign.d.128(<4 x i32>, <4 x i32>, i32, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.valign.d.128(<4 x i32>, <4 x i32>, i32, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_valign_d_128(<4 x i32> %x0, <4 x i32> %x1,<4 x i32> %x3, i8 %x4) {		define <4 x i32>@test_int_x86_avx512_mask_valign_d_128(<4 x i32> %x0, <4 x i32> %x1,<4 x i32> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_valign_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_valign_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: valignd $2, %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf3,0x7d,0x08,0x03,0xd9,0x02]		; CHECK-NEXT: valignd $2, %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf3,0x7d,0x08,0x03,0xd9,0x02]
; CHECK-NEXT: ## xmm3 = xmm1[2,3],xmm0[0,1]		; CHECK-NEXT: ## xmm3 = xmm1[2,3],xmm0[0,1]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: valignd $2, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x03,0xd1,0x02]		; CHECK-NEXT: valignd $2, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x03,0xd1,0x02]
; CHECK-NEXT: ## xmm2 {%k1} = xmm1[2,3],xmm0[0,1]		; CHECK-NEXT: ## xmm2 {%k1} = xmm1[2,3],xmm0[0,1]
; CHECK-NEXT: valignd $2, %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0x89,0x03,0xc1,0x02]		; CHECK-NEXT: valignd $2, %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0x89,0x03,0xc1,0x02]
; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm1[2,3],xmm0[0,1]		; CHECK-NEXT: ## xmm0 {%k1} {z} = xmm1[2,3],xmm0[0,1]
; CHECK-NEXT: vpaddd %xmm3, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xcb]		; CHECK-NEXT: vpaddd %xmm3, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xcb]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.valign.d.128(<4 x i32> %x0, <4 x i32> %x1, i32 2, <4 x i32> %x3, i8 %x4)		%res = call <4 x i32> @llvm.x86.avx512.mask.valign.d.128(<4 x i32> %x0, <4 x i32> %x1, i32 2, <4 x i32> %x3, i8 %x4)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.valign.d.128(<4 x i32> %x0, <4 x i32> %x1, i32 2, <4 x i32> %x3, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.valign.d.128(<4 x i32> %x0, <4 x i32> %x1, i32 2, <4 x i32> %x3, i8 -1)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.valign.d.128(<4 x i32> %x0, <4 x i32> %x1, i32 2, <4 x i32> zeroinitializer,i8 %x4)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.valign.d.128(<4 x i32> %x0, <4 x i32> %x1, i32 2, <4 x i32> zeroinitializer,i8 %x4)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.valign.d.256(<8 x i32>, <8 x i32>, i32, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.valign.d.256(<8 x i32>, <8 x i32>, i32, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_valign_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x3, i8 %x4) {		define <8 x i32>@test_int_x86_avx512_mask_valign_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_valign_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_valign_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: valignd $6, %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf3,0x7d,0x28,0x03,0xd9,0x06]		; CHECK-NEXT: valignd $6, %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf3,0x7d,0x28,0x03,0xd9,0x06]
; CHECK-NEXT: ## ymm3 = ymm1[6,7],ymm0[0,1,2,3,4,5]		; CHECK-NEXT: ## ymm3 = ymm1[6,7],ymm0[0,1,2,3,4,5]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: valignd $6, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x03,0xd1,0x06]		; CHECK-NEXT: valignd $6, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x03,0xd1,0x06]
; CHECK-NEXT: ## ymm2 {%k1} = ymm1[6,7],ymm0[0,1,2,3,4,5]		; CHECK-NEXT: ## ymm2 {%k1} = ymm1[6,7],ymm0[0,1,2,3,4,5]
; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc3]		; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.valign.d.256(<8 x i32> %x0, <8 x i32> %x1, i32 6, <8 x i32> %x3, i8 %x4)		%res = call <8 x i32> @llvm.x86.avx512.mask.valign.d.256(<8 x i32> %x0, <8 x i32> %x1, i32 6, <8 x i32> %x3, i8 %x4)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.valign.d.256(<8 x i32> %x0, <8 x i32> %x1, i32 6, <8 x i32> %x3, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.valign.d.256(<8 x i32> %x0, <8 x i32> %x1, i32 6, <8 x i32> %x3, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <2 x i64> @llvm.x86.avx512.mask.valign.q.128(<2 x i64>, <2 x i64>, i32, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.valign.q.128(<2 x i64>, <2 x i64>, i32, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_valign_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x3, i8 %x4) {		define <2 x i64>@test_int_x86_avx512_mask_valign_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_valign_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_valign_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: valignq $1, %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf3,0xfd,0x08,0x03,0xd9,0x01]		; CHECK-NEXT: valignq $1, %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf3,0xfd,0x08,0x03,0xd9,0x01]
; CHECK-NEXT: ## xmm3 = xmm1[1],xmm0[0]		; CHECK-NEXT: ## xmm3 = xmm1[1],xmm0[0]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: valignq $1, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x09,0x03,0xd1,0x01]		; CHECK-NEXT: valignq $1, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x09,0x03,0xd1,0x01]
; CHECK-NEXT: ## xmm2 {%k1} = xmm1[1],xmm0[0]		; CHECK-NEXT: ## xmm2 {%k1} = xmm1[1],xmm0[0]
; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc3]		; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.valign.q.128(<2 x i64> %x0, <2 x i64> %x1, i32 1, <2 x i64> %x3, i8 %x4)		%res = call <2 x i64> @llvm.x86.avx512.mask.valign.q.128(<2 x i64> %x0, <2 x i64> %x1, i32 1, <2 x i64> %x3, i8 %x4)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.valign.q.128(<2 x i64> %x0, <2 x i64> %x1, i32 1, <2 x i64> %x3, i8 -1)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.valign.q.128(<2 x i64> %x0, <2 x i64> %x1, i32 1, <2 x i64> %x3, i8 -1)
%res2 = add <2 x i64> %res, %res1		%res2 = add <2 x i64> %res, %res1
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}

declare <4 x i64> @llvm.x86.avx512.mask.valign.q.256(<4 x i64>, <4 x i64>, i32, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.valign.q.256(<4 x i64>, <4 x i64>, i32, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_valign_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x3, i8 %x4) {		define <4 x i64>@test_int_x86_avx512_mask_valign_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_valign_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_valign_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: valignq $3, %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf3,0xfd,0x28,0x03,0xd9,0x03]		; CHECK-NEXT: valignq $3, %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf3,0xfd,0x28,0x03,0xd9,0x03]
; CHECK-NEXT: ## ymm3 = ymm1[3],ymm0[0,1,2]		; CHECK-NEXT: ## ymm3 = ymm1[3],ymm0[0,1,2]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: valignq $3, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x03,0xd1,0x03]		; CHECK-NEXT: valignq $3, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x03,0xd1,0x03]
; CHECK-NEXT: ## ymm2 {%k1} = ymm1[3],ymm0[0,1,2]		; CHECK-NEXT: ## ymm2 {%k1} = ymm1[3],ymm0[0,1,2]
; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc3]		; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.valign.q.256(<4 x i64> %x0, <4 x i64> %x1, i32 3, <4 x i64> %x3, i8 %x4)		%res = call <4 x i64> @llvm.x86.avx512.mask.valign.q.256(<4 x i64> %x0, <4 x i64> %x1, i32 3, <4 x i64> %x3, i8 %x4)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.valign.q.256(<4 x i64> %x0, <4 x i64> %x1, i32 3, <4 x i64> %x3, i8 -1)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.valign.q.256(<4 x i64> %x0, <4 x i64> %x1, i32 3, <4 x i64> %x3, i8 -1)
%res2 = add <4 x i64> %res, %res1		%res2 = add <4 x i64> %res, %res1
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask.vpermilvar.pd.256(<4 x double>, <4 x i64>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.vpermilvar.pd.256(<4 x double>, <4 x i64>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_vpermilvar_pd_256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_vpermilvar_pd_256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermilvar_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermilvar_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpermilpd %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0xfd,0x28,0x0d,0xd9]		; CHECK-NEXT: vpermilpd %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x0d,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermilpd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x0d,0xd1]		; CHECK-NEXT: vpermilpd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x0d,0xd1]
; CHECK-NEXT: vpermilpd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x0d,0xc1]		; CHECK-NEXT: vpermilpd %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x0d,0xc1]
; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x58,0xc0]
; CHECK-NEXT: vaddpd %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.vpermilvar.pd.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.vpermilvar.pd.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.vpermilvar.pd.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> zeroinitializer, i8 %x3)		%res1 = call <4 x double> @llvm.x86.avx512.mask.vpermilvar.pd.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> zeroinitializer, i8 %x3)
%res2 = call <4 x double> @llvm.x86.avx512.mask.vpermilvar.pd.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 -1)		%res2 = call <4 x double> @llvm.x86.avx512.mask.vpermilvar.pd.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 -1)
%res3 = fadd <4 x double> %res, %res1		%res3 = fadd <4 x double> %res, %res1
%res4 = fadd <4 x double> %res2, %res3		%res4 = fadd <4 x double> %res2, %res3
ret <4 x double> %res4		ret <4 x double> %res4
}		}

declare <2 x double> @llvm.x86.avx512.mask.vpermilvar.pd.128(<2 x double>, <2 x i64>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.vpermilvar.pd.128(<2 x double>, <2 x i64>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_vpermilvar_pd_128(<2 x double> %x0, <2 x i64> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask_vpermilvar_pd_128(<2 x double> %x0, <2 x i64> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermilvar_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermilvar_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpermilpd %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0xfd,0x08,0x0d,0xd9]		; CHECK-NEXT: vpermilpd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x0d,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermilpd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x0d,0xd1]		; CHECK-NEXT: vpermilpd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x0d,0xd1]
; CHECK-NEXT: vpermilpd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x0d,0xc1]		; CHECK-NEXT: vpermilpd %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x0d,0xc1]
; CHECK-NEXT: vaddpd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0x58,0xc0]		; CHECK-NEXT: vaddpd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x58,0xc0]
; CHECK-NEXT: vaddpd %xmm3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x58,0xc3]		; CHECK-NEXT: vaddpd %xmm3, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.vpermilvar.pd.128(<2 x double> %x0, <2 x i64> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask.vpermilvar.pd.128(<2 x double> %x0, <2 x i64> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask.vpermilvar.pd.128(<2 x double> %x0, <2 x i64> %x1, <2 x double> zeroinitializer, i8 %x3)		%res1 = call <2 x double> @llvm.x86.avx512.mask.vpermilvar.pd.128(<2 x double> %x0, <2 x i64> %x1, <2 x double> zeroinitializer, i8 %x3)
%res2 = call <2 x double> @llvm.x86.avx512.mask.vpermilvar.pd.128(<2 x double> %x0, <2 x i64> %x1, <2 x double> %x2, i8 -1)		%res2 = call <2 x double> @llvm.x86.avx512.mask.vpermilvar.pd.128(<2 x double> %x0, <2 x i64> %x1, <2 x double> %x2, i8 -1)
%res3 = fadd <2 x double> %res, %res1		%res3 = fadd <2 x double> %res, %res1
%res4 = fadd <2 x double> %res3, %res2		%res4 = fadd <2 x double> %res3, %res2
ret <2 x double> %res4		ret <2 x double> %res4
}		}

declare <8 x float> @llvm.x86.avx512.mask.vpermilvar.ps.256(<8 x float>, <8 x i32>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.vpermilvar.ps.256(<8 x float>, <8 x i32>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_vpermilvar_ps_256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_vpermilvar_ps_256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermilvar_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermilvar_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpermilps %ymm1, %ymm0, %ymm3 ## encoding: [0x62,0xf2,0x7d,0x28,0x0c,0xd9]		; CHECK-NEXT: vpermilps %ymm1, %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x0c,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermilps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x0c,0xd1]		; CHECK-NEXT: vpermilps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x0c,0xd1]
; CHECK-NEXT: vpermilps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x0c,0xc1]		; CHECK-NEXT: vpermilps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x0c,0xc1]
; CHECK-NEXT: vaddps %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6c,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xec,0x58,0xc0]
; CHECK-NEXT: vaddps %ymm3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x58,0xc3]		; CHECK-NEXT: vaddps %ymm3, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x58,0xc3]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.vpermilvar.ps.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.vpermilvar.ps.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.vpermilvar.ps.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> zeroinitializer, i8 %x3)		%res1 = call <8 x float> @llvm.x86.avx512.mask.vpermilvar.ps.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> zeroinitializer, i8 %x3)
%res2 = call <8 x float> @llvm.x86.avx512.mask.vpermilvar.ps.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 -1)		%res2 = call <8 x float> @llvm.x86.avx512.mask.vpermilvar.ps.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 -1)
%res3 = fadd <8 x float> %res, %res1		%res3 = fadd <8 x float> %res, %res1
%res4 = fadd <8 x float> %res3, %res2		%res4 = fadd <8 x float> %res3, %res2
ret <8 x float> %res4		ret <8 x float> %res4
}		}

declare <4 x float> @llvm.x86.avx512.mask.vpermilvar.ps.128(<4 x float>, <4 x i32>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.vpermilvar.ps.128(<4 x float>, <4 x i32>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_vpermilvar_ps_128(<4 x float> %x0, <4 x i32> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_vpermilvar_ps_128(<4 x float> %x0, <4 x i32> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermilvar_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermilvar_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpermilps %xmm1, %xmm0, %xmm3 ## encoding: [0x62,0xf2,0x7d,0x08,0x0c,0xd9]		; CHECK-NEXT: vpermilps %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x0c,0xd9]
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermilps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x0c,0xd1]		; CHECK-NEXT: vpermilps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x0c,0xd1]
; CHECK-NEXT: vpermilps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x0c,0xc1]		; CHECK-NEXT: vpermilps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x0c,0xc1]
; CHECK-NEXT: vaddps %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6c,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe8,0x58,0xc0]
; CHECK-NEXT: vaddps %xmm0, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vpermilvar.ps.128(<4 x float> %x0, <4 x i32> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.vpermilvar.ps.128(<4 x float> %x0, <4 x i32> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.vpermilvar.ps.128(<4 x float> %x0, <4 x i32> %x1, <4 x float> zeroinitializer, i8 %x3)		%res1 = call <4 x float> @llvm.x86.avx512.mask.vpermilvar.ps.128(<4 x float> %x0, <4 x i32> %x1, <4 x float> zeroinitializer, i8 %x3)
%res2 = call <4 x float> @llvm.x86.avx512.mask.vpermilvar.ps.128(<4 x float> %x0, <4 x i32> %x1, <4 x float> %x2, i8 -1)		%res2 = call <4 x float> @llvm.x86.avx512.mask.vpermilvar.ps.128(<4 x float> %x0, <4 x i32> %x1, <4 x float> %x2, i8 -1)
%res3 = fadd <4 x float> %res, %res1		%res3 = fadd <4 x float> %res, %res1
%res4 = fadd <4 x float> %res2, %res3		%res4 = fadd <4 x float> %res2, %res3
ret <4 x float> %res4		ret <4 x float> %res4
}		}
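
The CHECK lines above follow a visible pattern: forms that carry a {%k1} mask or {z} zeroing keep their 0x62 EVEX prefix, while unmasked xmm/ymm forms are re-encoded with a 0xc4 or 0xc5 VEX prefix. The Python sketch below reads the relevant EVEX prefix fields from an encoding and reports whether the prefix alone would rule out a VEX form. It is an illustration under assumptions, not the pass from this patch: the function name is hypothetical, the bit positions follow the published EVEX prefix layout, and the check is necessary but not sufficient, because the patch decides through an opcode table (valignq passes this check but stays EVEX in the tests above).

```python
def evex_prefix_allows_vex(encoding):
    """Return True if no EVEX-only prefix feature is used (hypothetical helper).

    `encoding` is the list of instruction bytes as printed by
    llc --show-mc-encoding. Layout assumed: 0x62, P0, P1, P2, opcode, ModRM.
    """
    if len(encoding) < 6 or encoding[0] != 0x62:
        return False  # not EVEX-encoded, nothing to compress
    p0, p2, modrm = encoding[1], encoding[3], encoding[5]

    # Register-extension bits are stored inverted: a clear bit selects 16..31.
    high_reg = ((p0 >> 4) & 1) == 0          # R': ModRM.reg register
    high_vvvv = ((p2 >> 3) & 1) == 0         # V': vvvv register
    reg_form = (modrm >> 6) == 3
    high_rm = reg_form and ((p0 >> 6) & 1) == 0  # X extends ModRM.rm on reg-reg forms

    masked = (p2 & 0b111) != 0               # aaa: opmask k1..k7
    zeroing = ((p2 >> 7) & 1) == 1           # z: zeroing-masking
    bcast_or_rc = ((p2 >> 4) & 1) == 1       # b: broadcast / rounding control
    is_512 = ((p2 >> 5) & 0b11) == 0b10      # L'L: 512-bit vector length

    return not (high_reg or high_vvvv or high_rm or masked
                or zeroing or bcast_or_rc or is_512)


# Encodings copied from the left-hand (pre-patch) CHECK lines above.
vaddps_xmm = [0x62, 0xf1, 0x6c, 0x08, 0x58, 0xc0]
vpermilps_masked = [0x62, 0xf2, 0x7d, 0x09, 0x0c, 0xd1]
vpermilps_zeroing = [0x62, 0xf2, 0x7d, 0x89, 0x0c, 0xc1]
print(evex_prefix_allows_vex(vaddps_xmm))
print(evex_prefix_allows_vex(vpermilps_masked))
print(evex_prefix_allows_vex(vpermilps_zeroing))
```

A prefix-only test like this can only filter candidates; the opcode still has to be looked up to learn whether a VEX counterpart exists and which VEX.W value it expects.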

llvm/trunk/test/CodeGen/X86/avx512vl-intrinsics.ll

[828 lines not shown]

declare <8 x double> @llvm.x86.avx512.mask.compress.pd.512(<8 x double> %data, <8 x double> %src0, i8 %mask)		declare <8 x double> @llvm.x86.avx512.mask.compress.pd.512(<8 x double> %data, <8 x double> %src0, i8 %mask)

define <4 x double> @compr5(<4 x double> %data, <4 x double> %src0, i8 %mask) {		define <4 x double> @compr5(<4 x double> %data, <4 x double> %src0, i8 %mask) {
; CHECK-LABEL: compr5:		; CHECK-LABEL: compr5:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcompresspd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x8a,0xc1]		; CHECK-NEXT: vcompresspd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x8a,0xc1]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.compress.pd.256( <4 x double> %data, <4 x double> %src0, i8 %mask)		%res = call <4 x double> @llvm.x86.avx512.mask.compress.pd.256( <4 x double> %data, <4 x double> %src0, i8 %mask)
ret <4 x double> %res		ret <4 x double> %res
}		}

declare <4 x double> @llvm.x86.avx512.mask.compress.pd.256(<4 x double> %data, <4 x double> %src0, i8 %mask)		declare <4 x double> @llvm.x86.avx512.mask.compress.pd.256(<4 x double> %data, <4 x double> %src0, i8 %mask)

define <4 x float> @compr6(<4 x float> %data, i8 %mask) {		define <4 x float> @compr6(<4 x float> %data, i8 %mask) {
[51 lines not shown]
@xmm = common global <4 x i32> zeroinitializer, align 16		@xmm = common global <4 x i32> zeroinitializer, align 16
@k8 = common global i8 0, align 1		@k8 = common global i8 0, align 1

define i32 @compr11() {		define i32 @compr11() {
; CHECK-LABEL: compr11:		; CHECK-LABEL: compr11:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: movq _xmm@{{.*}}(%rip), %rax ## encoding: [0x48,0x8b,0x05,A,A,A,A]		; CHECK-NEXT: movq _xmm@{{.*}}(%rip), %rax ## encoding: [0x48,0x8b,0x05,A,A,A,A]
; CHECK-NEXT: ## fixup A - offset: 3, value: _xmm@GOTPCREL-4, kind: reloc_riprel_4byte_movq_load		; CHECK-NEXT: ## fixup A - offset: 3, value: _xmm@GOTPCREL-4, kind: reloc_riprel_4byte_movq_load
; CHECK-NEXT: vmovdqa32 (%rax), %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6f,0x00]		; CHECK-NEXT: vmovdqa (%rax), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0x00]
; CHECK-NEXT: movq _k8@{{.*}}(%rip), %rax ## encoding: [0x48,0x8b,0x05,A,A,A,A]		; CHECK-NEXT: movq _k8@{{.*}}(%rip), %rax ## encoding: [0x48,0x8b,0x05,A,A,A,A]
; CHECK-NEXT: ## fixup A - offset: 3, value: _k8@GOTPCREL-4, kind: reloc_riprel_4byte_movq_load		; CHECK-NEXT: ## fixup A - offset: 3, value: _k8@GOTPCREL-4, kind: reloc_riprel_4byte_movq_load
; CHECK-NEXT: movzbl (%rax), %eax ## encoding: [0x0f,0xb6,0x00]		; CHECK-NEXT: movzbl (%rax), %eax ## encoding: [0x0f,0xb6,0x00]
; CHECK-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]		; CHECK-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]
; CHECK-NEXT: vpcompressd %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x8b,0xc0]		; CHECK-NEXT: vpcompressd %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x8b,0xc0]
; CHECK-NEXT: vpxord %xmm1, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xef,0xc9]		; CHECK-NEXT: vpxor %xmm1, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xef,0xc9]
; CHECK-NEXT: vmovdqa32 %xmm0, -{{[0-9]+}}(%rsp) ## encoding: [0x62,0xf1,0x7d,0x08,0x7f,0x84,0x24,0xd8,0xff,0xff,0xff]		; CHECK-NEXT: vmovdqa %xmm0, -{{[0-9]+}}(%rsp) ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x7f,0x44,0x24,0xd8]
; CHECK-NEXT: vmovdqa32 %xmm1, -{{[0-9]+}}(%rsp) ## encoding: [0x62,0xf1,0x7d,0x08,0x7f,0x8c,0x24,0xe8,0xff,0xff,0xff]		; CHECK-NEXT: vmovdqa %xmm1, -{{[0-9]+}}(%rsp) ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x7f,0x4c,0x24,0xe8]
; CHECK-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]		; CHECK-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
entry:		entry:
%.compoundliteral = alloca <2 x i64>, align 16		%.compoundliteral = alloca <2 x i64>, align 16
%res = alloca <4 x i32>, align 16		%res = alloca <4 x i32>, align 16
%a0 = load <4 x i32>, <4 x i32>* @xmm, align 16		%a0 = load <4 x i32>, <4 x i32>* @xmm, align 16
%a2 = load i8, i8* @k8, align 1		%a2 = load i8, i8* @k8, align 1
%a21 = call <4 x i32> @llvm.x86.avx512.mask.compress.d.128(<4 x i32> %a0, <4 x i32> zeroinitializer, i8 %a2) #2		%a21 = call <4 x i32> @llvm.x86.avx512.mask.compress.d.128(<4 x i32> %a0, <4 x i32> zeroinitializer, i8 %a2) #2
[54 lines not shown]
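
In compr11 above, the two stack stores shrink from 11 bytes to 6, which is more than the shorter prefix alone explains. A plausible reading is the EVEX disp8*N rule: an 8-bit displacement is stored divided by N (16 for a full 128-bit access), so offsets such as -40 and -24, which are not multiples of 16, fall back to a 32-bit displacement, whereas VEX stores the unscaled 8-bit value. The sketch below reproduces that arithmetic. The helper names are hypothetical, and it assumes a nonzero displacement and a caller-supplied N.

```python
def evex_disp_bytes(disp, n):
    """Displacement size in bytes under EVEX disp8*N compression."""
    # disp8 is usable only if disp is a multiple of N and disp/N fits in int8.
    if disp % n == 0 and -128 <= disp // n <= 127:
        return 1
    return 4


def vex_disp_bytes(disp):
    """Displacement size in bytes for VEX (plain, unscaled disp8 or disp32)."""
    return 1 if -128 <= disp <= 127 else 4


def store_size(prefix_bytes, disp_bytes):
    """Total length: prefix + opcode + ModRM + SIB + displacement."""
    return prefix_bytes + 1 + 1 + 1 + disp_bytes


# vmovdqa32 %xmm0, -40(%rsp): 4-byte EVEX prefix, N = 16
evex_len = store_size(4, evex_disp_bytes(-40, 16))
# vmovdqa %xmm0, -40(%rsp): 2-byte VEX prefix
vex_len = store_size(2, vex_disp_bytes(-40))
print(evex_len, vex_len)
```

The same rule accounts for the vmovss example in the summary: offset 32 with N = 4 is stored as 0x08 under EVEX and as 0x20 under VEX, so there only the prefix length differs.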

declare <8 x double> @llvm.x86.avx512.mask.expand.pd.512(<8 x double> %data, <8 x double> %src0, i8 %mask)		declare <8 x double> @llvm.x86.avx512.mask.expand.pd.512(<8 x double> %data, <8 x double> %src0, i8 %mask)

define <4 x double> @expand5(<4 x double> %data, <4 x double> %src0, i8 %mask) {		define <4 x double> @expand5(<4 x double> %data, <4 x double> %src0, i8 %mask) {
; CHECK-LABEL: expand5:		; CHECK-LABEL: expand5:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vexpandpd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x88,0xc8]		; CHECK-NEXT: vexpandpd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x88,0xc8]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.expand.pd.256( <4 x double> %data, <4 x double> %src0, i8 %mask)		%res = call <4 x double> @llvm.x86.avx512.mask.expand.pd.256( <4 x double> %data, <4 x double> %src0, i8 %mask)
ret <4 x double> %res		ret <4 x double> %res
}		}

declare <4 x double> @llvm.x86.avx512.mask.expand.pd.256(<4 x double> %data, <4 x double> %src0, i8 %mask)		declare <4 x double> @llvm.x86.avx512.mask.expand.pd.256(<4 x double> %data, <4 x double> %src0, i8 %mask)

define <4 x float> @expand6(<4 x float> %data, i8 %mask) {		define <4 x float> @expand6(<4 x float> %data, i8 %mask) {
[70 lines not shown]	; CHECK-NEXT: retq ## encoding: [0xc3]
ret <8 x i64> %res		ret <8 x i64> %res
}		}

declare <8 x i64> @llvm.x86.avx512.mask.expand.q.512(<8 x i64> , <8 x i64>, i8)		declare <8 x i64> @llvm.x86.avx512.mask.expand.q.512(<8 x i64> , <8 x i64>, i8)

define < 2 x i64> @test_mask_mul_epi32_rr_128(< 4 x i32> %a, < 4 x i32> %b) {		define < 2 x i64> @test_mask_mul_epi32_rr_128(< 4 x i32> %a, < 4 x i32> %b) {
; CHECK-LABEL: test_mask_mul_epi32_rr_128:		; CHECK-LABEL: test_mask_mul_epi32_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmuldq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x28,0xc1]		; CHECK-NEXT: vpmuldq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> zeroinitializer, i8 -1)		%res = call < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> zeroinitializer, i8 -1)
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

define < 2 x i64> @test_mask_mul_epi32_rrk_128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask) {		define < 2 x i64> @test_mask_mul_epi32_rrk_128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epi32_rrk_128:		; CHECK-LABEL: test_mask_mul_epi32_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmuldq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x28,0xd1]		; CHECK-NEXT: vpmuldq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x28,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask)		%res = call < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask)
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

define < 2 x i64> @test_mask_mul_epi32_rrkz_128(< 4 x i32> %a, < 4 x i32> %b, i8 %mask) {		define < 2 x i64> @test_mask_mul_epi32_rrkz_128(< 4 x i32> %a, < 4 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epi32_rrkz_128:		; CHECK-LABEL: test_mask_mul_epi32_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmuldq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x28,0xc1]		; CHECK-NEXT: vpmuldq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> zeroinitializer, i8 %mask)		%res = call < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> zeroinitializer, i8 %mask)
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

define < 2 x i64> @test_mask_mul_epi32_rm_128(< 4 x i32> %a, < 4 x i32>* %ptr_b) {		define < 2 x i64> @test_mask_mul_epi32_rm_128(< 4 x i32> %a, < 4 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_mul_epi32_rm_128:		; CHECK-LABEL: test_mask_mul_epi32_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmuldq (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x28,0x07]		; CHECK-NEXT: vpmuldq (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x28,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load < 4 x i32>, < 4 x i32>* %ptr_b		%b = load < 4 x i32>, < 4 x i32>* %ptr_b
%res = call < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> zeroinitializer, i8 -1)		%res = call < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> zeroinitializer, i8 -1)
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

define < 2 x i64> @test_mask_mul_epi32_rmk_128(< 4 x i32> %a, < 4 x i32>* %ptr_b, < 2 x i64> %passThru, i8 %mask) {		define < 2 x i64> @test_mask_mul_epi32_rmk_128(< 4 x i32> %a, < 4 x i32>* %ptr_b, < 2 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epi32_rmk_128:		; CHECK-LABEL: test_mask_mul_epi32_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpmuldq (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x28,0x0f]		; CHECK-NEXT: vpmuldq (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x28,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load < 4 x i32>, < 4 x i32>* %ptr_b		%b = load < 4 x i32>, < 4 x i32>* %ptr_b
%res = call < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask)		%res = call < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask)
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

define < 2 x i64> @test_mask_mul_epi32_rmkz_128(< 4 x i32> %a, < 4 x i32>* %ptr_b, i8 %mask) {		define < 2 x i64> @test_mask_mul_epi32_rmkz_128(< 4 x i32> %a, < 4 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epi32_rmkz_128:		; CHECK-LABEL: test_mask_mul_epi32_rmkz_128:
[19 lines not shown]	; CHECK-NEXT: retq ## encoding: [0xc3]
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

define < 2 x i64> @test_mask_mul_epi32_rmbk_128(< 4 x i32> %a, i64* %ptr_b, < 2 x i64> %passThru, i8 %mask) {		define < 2 x i64> @test_mask_mul_epi32_rmbk_128(< 4 x i32> %a, i64* %ptr_b, < 2 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epi32_rmbk_128:		; CHECK-LABEL: test_mask_mul_epi32_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpmuldq (%rdi){1to2}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x19,0x28,0x0f]		; CHECK-NEXT: vpmuldq (%rdi){1to2}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x19,0x28,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i64, i64* %ptr_b		%q = load i64, i64* %ptr_b
%vecinit.i = insertelement < 2 x i64> undef, i64 %q, i32 0		%vecinit.i = insertelement < 2 x i64> undef, i64 %q, i32 0
%b64 = shufflevector < 2 x i64> %vecinit.i, < 2 x i64> undef, <2 x i32> zeroinitializer		%b64 = shufflevector < 2 x i64> %vecinit.i, < 2 x i64> undef, <2 x i32> zeroinitializer
%b = bitcast < 2 x i64> %b64 to < 4 x i32>		%b = bitcast < 2 x i64> %b64 to < 4 x i32>
%res = call < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask)		%res = call < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask)
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}
[12 lines not shown]	; CHECK-NEXT: retq ## encoding: [0xc3]
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

declare < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32>, < 4 x i32>, < 2 x i64>, i8)		declare < 2 x i64> @llvm.x86.avx512.mask.pmul.dq.128(< 4 x i32>, < 4 x i32>, < 2 x i64>, i8)

define < 4 x i64> @test_mask_mul_epi32_rr_256(< 8 x i32> %a, < 8 x i32> %b) {		define < 4 x i64> @test_mask_mul_epi32_rr_256(< 8 x i32> %a, < 8 x i32> %b) {
; CHECK-LABEL: test_mask_mul_epi32_rr_256:		; CHECK-LABEL: test_mask_mul_epi32_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmuldq %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x28,0xc1]		; CHECK-NEXT: vpmuldq %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> zeroinitializer, i8 -1)		%res = call < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> zeroinitializer, i8 -1)
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

define < 4 x i64> @test_mask_mul_epi32_rrk_256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask) {		define < 4 x i64> @test_mask_mul_epi32_rrk_256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epi32_rrk_256:		; CHECK-LABEL: test_mask_mul_epi32_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmuldq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x28,0xd1]		; CHECK-NEXT: vpmuldq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x28,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask)		%res = call < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask)
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

define < 4 x i64> @test_mask_mul_epi32_rrkz_256(< 8 x i32> %a, < 8 x i32> %b, i8 %mask) {		define < 4 x i64> @test_mask_mul_epi32_rrkz_256(< 8 x i32> %a, < 8 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epi32_rrkz_256:		; CHECK-LABEL: test_mask_mul_epi32_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmuldq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x28,0xc1]		; CHECK-NEXT: vpmuldq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> zeroinitializer, i8 %mask)		%res = call < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> zeroinitializer, i8 %mask)
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

define < 4 x i64> @test_mask_mul_epi32_rm_256(< 8 x i32> %a, < 8 x i32>* %ptr_b) {		define < 4 x i64> @test_mask_mul_epi32_rm_256(< 8 x i32> %a, < 8 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_mul_epi32_rm_256:		; CHECK-LABEL: test_mask_mul_epi32_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmuldq (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x28,0x07]		; CHECK-NEXT: vpmuldq (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x28,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load < 8 x i32>, < 8 x i32>* %ptr_b		%b = load < 8 x i32>, < 8 x i32>* %ptr_b
%res = call < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> zeroinitializer, i8 -1)		%res = call < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> zeroinitializer, i8 -1)
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

define < 4 x i64> @test_mask_mul_epi32_rmk_256(< 8 x i32> %a, < 8 x i32>* %ptr_b, < 4 x i64> %passThru, i8 %mask) {		define < 4 x i64> @test_mask_mul_epi32_rmk_256(< 8 x i32> %a, < 8 x i32>* %ptr_b, < 4 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epi32_rmk_256:		; CHECK-LABEL: test_mask_mul_epi32_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpmuldq (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x28,0x0f]		; CHECK-NEXT: vpmuldq (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x28,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load < 8 x i32>, < 8 x i32>* %ptr_b		%b = load < 8 x i32>, < 8 x i32>* %ptr_b
%res = call < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask)		%res = call < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask)
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

define < 4 x i64> @test_mask_mul_epi32_rmkz_256(< 8 x i32> %a, < 8 x i32>* %ptr_b, i8 %mask) {		define < 4 x i64> @test_mask_mul_epi32_rmkz_256(< 8 x i32> %a, < 8 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epi32_rmkz_256:		; CHECK-LABEL: test_mask_mul_epi32_rmkz_256:
[19 lines collapsed]	; CHECK-NEXT: retq ## encoding: [0xc3]
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

define < 4 x i64> @test_mask_mul_epi32_rmbk_256(< 8 x i32> %a, i64* %ptr_b, < 4 x i64> %passThru, i8 %mask) {		define < 4 x i64> @test_mask_mul_epi32_rmbk_256(< 8 x i32> %a, i64* %ptr_b, < 4 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epi32_rmbk_256:		; CHECK-LABEL: test_mask_mul_epi32_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpmuldq (%rdi){1to4}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x39,0x28,0x0f]		; CHECK-NEXT: vpmuldq (%rdi){1to4}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x39,0x28,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i64, i64* %ptr_b		%q = load i64, i64* %ptr_b
%vecinit.i = insertelement < 4 x i64> undef, i64 %q, i32 0		%vecinit.i = insertelement < 4 x i64> undef, i64 %q, i32 0
%b64 = shufflevector < 4 x i64> %vecinit.i, < 4 x i64> undef, < 4 x i32> zeroinitializer		%b64 = shufflevector < 4 x i64> %vecinit.i, < 4 x i64> undef, < 4 x i32> zeroinitializer
%b = bitcast < 4 x i64> %b64 to < 8 x i32>		%b = bitcast < 4 x i64> %b64 to < 8 x i32>
%res = call < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask)		%res = call < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask)
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}
[12 lines collapsed]	; CHECK-NEXT: retq ## encoding: [0xc3]
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

declare < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32>, < 8 x i32>, < 4 x i64>, i8)		declare < 4 x i64> @llvm.x86.avx512.mask.pmul.dq.256(< 8 x i32>, < 8 x i32>, < 4 x i64>, i8)

define < 2 x i64> @test_mask_mul_epu32_rr_128(< 4 x i32> %a, < 4 x i32> %b) {		define < 2 x i64> @test_mask_mul_epu32_rr_128(< 4 x i32> %a, < 4 x i32> %b) {
; CHECK-LABEL: test_mask_mul_epu32_rr_128:		; CHECK-LABEL: test_mask_mul_epu32_rr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmuludq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xf4,0xc1]		; CHECK-NEXT: vpmuludq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf4,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> zeroinitializer, i8 -1)		%res = call < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> zeroinitializer, i8 -1)
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

define < 2 x i64> @test_mask_mul_epu32_rrk_128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask) {		define < 2 x i64> @test_mask_mul_epu32_rrk_128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epu32_rrk_128:		; CHECK-LABEL: test_mask_mul_epu32_rrk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmuludq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xf4,0xd1]		; CHECK-NEXT: vpmuludq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xf4,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask)		%res = call < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask)
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

define < 2 x i64> @test_mask_mul_epu32_rrkz_128(< 4 x i32> %a, < 4 x i32> %b, i8 %mask) {		define < 2 x i64> @test_mask_mul_epu32_rrkz_128(< 4 x i32> %a, < 4 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epu32_rrkz_128:		; CHECK-LABEL: test_mask_mul_epu32_rrkz_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmuludq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0xf4,0xc1]		; CHECK-NEXT: vpmuludq %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0xf4,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> zeroinitializer, i8 %mask)		%res = call < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> zeroinitializer, i8 %mask)
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

define < 2 x i64> @test_mask_mul_epu32_rm_128(< 4 x i32> %a, < 4 x i32>* %ptr_b) {		define < 2 x i64> @test_mask_mul_epu32_rm_128(< 4 x i32> %a, < 4 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_mul_epu32_rm_128:		; CHECK-LABEL: test_mask_mul_epu32_rm_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmuludq (%rdi), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xf4,0x07]		; CHECK-NEXT: vpmuludq (%rdi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf4,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load < 4 x i32>, < 4 x i32>* %ptr_b		%b = load < 4 x i32>, < 4 x i32>* %ptr_b
%res = call < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> zeroinitializer, i8 -1)		%res = call < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> zeroinitializer, i8 -1)
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

define < 2 x i64> @test_mask_mul_epu32_rmk_128(< 4 x i32> %a, < 4 x i32>* %ptr_b, < 2 x i64> %passThru, i8 %mask) {		define < 2 x i64> @test_mask_mul_epu32_rmk_128(< 4 x i32> %a, < 4 x i32>* %ptr_b, < 2 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epu32_rmk_128:		; CHECK-LABEL: test_mask_mul_epu32_rmk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpmuludq (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xf4,0x0f]		; CHECK-NEXT: vpmuludq (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xf4,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load < 4 x i32>, < 4 x i32>* %ptr_b		%b = load < 4 x i32>, < 4 x i32>* %ptr_b
%res = call < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask)		%res = call < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask)
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

define < 2 x i64> @test_mask_mul_epu32_rmkz_128(< 4 x i32> %a, < 4 x i32>* %ptr_b, i8 %mask) {		define < 2 x i64> @test_mask_mul_epu32_rmkz_128(< 4 x i32> %a, < 4 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epu32_rmkz_128:		; CHECK-LABEL: test_mask_mul_epu32_rmkz_128:
[19 lines collapsed]	; CHECK-NEXT: retq ## encoding: [0xc3]
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

define < 2 x i64> @test_mask_mul_epu32_rmbk_128(< 4 x i32> %a, i64* %ptr_b, < 2 x i64> %passThru, i8 %mask) {		define < 2 x i64> @test_mask_mul_epu32_rmbk_128(< 4 x i32> %a, i64* %ptr_b, < 2 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epu32_rmbk_128:		; CHECK-LABEL: test_mask_mul_epu32_rmbk_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpmuludq (%rdi){1to2}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x19,0xf4,0x0f]		; CHECK-NEXT: vpmuludq (%rdi){1to2}, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x19,0xf4,0x0f]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i64, i64* %ptr_b		%q = load i64, i64* %ptr_b
%vecinit.i = insertelement < 2 x i64> undef, i64 %q, i32 0		%vecinit.i = insertelement < 2 x i64> undef, i64 %q, i32 0
%b64 = shufflevector < 2 x i64> %vecinit.i, < 2 x i64> undef, <2 x i32> zeroinitializer		%b64 = shufflevector < 2 x i64> %vecinit.i, < 2 x i64> undef, <2 x i32> zeroinitializer
%b = bitcast < 2 x i64> %b64 to < 4 x i32>		%b = bitcast < 2 x i64> %b64 to < 4 x i32>
%res = call < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask)		%res = call < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32> %a, < 4 x i32> %b, < 2 x i64> %passThru, i8 %mask)
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}
[12 lines collapsed]	; CHECK-NEXT: retq ## encoding: [0xc3]
ret < 2 x i64> %res		ret < 2 x i64> %res
}		}

declare < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32>, < 4 x i32>, < 2 x i64>, i8)		declare < 2 x i64> @llvm.x86.avx512.mask.pmulu.dq.128(< 4 x i32>, < 4 x i32>, < 2 x i64>, i8)

define < 4 x i64> @test_mask_mul_epu32_rr_256(< 8 x i32> %a, < 8 x i32> %b) {		define < 4 x i64> @test_mask_mul_epu32_rr_256(< 8 x i32> %a, < 8 x i32> %b) {
; CHECK-LABEL: test_mask_mul_epu32_rr_256:		; CHECK-LABEL: test_mask_mul_epu32_rr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmuludq %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xf4,0xc1]		; CHECK-NEXT: vpmuludq %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf4,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call < 4 x i64> @llvm.x86.avx512.mask.pmulu.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> zeroinitializer, i8 -1)		%res = call < 4 x i64> @llvm.x86.avx512.mask.pmulu.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> zeroinitializer, i8 -1)
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

define < 4 x i64> @test_mask_mul_epu32_rrk_256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask) {		define < 4 x i64> @test_mask_mul_epu32_rrk_256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epu32_rrk_256:		; CHECK-LABEL: test_mask_mul_epu32_rrk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmuludq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xf4,0xd1]		; CHECK-NEXT: vpmuludq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xf4,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call < 4 x i64> @llvm.x86.avx512.mask.pmulu.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask)		%res = call < 4 x i64> @llvm.x86.avx512.mask.pmulu.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask)
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

define < 4 x i64> @test_mask_mul_epu32_rrkz_256(< 8 x i32> %a, < 8 x i32> %b, i8 %mask) {		define < 4 x i64> @test_mask_mul_epu32_rrkz_256(< 8 x i32> %a, < 8 x i32> %b, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epu32_rrkz_256:		; CHECK-LABEL: test_mask_mul_epu32_rrkz_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmuludq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0xf4,0xc1]		; CHECK-NEXT: vpmuludq %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0xf4,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call < 4 x i64> @llvm.x86.avx512.mask.pmulu.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> zeroinitializer, i8 %mask)		%res = call < 4 x i64> @llvm.x86.avx512.mask.pmulu.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> zeroinitializer, i8 %mask)
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

define < 4 x i64> @test_mask_mul_epu32_rm_256(< 8 x i32> %a, < 8 x i32>* %ptr_b) {		define < 4 x i64> @test_mask_mul_epu32_rm_256(< 8 x i32> %a, < 8 x i32>* %ptr_b) {
; CHECK-LABEL: test_mask_mul_epu32_rm_256:		; CHECK-LABEL: test_mask_mul_epu32_rm_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpmuludq (%rdi), %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xf4,0x07]		; CHECK-NEXT: vpmuludq (%rdi), %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xf4,0x07]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load < 8 x i32>, < 8 x i32>* %ptr_b		%b = load < 8 x i32>, < 8 x i32>* %ptr_b
%res = call < 4 x i64> @llvm.x86.avx512.mask.pmulu.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> zeroinitializer, i8 -1)		%res = call < 4 x i64> @llvm.x86.avx512.mask.pmulu.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> zeroinitializer, i8 -1)
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

define < 4 x i64> @test_mask_mul_epu32_rmk_256(< 8 x i32> %a, < 8 x i32>* %ptr_b, < 4 x i64> %passThru, i8 %mask) {		define < 4 x i64> @test_mask_mul_epu32_rmk_256(< 8 x i32> %a, < 8 x i32>* %ptr_b, < 4 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epu32_rmk_256:		; CHECK-LABEL: test_mask_mul_epu32_rmk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpmuludq (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xf4,0x0f]		; CHECK-NEXT: vpmuludq (%rdi), %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xf4,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%b = load < 8 x i32>, < 8 x i32>* %ptr_b		%b = load < 8 x i32>, < 8 x i32>* %ptr_b
%res = call < 4 x i64> @llvm.x86.avx512.mask.pmulu.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask)		%res = call < 4 x i64> @llvm.x86.avx512.mask.pmulu.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask)
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

define < 4 x i64> @test_mask_mul_epu32_rmkz_256(< 8 x i32> %a, < 8 x i32>* %ptr_b, i8 %mask) {		define < 4 x i64> @test_mask_mul_epu32_rmkz_256(< 8 x i32> %a, < 8 x i32>* %ptr_b, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epu32_rmkz_256:		; CHECK-LABEL: test_mask_mul_epu32_rmkz_256:
[19 lines collapsed]	; CHECK-NEXT: retq ## encoding: [0xc3]
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}

define < 4 x i64> @test_mask_mul_epu32_rmbk_256(< 8 x i32> %a, i64* %ptr_b, < 4 x i64> %passThru, i8 %mask) {		define < 4 x i64> @test_mask_mul_epu32_rmbk_256(< 8 x i32> %a, i64* %ptr_b, < 4 x i64> %passThru, i8 %mask) {
; CHECK-LABEL: test_mask_mul_epu32_rmbk_256:		; CHECK-LABEL: test_mask_mul_epu32_rmbk_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpmuludq (%rdi){1to4}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x39,0xf4,0x0f]		; CHECK-NEXT: vpmuludq (%rdi){1to4}, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x39,0xf4,0x0f]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%q = load i64, i64* %ptr_b		%q = load i64, i64* %ptr_b
%vecinit.i = insertelement < 4 x i64> undef, i64 %q, i32 0		%vecinit.i = insertelement < 4 x i64> undef, i64 %q, i32 0
%b64 = shufflevector < 4 x i64> %vecinit.i, < 4 x i64> undef, < 4 x i32> zeroinitializer		%b64 = shufflevector < 4 x i64> %vecinit.i, < 4 x i64> undef, < 4 x i32> zeroinitializer
%b = bitcast < 4 x i64> %b64 to < 8 x i32>		%b = bitcast < 4 x i64> %b64 to < 8 x i32>
%res = call < 4 x i64> @llvm.x86.avx512.mask.pmulu.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask)		%res = call < 4 x i64> @llvm.x86.avx512.mask.pmulu.dq.256(< 8 x i32> %a, < 8 x i32> %b, < 4 x i64> %passThru, i8 %mask)
ret < 4 x i64> %res		ret < 4 x i64> %res
}		}
[68 lines collapsed]	; CHECK-NEXT: retq ## encoding: [0xc3]
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mm512_mask_max_ps_256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask) {		define <8 x float> @test_mm512_mask_max_ps_256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask) {
; CHECK-LABEL: test_mm512_mask_max_ps_256:		; CHECK-LABEL: test_mm512_mask_max_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmaxps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x5f,0xd1]		; CHECK-NEXT: vmaxps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x5f,0xd1]
; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc2]		; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.max.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.max.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mm512_max_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {		define <8 x float> @test_mm512_max_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_max_ps_256:		; CHECK-LABEL: test_mm512_max_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmaxps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x5f,0xc1]		; CHECK-NEXT: vmaxps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x5f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.max.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.max.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}
declare <8 x float> @llvm.x86.avx512.mask.max.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.max.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <4 x float> @test_mm512_maskz_max_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_mm512_maskz_max_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_maskz_max_ps_128:		; CHECK-LABEL: test_mm512_maskz_max_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmaxps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x5f,0xc1]		; CHECK-NEXT: vmaxps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x5f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.max.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.max.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mm512_mask_max_ps_128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask) {		define <4 x float> @test_mm512_mask_max_ps_128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask) {
; CHECK-LABEL: test_mm512_mask_max_ps_128:		; CHECK-LABEL: test_mm512_mask_max_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmaxps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x5f,0xd1]		; CHECK-NEXT: vmaxps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x5f,0xd1]
; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc2]		; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.max.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.max.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mm512_max_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_mm512_max_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_max_ps_128:		; CHECK-LABEL: test_mm512_max_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vmaxps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x5f,0xc1]		; CHECK-NEXT: vmaxps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.max.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.max.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}
declare <4 x float> @llvm.x86.avx512.mask.max.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.max.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <8 x float> @test_mm512_maskz_min_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {		define <8 x float> @test_mm512_maskz_min_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_maskz_min_ps_256:		; CHECK-LABEL: test_mm512_maskz_min_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vminps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x5d,0xc1]		; CHECK-NEXT: vminps %ymm1, %ymm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x5d,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.min.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.min.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mm512_mask_min_ps_256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask) {		define <8 x float> @test_mm512_mask_min_ps_256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask) {
; CHECK-LABEL: test_mm512_mask_min_ps_256:		; CHECK-LABEL: test_mm512_mask_min_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vminps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x5d,0xd1]		; CHECK-NEXT: vminps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x5d,0xd1]
; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc2]		; CHECK-NEXT: vmovaps %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.min.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.min.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float> %src, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_mm512_min_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {		define <8 x float> @test_mm512_min_ps_256(<8 x float> %a0, <8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_min_ps_256:		; CHECK-LABEL: test_mm512_min_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vminps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x5d,0xc1]		; CHECK-NEXT: vminps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x5d,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.min.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.min.ps.256(<8 x float> %a0, <8 x float> %a1, <8 x float>zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}
declare <8 x float> @llvm.x86.avx512.mask.min.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.min.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <4 x float> @test_mm512_maskz_min_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_mm512_maskz_min_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_maskz_min_ps_128:		; CHECK-LABEL: test_mm512_maskz_min_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vminps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x5d,0xc1]		; CHECK-NEXT: vminps %xmm1, %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x5d,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.min.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.min.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mm512_mask_min_ps_128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask) {		define <4 x float> @test_mm512_mask_min_ps_128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask) {
; CHECK-LABEL: test_mm512_mask_min_ps_128:		; CHECK-LABEL: test_mm512_mask_min_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vminps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x5d,0xd1]		; CHECK-NEXT: vminps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x5d,0xd1]
; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc2]		; CHECK-NEXT: vmovaps %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.min.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.min.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float> %src, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_mm512_min_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_mm512_min_ps_128(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_mm512_min_ps_128:		; CHECK-LABEL: test_mm512_min_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vminps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x5d,0xc1]		; CHECK-NEXT: vminps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5d,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.min.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.min.ps.128(<4 x float> %a0, <4 x float> %a1, <4 x float>zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}
declare <4 x float> @llvm.x86.avx512.mask.min.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.min.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <4 x double> @test_sqrt_pd_256(<4 x double> %a0, i8 %mask) {		define <4 x double> @test_sqrt_pd_256(<4 x double> %a0, i8 %mask) {
; CHECK-LABEL: test_sqrt_pd_256:		; CHECK-LABEL: test_sqrt_pd_256:
[40 lines collapsed]
declare <8 x float> @llvm.x86.avx512.mask.getexp.ps.256(<8 x float>, <8 x float>, i8) nounwind readnone		declare <8 x float> @llvm.x86.avx512.mask.getexp.ps.256(<8 x float>, <8 x float>, i8) nounwind readnone

declare <4 x i32> @llvm.x86.avx512.mask.vpermt2var.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.vpermt2var.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_vpermt2var_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_vpermt2var_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermt2var_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermt2var_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xd9]		; CHECK-NEXT: vmovdqa %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xd9]
; CHECK-NEXT: vpermt2d %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x7e,0xda]		; CHECK-NEXT: vpermt2d %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x7e,0xda]
; CHECK-NEXT: vpermt2d %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0x7e,0xca]		; CHECK-NEXT: vpermt2d %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0x7e,0xca]
; CHECK-NEXT: vpaddd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x65,0x08,0xfe,0xc1]		; CHECK-NEXT: vpaddd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.vpermt2var.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.vpermt2var.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.vpermt2var.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.vpermt2var.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.maskz.vpermt2var.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.maskz.vpermt2var.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_maskz_vpermt2var_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_maskz_vpermt2var_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_maskz_vpermt2var_d_128:		; CHECK-LABEL: test_int_x86_avx512_maskz_vpermt2var_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xd9]		; CHECK-NEXT: vmovdqa %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xd9]
; CHECK-NEXT: vpermt2d %xmm2, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x7e,0xda]		; CHECK-NEXT: vpermt2d %xmm2, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x7e,0xda]
; CHECK-NEXT: vpermt2d %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0x7e,0xca]		; CHECK-NEXT: vpermt2d %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0x7e,0xca]
; CHECK-NEXT: vpaddd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x65,0x08,0xfe,0xc1]		; CHECK-NEXT: vpaddd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.maskz.vpermt2var.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.maskz.vpermt2var.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.maskz.vpermt2var.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.maskz.vpermt2var.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.vpermt2var.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.vpermt2var.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_vpermt2var_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_vpermt2var_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermt2var_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermt2var_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xd9]		; CHECK-NEXT: vmovdqa %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xd9]
; CHECK-NEXT: vpermt2d %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x7e,0xda]		; CHECK-NEXT: vpermt2d %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x7e,0xda]
; CHECK-NEXT: vpermt2d %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0x7e,0xca]		; CHECK-NEXT: vpermt2d %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0x7e,0xca]
; CHECK-NEXT: vpaddd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x65,0x28,0xfe,0xc1]		; CHECK-NEXT: vpaddd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.vpermt2var.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.vpermt2var.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.vpermt2var.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.vpermt2var.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.maskz.vpermt2var.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.maskz.vpermt2var.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_maskz_vpermt2var_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_maskz_vpermt2var_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_maskz_vpermt2var_d_256:		; CHECK-LABEL: test_int_x86_avx512_maskz_vpermt2var_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xd9]		; CHECK-NEXT: vmovdqa %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xd9]
; CHECK-NEXT: vpermt2d %ymm2, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x7e,0xda]		; CHECK-NEXT: vpermt2d %ymm2, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x7e,0xda]
; CHECK-NEXT: vpermt2d %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0x7e,0xca]		; CHECK-NEXT: vpermt2d %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0x7e,0xca]
; CHECK-NEXT: vpaddd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x65,0x28,0xfe,0xc1]		; CHECK-NEXT: vpaddd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xfe,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.maskz.vpermt2var.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.maskz.vpermt2var.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.maskz.vpermt2var.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.maskz.vpermt2var.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <2 x double> @llvm.x86.avx512.mask.vpermi2var.pd.128(<2 x double>, <2 x i64>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.vpermi2var.pd.128(<2 x double>, <2 x i64>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_vpermi2var_pd_128(<2 x double> %x0, <2 x i64> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask_vpermi2var_pd_128(<2 x double> %x0, <2 x i64> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm1, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xd9]		; CHECK-NEXT: vmovapd %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xd9]
; CHECK-NEXT: vpermi2pd %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x77,0xda]		; CHECK-NEXT: vpermi2pd %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x77,0xda]
; CHECK-NEXT: vpermi2pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0x77,0xca]		; CHECK-NEXT: vpermi2pd %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0xfd,0x08,0x77,0xca]
; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc1]		; CHECK-NEXT: vaddpd %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.vpermi2var.pd.128(<2 x double> %x0, <2 x i64> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask.vpermi2var.pd.128(<2 x double> %x0, <2 x i64> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask.vpermi2var.pd.128(<2 x double> %x0, <2 x i64> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.vpermi2var.pd.128(<2 x double> %x0, <2 x i64> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask.vpermi2var.pd.256(<4 x double>, <4 x i64>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.vpermi2var.pd.256(<4 x double>, <4 x i64>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_vpermi2var_pd_256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_vpermi2var_pd_256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm1, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xd9]		; CHECK-NEXT: vmovapd %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xd9]
; CHECK-NEXT: vpermi2pd %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x77,0xda]		; CHECK-NEXT: vpermi2pd %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x77,0xda]
; CHECK-NEXT: vpermi2pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0x77,0xca]		; CHECK-NEXT: vpermi2pd %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0xfd,0x28,0x77,0xca]
; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc1]		; CHECK-NEXT: vaddpd %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.vpermi2var.pd.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.vpermi2var.pd.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.vpermi2var.pd.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.vpermi2var.pd.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask.vpermi2var.ps.128(<4 x float>, <4 x i32>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.vpermi2var.ps.128(<4 x float>, <4 x i32>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_vpermi2var_ps_128(<4 x float> %x0, <4 x i32> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_vpermi2var_ps_128(<4 x float> %x0, <4 x i32> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm1, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xd9]		; CHECK-NEXT: vmovaps %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xd9]
; CHECK-NEXT: vpermi2ps %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x77,0xda]		; CHECK-NEXT: vpermi2ps %xmm2, %xmm0, %xmm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x77,0xda]
; CHECK-NEXT: vpermi2ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0x77,0xca]		; CHECK-NEXT: vpermi2ps %xmm2, %xmm0, %xmm1 ## encoding: [0x62,0xf2,0x7d,0x08,0x77,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vpermi2var.ps.128(<4 x float> %x0, <4 x i32> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.vpermi2var.ps.128(<4 x float> %x0, <4 x i32> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.vpermi2var.ps.128(<4 x float> %x0, <4 x i32> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.vpermi2var.ps.128(<4 x float> %x0, <4 x i32> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

define <4 x float>@test_int_x86_avx512_mask_vpermi2var_ps_128_cast(<4 x float> %x0, <2 x i64> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_vpermi2var_ps_128_cast(<4 x float> %x0, <2 x i64> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_ps_128_cast:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_ps_128_cast:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermi2ps %xmm2, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x77,0xca]		; CHECK-NEXT: vpermi2ps %xmm2, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x77,0xca]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%x1cast = bitcast <2 x i64> %x1 to <4 x i32>		%x1cast = bitcast <2 x i64> %x1 to <4 x i32>
%res = call <4 x float> @llvm.x86.avx512.mask.vpermi2var.ps.128(<4 x float> %x0, <4 x i32> %x1cast, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.vpermi2var.ps.128(<4 x float> %x0, <4 x i32> %x1cast, <4 x float> %x2, i8 %x3)
ret <4 x float> %res		ret <4 x float> %res
}		}

declare <8 x float> @llvm.x86.avx512.mask.vpermi2var.ps.256(<8 x float>, <8 x i32>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.vpermi2var.ps.256(<8 x float>, <8 x i32>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_vpermi2var_ps_256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_vpermi2var_ps_256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vpermi2var_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm1, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xd9]		; CHECK-NEXT: vmovaps %ymm1, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xd9]
; CHECK-NEXT: vpermi2ps %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x77,0xda]		; CHECK-NEXT: vpermi2ps %ymm2, %ymm0, %ymm3 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x77,0xda]
; CHECK-NEXT: vpermi2ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0x77,0xca]		; CHECK-NEXT: vpermi2ps %ymm2, %ymm0, %ymm1 ## encoding: [0x62,0xf2,0x7d,0x28,0x77,0xca]
; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.vpermi2var.ps.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.vpermi2var.ps.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.vpermi2var.ps.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.vpermi2var.ps.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pabs.q.128(<2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pabs.q.128(<2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pabs_q_128(<2 x i64> %x0, <2 x i64> %x1, i8 %x2) {		define <2 x i64>@test_int_x86_avx512_mask_pabs_q_128(<2 x i64> %x0, <2 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pabs_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pabs_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpabsq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x1f,0xc8]		; CHECK-NEXT: vpabsq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x1f,0xc8]
; CHECK-NEXT: vpabsq %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x1f,0xc0]		; CHECK-NEXT: vpabsq %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x1f,0xc0]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pabs.q.128(<2 x i64> %x0, <2 x i64> %x1, i8 %x2)		%res = call <2 x i64> @llvm.x86.avx512.mask.pabs.q.128(<2 x i64> %x0, <2 x i64> %x1, i8 %x2)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pabs.q.128(<2 x i64> %x0, <2 x i64> %x1, i8 -1)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pabs.q.128(<2 x i64> %x0, <2 x i64> %x1, i8 -1)
%res2 = add <2 x i64> %res, %res1		%res2 = add <2 x i64> %res, %res1
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pabs.q.256(<4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pabs.q.256(<4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pabs_q_256(<4 x i64> %x0, <4 x i64> %x1, i8 %x2) {		define <4 x i64>@test_int_x86_avx512_mask_pabs_q_256(<4 x i64> %x0, <4 x i64> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pabs_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pabs_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpabsq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x1f,0xc8]		; CHECK-NEXT: vpabsq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x1f,0xc8]
; CHECK-NEXT: vpabsq %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x1f,0xc0]		; CHECK-NEXT: vpabsq %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x1f,0xc0]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pabs.q.256(<4 x i64> %x0, <4 x i64> %x1, i8 %x2)		%res = call <4 x i64> @llvm.x86.avx512.mask.pabs.q.256(<4 x i64> %x0, <4 x i64> %x1, i8 %x2)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pabs.q.256(<4 x i64> %x0, <4 x i64> %x1, i8 -1)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pabs.q.256(<4 x i64> %x0, <4 x i64> %x1, i8 -1)
%res2 = add <4 x i64> %res, %res1		%res2 = add <4 x i64> %res, %res1
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pabs.d.128(<4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pabs.d.128(<4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pabs_d_128(<4 x i32> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_pabs_d_128(<4 x i32> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pabs_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pabs_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpabsd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x1e,0xc8]		; CHECK-NEXT: vpabsd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x1e,0xc8]
; CHECK-NEXT: vpabsd %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x1e,0xc0]		; CHECK-NEXT: vpabsd %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x1e,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pabs.d.128(<4 x i32> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.pabs.d.128(<4 x i32> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pabs.d.128(<4 x i32> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pabs.d.128(<4 x i32> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pabs.d.256(<8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pabs.d.256(<8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_pabs_d_256(<8 x i32> %x0, <8 x i32> %x1, i8 %x2) {		define <8 x i32>@test_int_x86_avx512_mask_pabs_d_256(<8 x i32> %x0, <8 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pabs_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pabs_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpabsd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x1e,0xc8]		; CHECK-NEXT: vpabsd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x1e,0xc8]
; CHECK-NEXT: vpabsd %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x1e,0xc0]		; CHECK-NEXT: vpabsd %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x1e,0xc0]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pabs.d.256(<8 x i32> %x0, <8 x i32> %x1, i8 %x2)		%res = call <8 x i32> @llvm.x86.avx512.mask.pabs.d.256(<8 x i32> %x0, <8 x i32> %x1, i8 %x2)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pabs.d.256(<8 x i32> %x0, <8 x i32> %x1, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pabs.d.256(<8 x i32> %x0, <8 x i32> %x1, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <2 x double> @llvm.x86.avx512.mask.scalef.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.scalef.pd.128(<2 x double>, <2 x double>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_scalef_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask_scalef_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_scalef_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_scalef_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vscalefpd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x2c,0xd1]		; CHECK-NEXT: vscalefpd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x2c,0xd1]
; CHECK-NEXT: vscalefpd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x2c,0xc1]		; CHECK-NEXT: vscalefpd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x2c,0xc1]
; CHECK-NEXT: vaddpd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0x58,0xc0]		; CHECK-NEXT: vaddpd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.scalef.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask.scalef.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask.scalef.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.scalef.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask.scalef.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.scalef.pd.256(<4 x double>, <4 x double>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_scalef_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_scalef_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_scalef_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_scalef_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vscalefpd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x2c,0xd1]		; CHECK-NEXT: vscalefpd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x2c,0xd1]
; CHECK-NEXT: vscalefpd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x2c,0xc1]		; CHECK-NEXT: vscalefpd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x2c,0xc1]
; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.scalef.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.scalef.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.scalef.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.scalef.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask.scalef.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.scalef.ps.128(<4 x float>, <4 x float>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_scalef_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_scalef_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_scalef_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_scalef_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vscalefps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x2c,0xd1]		; CHECK-NEXT: vscalefps %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x2c,0xd1]
; CHECK-NEXT: vscalefps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x2c,0xc1]		; CHECK-NEXT: vscalefps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x2c,0xc1]
; CHECK-NEXT: vaddps %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6c,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe8,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.scalef.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.scalef.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.scalef.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.scalef.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask.scalef.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.scalef.ps.256(<8 x float>, <8 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_scalef_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_scalef_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_scalef_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_scalef_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vscalefps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x2c,0xd1]		; CHECK-NEXT: vscalefps %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x2c,0xd1]
; CHECK-NEXT: vscalefps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x2c,0xc1]		; CHECK-NEXT: vscalefps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x2c,0xc1]
; CHECK-NEXT: vaddps %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6c,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xec,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.scalef.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.scalef.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.scalef.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.scalef.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <16 x i8> @llvm.x86.avx512.mask.pmov.qb.128(<2 x i64>, <16 x i8>, i8)		declare <16 x i8> @llvm.x86.avx512.mask.pmov.qb.128(<2 x i64>, <16 x i8>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pmov_qd_128(<2 x i64> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_pmov_qd_128(<2 x i64> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmov_qd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmov_qd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovqd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x09,0x35,0xc1]		; CHECK-NEXT: vpmovqd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x09,0x35,0xc1]
; CHECK-NEXT: vpmovqd %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0x89,0x35,0xc2]		; CHECK-NEXT: vpmovqd %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0x89,0x35,0xc2]
; CHECK-NEXT: vpmovqd %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x08,0x35,0xc0]		; CHECK-NEXT: vpmovqd %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x08,0x35,0xc0]
; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc1]		; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc1]
; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc2]		; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <4 x i32> @llvm.x86.avx512.mask.pmov.qd.128(<2 x i64> %x0, <4 x i32> %x1, i8 -1)		%res0 = call <4 x i32> @llvm.x86.avx512.mask.pmov.qd.128(<2 x i64> %x0, <4 x i32> %x1, i8 -1)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmov.qd.128(<2 x i64> %x0, <4 x i32> %x1, i8 %x2)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmov.qd.128(<2 x i64> %x0, <4 x i32> %x1, i8 %x2)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmov.qd.128(<2 x i64> %x0, <4 x i32> zeroinitializer, i8 %x2)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmov.qd.128(<2 x i64> %x0, <4 x i32> zeroinitializer, i8 %x2)
%res3 = add <4 x i32> %res0, %res1		%res3 = add <4 x i32> %res0, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

define <4 x i32>@test_int_x86_avx512_mask_pmovs_qd_128(<2 x i64> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_pmovs_qd_128(<2 x i64> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovs_qd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovs_qd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsqd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x09,0x25,0xc1]		; CHECK-NEXT: vpmovsqd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x09,0x25,0xc1]
; CHECK-NEXT: vpmovsqd %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0x89,0x25,0xc2]		; CHECK-NEXT: vpmovsqd %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0x89,0x25,0xc2]
; CHECK-NEXT: vpmovsqd %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x08,0x25,0xc0]		; CHECK-NEXT: vpmovsqd %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x08,0x25,0xc0]
; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc1]		; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc1]
; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc2]		; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <4 x i32> @llvm.x86.avx512.mask.pmovs.qd.128(<2 x i64> %x0, <4 x i32> %x1, i8 -1)		%res0 = call <4 x i32> @llvm.x86.avx512.mask.pmovs.qd.128(<2 x i64> %x0, <4 x i32> %x1, i8 -1)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovs.qd.128(<2 x i64> %x0, <4 x i32> %x1, i8 %x2)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovs.qd.128(<2 x i64> %x0, <4 x i32> %x1, i8 %x2)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovs.qd.128(<2 x i64> %x0, <4 x i32> zeroinitializer, i8 %x2)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovs.qd.128(<2 x i64> %x0, <4 x i32> zeroinitializer, i8 %x2)
%res3 = add <4 x i32> %res0, %res1		%res3 = add <4 x i32> %res0, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

define <4 x i32>@test_int_x86_avx512_mask_pmovus_qd_128(<2 x i64> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_pmovus_qd_128(<2 x i64> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovus_qd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovus_qd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovusqd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x09,0x15,0xc1]		; CHECK-NEXT: vpmovusqd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x09,0x15,0xc1]
; CHECK-NEXT: vpmovusqd %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0x89,0x15,0xc2]		; CHECK-NEXT: vpmovusqd %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0x89,0x15,0xc2]
; CHECK-NEXT: vpmovusqd %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x08,0x15,0xc0]		; CHECK-NEXT: vpmovusqd %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x08,0x15,0xc0]
; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc1]		; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc1]
; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc2]		; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <4 x i32> @llvm.x86.avx512.mask.pmovus.qd.128(<2 x i64> %x0, <4 x i32> %x1, i8 -1)		%res0 = call <4 x i32> @llvm.x86.avx512.mask.pmovus.qd.128(<2 x i64> %x0, <4 x i32> %x1, i8 -1)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovus.qd.128(<2 x i64> %x0, <4 x i32> %x1, i8 %x2)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovus.qd.128(<2 x i64> %x0, <4 x i32> %x1, i8 %x2)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovus.qd.128(<2 x i64> %x0, <4 x i32> zeroinitializer, i8 %x2)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovus.qd.128(<2 x i64> %x0, <4 x i32> zeroinitializer, i8 %x2)
%res3 = add <4 x i32> %res0, %res1		%res3 = add <4 x i32> %res0, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

define <4 x i32>@test_int_x86_avx512_mask_pmov_qd_256(<4 x i64> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_pmov_qd_256(<4 x i64> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmov_qd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmov_qd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovqd %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x29,0x35,0xc1]		; CHECK-NEXT: vpmovqd %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x29,0x35,0xc1]
; CHECK-NEXT: vpmovqd %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0xa9,0x35,0xc2]		; CHECK-NEXT: vpmovqd %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0xa9,0x35,0xc2]
; CHECK-NEXT: vpmovqd %ymm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x28,0x35,0xc0]		; CHECK-NEXT: vpmovqd %ymm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x28,0x35,0xc0]
; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc1]		; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc1]
; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc2]		; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <4 x i32> @llvm.x86.avx512.mask.pmov.qd.256(<4 x i64> %x0, <4 x i32> %x1, i8 -1)		%res0 = call <4 x i32> @llvm.x86.avx512.mask.pmov.qd.256(<4 x i64> %x0, <4 x i32> %x1, i8 -1)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmov.qd.256(<4 x i64> %x0, <4 x i32> %x1, i8 %x2)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmov.qd.256(<4 x i64> %x0, <4 x i32> %x1, i8 %x2)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmov.qd.256(<4 x i64> %x0, <4 x i32> zeroinitializer, i8 %x2)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmov.qd.256(<4 x i64> %x0, <4 x i32> zeroinitializer, i8 %x2)
%res3 = add <4 x i32> %res0, %res1		%res3 = add <4 x i32> %res0, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

define <4 x i32>@test_int_x86_avx512_mask_pmovs_qd_256(<4 x i64> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_pmovs_qd_256(<4 x i64> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovs_qd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovs_qd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovsqd %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x29,0x25,0xc1]		; CHECK-NEXT: vpmovsqd %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x29,0x25,0xc1]
; CHECK-NEXT: vpmovsqd %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0xa9,0x25,0xc2]		; CHECK-NEXT: vpmovsqd %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0xa9,0x25,0xc2]
; CHECK-NEXT: vpmovsqd %ymm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x28,0x25,0xc0]		; CHECK-NEXT: vpmovsqd %ymm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x28,0x25,0xc0]
; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc1]		; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc1]
; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc2]		; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <4 x i32> @llvm.x86.avx512.mask.pmovs.qd.256(<4 x i64> %x0, <4 x i32> %x1, i8 -1)		%res0 = call <4 x i32> @llvm.x86.avx512.mask.pmovs.qd.256(<4 x i64> %x0, <4 x i32> %x1, i8 -1)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovs.qd.256(<4 x i64> %x0, <4 x i32> %x1, i8 %x2)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovs.qd.256(<4 x i64> %x0, <4 x i32> %x1, i8 %x2)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovs.qd.256(<4 x i64> %x0, <4 x i32> zeroinitializer, i8 %x2)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovs.qd.256(<4 x i64> %x0, <4 x i32> zeroinitializer, i8 %x2)
%res3 = add <4 x i32> %res0, %res1		%res3 = add <4 x i32> %res0, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

define <4 x i32>@test_int_x86_avx512_mask_pmovus_qd_256(<4 x i64> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_pmovus_qd_256(<4 x i64> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_pmovus_qd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pmovus_qd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpmovusqd %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x29,0x15,0xc1]		; CHECK-NEXT: vpmovusqd %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7e,0x29,0x15,0xc1]
; CHECK-NEXT: vpmovusqd %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0xa9,0x15,0xc2]		; CHECK-NEXT: vpmovusqd %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf2,0x7e,0xa9,0x15,0xc2]
; CHECK-NEXT: vpmovusqd %ymm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x28,0x15,0xc0]		; CHECK-NEXT: vpmovusqd %ymm0, %xmm0 ## encoding: [0x62,0xf2,0x7e,0x28,0x15,0xc0]
; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc1]		; CHECK-NEXT: vpaddd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc1]
; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xfe,0xc2]		; CHECK-NEXT: vpaddd %xmm2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xfe,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res0 = call <4 x i32> @llvm.x86.avx512.mask.pmovus.qd.256(<4 x i64> %x0, <4 x i32> %x1, i8 -1)		%res0 = call <4 x i32> @llvm.x86.avx512.mask.pmovus.qd.256(<4 x i64> %x0, <4 x i32> %x1, i8 -1)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovus.qd.256(<4 x i64> %x0, <4 x i32> %x1, i8 %x2)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pmovus.qd.256(<4 x i64> %x0, <4 x i32> %x1, i8 %x2)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovus.qd.256(<4 x i64> %x0, <4 x i32> zeroinitializer, i8 %x2)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pmovus.qd.256(<4 x i64> %x0, <4 x i32> zeroinitializer, i8 %x2)
%res3 = add <4 x i32> %res0, %res1		%res3 = add <4 x i32> %res0, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <4 x float> @llvm.x86.avx512.mask.cvtdq2ps.128(<4 x i32>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.cvtdq2ps.128(<4 x i32>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_cvt_dq2ps_128(<4 x i32> %x0, <4 x float> %x1, i8 %x2) {		define <4 x float>@test_int_x86_avx512_mask_cvt_dq2ps_128(<4 x i32> %x0, <4 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_dq2ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_dq2ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtdq2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x5b,0xc8]		; CHECK-NEXT: vcvtdq2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x5b,0xc8]
; CHECK-NEXT: vcvtdq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x5b,0xc0]		; CHECK-NEXT: vcvtdq2ps %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5b,0xc0]
; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.cvtdq2ps.128(<4 x i32> %x0, <4 x float> %x1, i8 %x2)		%res = call <4 x float> @llvm.x86.avx512.mask.cvtdq2ps.128(<4 x i32> %x0, <4 x float> %x1, i8 %x2)
%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtdq2ps.128(<4 x i32> %x0, <4 x float> %x1, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtdq2ps.128(<4 x i32> %x0, <4 x float> %x1, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask.cvtdq2ps.256(<8 x i32>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.cvtdq2ps.256(<8 x i32>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_cvt_dq2ps_256(<8 x i32> %x0, <8 x float> %x1, i8 %x2) {		define <8 x float>@test_int_x86_avx512_mask_cvt_dq2ps_256(<8 x i32> %x0, <8 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_dq2ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_dq2ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtdq2ps %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x5b,0xc8]		; CHECK-NEXT: vcvtdq2ps %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x5b,0xc8]
; CHECK-NEXT: vcvtdq2ps %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x5b,0xc0]		; CHECK-NEXT: vcvtdq2ps %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x5b,0xc0]
; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.cvtdq2ps.256(<8 x i32> %x0, <8 x float> %x1, i8 %x2)		%res = call <8 x float> @llvm.x86.avx512.mask.cvtdq2ps.256(<8 x i32> %x0, <8 x float> %x1, i8 %x2)
%res1 = call <8 x float> @llvm.x86.avx512.mask.cvtdq2ps.256(<8 x i32> %x0, <8 x float> %x1, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.cvtdq2ps.256(<8 x i32> %x0, <8 x float> %x1, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.128(<2 x double>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.128(<2 x double>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_cvt_pd2dq_128(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvt_pd2dq_128(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2dq_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2dq_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtpd2dq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0xe6,0xc8]		; CHECK-NEXT: vcvtpd2dq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0xe6,0xc8]
; CHECK-NEXT: vcvtpd2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x08,0xe6,0xc0]		; CHECK-NEXT: vcvtpd2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0xe6,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

define <4 x i32>@test_int_x86_avx512_mask_cvt_pd2dq_128_zext(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvt_pd2dq_128_zext(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2dq_128_zext:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2dq_128_zext:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtpd2dq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0xe6,0xc8]		; CHECK-NEXT: vcvtpd2dq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0xe6,0xc8]
; CHECK-NEXT: vmovq %xmm1, %xmm1 ## encoding: [0x62,0xf1,0xfe,0x08,0x7e,0xc9]		; CHECK-NEXT: vmovq %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7e,0xc9]
; CHECK-NEXT: ## xmm1 = xmm1[0],zero		; CHECK-NEXT: ## xmm1 = xmm1[0],zero
; CHECK-NEXT: vcvtpd2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x08,0xe6,0xc0]		; CHECK-NEXT: vcvtpd2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0xe6,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)
%res1 = shufflevector <4 x i32> %res, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>		%res1 = shufflevector <4 x i32> %res, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
%res2 = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)
%res3 = shufflevector <4 x i32> %res2, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>		%res3 = shufflevector <4 x i32> %res2, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
%res4 = add <4 x i32> %res1, %res3		%res4 = add <4 x i32> %res1, %res3
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.256(<4 x double>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.256(<4 x double>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_cvt_pd2dq_256(<4 x double> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvt_pd2dq_256(<4 x double> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2dq_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2dq_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtpd2dq %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x29,0xe6,0xc8]		; CHECK-NEXT: vcvtpd2dq %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x29,0xe6,0xc8]
; CHECK-NEXT: vcvtpd2dq %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x28,0xe6,0xc0]		; CHECK-NEXT: vcvtpd2dq %ymm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xff,0xe6,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.256(<4 x double> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.256(<4 x double> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.256(<4 x double> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2dq.256(<4 x double> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask.cvtpd2ps.256(<4 x double>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.cvtpd2ps.256(<4 x double>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_cvt_pd2ps_256(<4 x double> %x0, <4 x float> %x1, i8 %x2) {		define <4 x float>@test_int_x86_avx512_mask_cvt_pd2ps_256(<4 x double> %x0, <4 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtpd2ps %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x5a,0xc8]		; CHECK-NEXT: vcvtpd2ps %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0x5a,0xc8]
; CHECK-NEXT: vcvtpd2ps %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x5a,0xc0]		; CHECK-NEXT: vcvtpd2ps %ymm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x5a,0xc0]
; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.cvtpd2ps.256(<4 x double> %x0, <4 x float> %x1, i8 %x2)		%res = call <4 x float> @llvm.x86.avx512.mask.cvtpd2ps.256(<4 x double> %x0, <4 x float> %x1, i8 %x2)
%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtpd2ps.256(<4 x double> %x0, <4 x float> %x1, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtpd2ps.256(<4 x double> %x0, <4 x float> %x1, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask.cvtpd2ps(<2 x double>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.cvtpd2ps(<2 x double>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_cvt_pd2ps(<2 x double> %x0, <4 x float> %x1, i8 %x2) {		define <4 x float>@test_int_x86_avx512_mask_cvt_pd2ps(<2 x double> %x0, <4 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2ps:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2ps:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtpd2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x5a,0xc8]		; CHECK-NEXT: vcvtpd2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x5a,0xc8]
; CHECK-NEXT: vcvtpd2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x5a,0xc0]		; CHECK-NEXT: vcvtpd2ps %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x5a,0xc0]
; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.cvtpd2ps(<2 x double> %x0, <4 x float> %x1, i8 %x2)		%res = call <4 x float> @llvm.x86.avx512.mask.cvtpd2ps(<2 x double> %x0, <4 x float> %x1, i8 %x2)
%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtpd2ps(<2 x double> %x0, <4 x float> %x1, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtpd2ps(<2 x double> %x0, <4 x float> %x1, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

define <4 x float>@test_int_x86_avx512_mask_cvt_pd2ps_zext(<2 x double> %x0, <4 x float> %x1, i8 %x2) {		define <4 x float>@test_int_x86_avx512_mask_cvt_pd2ps_zext(<2 x double> %x0, <4 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2ps_zext:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2ps_zext:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtpd2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x5a,0xc8]		; CHECK-NEXT: vcvtpd2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0x5a,0xc8]
; CHECK-NEXT: vmovq %xmm1, %xmm1 ## encoding: [0x62,0xf1,0xfe,0x08,0x7e,0xc9]		; CHECK-NEXT: vmovq %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7e,0xc9]
; CHECK-NEXT: ## xmm1 = xmm1[0],zero		; CHECK-NEXT: ## xmm1 = xmm1[0],zero
; CHECK-NEXT: vcvtpd2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x5a,0xc0]		; CHECK-NEXT: vcvtpd2ps %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x5a,0xc0]
; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.cvtpd2ps(<2 x double> %x0, <4 x float> %x1, i8 %x2)		%res = call <4 x float> @llvm.x86.avx512.mask.cvtpd2ps(<2 x double> %x0, <4 x float> %x1, i8 %x2)
%res1 = shufflevector <4 x float> %res, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>		%res1 = shufflevector <4 x float> %res, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
%res2 = call <4 x float> @llvm.x86.avx512.mask.cvtpd2ps(<2 x double> %x0, <4 x float> %x1, i8 -1)		%res2 = call <4 x float> @llvm.x86.avx512.mask.cvtpd2ps(<2 x double> %x0, <4 x float> %x1, i8 -1)
%res3 = shufflevector <4 x float> %res2, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>		%res3 = shufflevector <4 x float> %res2, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
%res4 = fadd <4 x float> %res1, %res3		%res4 = fadd <4 x float> %res1, %res3
ret <4 x float> %res4		ret <4 x float> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.128(<2 x double>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.128(<2 x double>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_cvt_pd2udq_128(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvt_pd2udq_128(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2udq_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2udq_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtpd2udq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x09,0x79,0xc8]		; CHECK-NEXT: vcvtpd2udq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x09,0x79,0xc8]
; CHECK-NEXT: vcvtpd2udq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x08,0x79,0xc0]		; CHECK-NEXT: vcvtpd2udq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x08,0x79,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

define <4 x i32>@test_int_x86_avx512_mask_cvt_pd2udq_128_zext(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvt_pd2udq_128_zext(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2udq_128_zext:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2udq_128_zext:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtpd2udq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x09,0x79,0xc8]		; CHECK-NEXT: vcvtpd2udq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x09,0x79,0xc8]
; CHECK-NEXT: vmovq %xmm1, %xmm1 ## encoding: [0x62,0xf1,0xfe,0x08,0x7e,0xc9]		; CHECK-NEXT: vmovq %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7e,0xc9]
; CHECK-NEXT: ## xmm1 = xmm1[0],zero		; CHECK-NEXT: ## xmm1 = xmm1[0],zero
; CHECK-NEXT: vcvtpd2udq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x08,0x79,0xc0]		; CHECK-NEXT: vcvtpd2udq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x08,0x79,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)
%res1 = shufflevector <4 x i32> %res, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>		%res1 = shufflevector <4 x i32> %res, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
%res2 = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)
%res3 = shufflevector <4 x i32> %res2, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>		%res3 = shufflevector <4 x i32> %res2, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
%res4 = add <4 x i32> %res1, %res3		%res4 = add <4 x i32> %res1, %res3
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.256(<4 x double>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.256(<4 x double>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_cvt_pd2udq_256(<4 x double> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvt_pd2udq_256(<4 x double> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2udq_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_pd2udq_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtpd2udq %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x29,0x79,0xc8]		; CHECK-NEXT: vcvtpd2udq %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x29,0x79,0xc8]
; CHECK-NEXT: vcvtpd2udq %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x28,0x79,0xc0]		; CHECK-NEXT: vcvtpd2udq %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x28,0x79,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.256(<4 x double> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.256(<4 x double> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.256(<4 x double> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvtpd2udq.256(<4 x double> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.cvtps2dq.128(<4 x float>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.cvtps2dq.128(<4 x float>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_cvt_ps2dq_128(<4 x float> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvt_ps2dq_128(<4 x float> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2dq_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2dq_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtps2dq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x5b,0xc8]		; CHECK-NEXT: vcvtps2dq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x09,0x5b,0xc8]
; CHECK-NEXT: vcvtps2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x5b,0xc0]		; CHECK-NEXT: vcvtps2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x5b,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvtps2dq.128(<4 x float> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvtps2dq.128(<4 x float> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvtps2dq.128(<4 x float> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvtps2dq.128(<4 x float> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.cvtps2dq.256(<8 x float>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.cvtps2dq.256(<8 x float>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_cvt_ps2dq_256(<8 x float> %x0, <8 x i32> %x1, i8 %x2) {		define <8 x i32>@test_int_x86_avx512_mask_cvt_ps2dq_256(<8 x float> %x0, <8 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2dq_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2dq_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtps2dq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x5b,0xc8]		; CHECK-NEXT: vcvtps2dq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7d,0x29,0x5b,0xc8]
; CHECK-NEXT: vcvtps2dq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x5b,0xc0]		; CHECK-NEXT: vcvtps2dq %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x5b,0xc0]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.cvtps2dq.256(<8 x float> %x0, <8 x i32> %x1, i8 %x2)		%res = call <8 x i32> @llvm.x86.avx512.mask.cvtps2dq.256(<8 x float> %x0, <8 x i32> %x1, i8 %x2)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.cvtps2dq.256(<8 x float> %x0, <8 x i32> %x1, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.cvtps2dq.256(<8 x float> %x0, <8 x i32> %x1, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <2 x double> @llvm.x86.avx512.mask.cvtps2pd.128(<4 x float>, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.cvtps2pd.128(<4 x float>, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_cvt_ps2pd_128(<4 x float> %x0, <2 x double> %x1, i8 %x2) {		define <2 x double>@test_int_x86_avx512_mask_cvt_ps2pd_128(<4 x float> %x0, <2 x double> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtps2pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x5a,0xc8]		; CHECK-NEXT: vcvtps2pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x5a,0xc8]
; CHECK-NEXT: vcvtps2pd %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x5a,0xc0]		; CHECK-NEXT: vcvtps2pd %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5a,0xc0]
; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]		; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.cvtps2pd.128(<4 x float> %x0, <2 x double> %x1, i8 %x2)		%res = call <2 x double> @llvm.x86.avx512.mask.cvtps2pd.128(<4 x float> %x0, <2 x double> %x1, i8 %x2)
%res1 = call <2 x double> @llvm.x86.avx512.mask.cvtps2pd.128(<4 x float> %x0, <2 x double> %x1, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.cvtps2pd.128(<4 x float> %x0, <2 x double> %x1, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask.cvtps2pd.256(<4 x float>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.cvtps2pd.256(<4 x float>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_cvt_ps2pd_256(<4 x float> %x0, <4 x double> %x1, i8 %x2) {		define <4 x double>@test_int_x86_avx512_mask_cvt_ps2pd_256(<4 x float> %x0, <4 x double> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtps2pd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x5a,0xc8]		; CHECK-NEXT: vcvtps2pd %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x5a,0xc8]
; CHECK-NEXT: vcvtps2pd %xmm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x5a,0xc0]		; CHECK-NEXT: vcvtps2pd %xmm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x5a,0xc0]
; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.cvtps2pd.256(<4 x float> %x0, <4 x double> %x1, i8 %x2)		%res = call <4 x double> @llvm.x86.avx512.mask.cvtps2pd.256(<4 x float> %x0, <4 x double> %x1, i8 %x2)
%res1 = call <4 x double> @llvm.x86.avx512.mask.cvtps2pd.256(<4 x float> %x0, <4 x double> %x1, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.cvtps2pd.256(<4 x float> %x0, <4 x double> %x1, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.cvtps2udq.128(<4 x float>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.cvtps2udq.128(<4 x float>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_cvt_ps2udq_128(<4 x float> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvt_ps2udq_128(<4 x float> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2udq_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2udq_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtps2udq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x79,0xc8]		; CHECK-NEXT: vcvtps2udq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x79,0xc8]
; CHECK-NEXT: vcvtps2udq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x79,0xc0]		; CHECK-NEXT: vcvtps2udq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x79,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvtps2udq.128(<4 x float> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvtps2udq.128(<4 x float> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvtps2udq.128(<4 x float> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvtps2udq.128(<4 x float> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.cvtps2udq.256(<8 x float>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.cvtps2udq.256(<8 x float>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_cvt_ps2udq_256(<8 x float> %x0, <8 x i32> %x1, i8 %x2) {		define <8 x i32>@test_int_x86_avx512_mask_cvt_ps2udq_256(<8 x float> %x0, <8 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2udq_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_ps2udq_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtps2udq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x79,0xc8]		; CHECK-NEXT: vcvtps2udq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x79,0xc8]
; CHECK-NEXT: vcvtps2udq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x79,0xc0]		; CHECK-NEXT: vcvtps2udq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x79,0xc0]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.cvtps2udq.256(<8 x float> %x0, <8 x i32> %x1, i8 %x2)		%res = call <8 x i32> @llvm.x86.avx512.mask.cvtps2udq.256(<8 x float> %x0, <8 x i32> %x1, i8 %x2)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.cvtps2udq.256(<8 x float> %x0, <8 x i32> %x1, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.cvtps2udq.256(<8 x float> %x0, <8 x i32> %x1, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.128(<2 x double>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.128(<2 x double>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_cvtt_pd2dq_128(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvtt_pd2dq_128(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2dq_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2dq_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvttpd2dq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xe6,0xc8]		; CHECK-NEXT: vcvttpd2dq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xe6,0xc8]
; CHECK-NEXT: vcvttpd2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xe6,0xc0]		; CHECK-NEXT: vcvttpd2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe6,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

define <4 x i32>@test_int_x86_avx512_mask_cvtt_pd2dq_128_zext(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvtt_pd2dq_128_zext(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2dq_128_zext:		; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2dq_128_zext:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvttpd2dq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xe6,0xc8]		; CHECK-NEXT: vcvttpd2dq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xe6,0xc8]
; CHECK-NEXT: vmovq %xmm1, %xmm1 ## encoding: [0x62,0xf1,0xfe,0x08,0x7e,0xc9]		; CHECK-NEXT: vmovq %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7e,0xc9]
; CHECK-NEXT: ## xmm1 = xmm1[0],zero		; CHECK-NEXT: ## xmm1 = xmm1[0],zero
; CHECK-NEXT: vcvttpd2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xe6,0xc0]		; CHECK-NEXT: vcvttpd2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe6,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)
%res1 = shufflevector <4 x i32> %res, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>		%res1 = shufflevector <4 x i32> %res, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
%res2 = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)
%res3 = shufflevector <4 x i32> %res2, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>		%res3 = shufflevector <4 x i32> %res2, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
%res4 = add <4 x i32> %res1, %res3		%res4 = add <4 x i32> %res1, %res3
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.256(<4 x double>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.256(<4 x double>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_cvtt_pd2dq_256(<4 x double> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvtt_pd2dq_256(<4 x double> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2dq_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2dq_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvttpd2dq %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xe6,0xc8]		; CHECK-NEXT: vcvttpd2dq %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xe6,0xc8]
; CHECK-NEXT: vcvttpd2dq %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x28,0xe6,0xc0]		; CHECK-NEXT: vcvttpd2dq %ymm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xe6,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.256(<4 x double> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.256(<4 x double> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.256(<4 x double> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2dq.256(<4 x double> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.128(<2 x double>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.128(<2 x double>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_cvtt_pd2udq_128(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvtt_pd2udq_128(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2udq_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2udq_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvttpd2udq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x09,0x78,0xc8]		; CHECK-NEXT: vcvttpd2udq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x09,0x78,0xc8]
; CHECK-NEXT: vcvttpd2udq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x08,0x78,0xc0]		; CHECK-NEXT: vcvttpd2udq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x08,0x78,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

define <4 x i32>@test_int_x86_avx512_mask_cvtt_pd2udq_128_zext(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvtt_pd2udq_128_zext(<2 x double> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2udq_128_zext:		; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2udq_128_zext:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvttpd2udq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x09,0x78,0xc8]		; CHECK-NEXT: vcvttpd2udq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x09,0x78,0xc8]
; CHECK-NEXT: vmovq %xmm1, %xmm1 ## encoding: [0x62,0xf1,0xfe,0x08,0x7e,0xc9]		; CHECK-NEXT: vmovq %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x7e,0xc9]
; CHECK-NEXT: ## xmm1 = xmm1[0],zero		; CHECK-NEXT: ## xmm1 = xmm1[0],zero
; CHECK-NEXT: vcvttpd2udq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x08,0x78,0xc0]		; CHECK-NEXT: vcvttpd2udq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x08,0x78,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 %x2)
%res1 = shufflevector <4 x i32> %res, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>		%res1 = shufflevector <4 x i32> %res, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
%res2 = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.128(<2 x double> %x0, <4 x i32> %x1, i8 -1)
%res3 = shufflevector <4 x i32> %res2, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>		%res3 = shufflevector <4 x i32> %res2, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
%res4 = add <4 x i32> %res1, %res3		%res4 = add <4 x i32> %res1, %res3
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.256(<4 x double>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.256(<4 x double>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_cvtt_pd2udq_256(<4 x double> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvtt_pd2udq_256(<4 x double> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2udq_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_pd2udq_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvttpd2udq %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x29,0x78,0xc8]		; CHECK-NEXT: vcvttpd2udq %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xfc,0x29,0x78,0xc8]
; CHECK-NEXT: vcvttpd2udq %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x28,0x78,0xc0]		; CHECK-NEXT: vcvttpd2udq %ymm0, %xmm0 ## encoding: [0x62,0xf1,0xfc,0x28,0x78,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.256(<4 x double> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.256(<4 x double> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.256(<4 x double> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvttpd2udq.256(<4 x double> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.cvttps2dq.128(<4 x float>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.cvttps2dq.128(<4 x float>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_cvtt_ps2dq_128(<4 x float> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvtt_ps2dq_128(<4 x float> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2dq_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2dq_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvttps2dq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x5b,0xc8]		; CHECK-NEXT: vcvttps2dq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x5b,0xc8]
; CHECK-NEXT: vcvttps2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7e,0x08,0x5b,0xc0]		; CHECK-NEXT: vcvttps2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x5b,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvttps2dq.128(<4 x float> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvttps2dq.128(<4 x float> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvttps2dq.128(<4 x float> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvttps2dq.128(<4 x float> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.cvttps2dq.256(<8 x float>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.cvttps2dq.256(<8 x float>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_cvtt_ps2dq_256(<8 x float> %x0, <8 x i32> %x1, i8 %x2) {		define <8 x i32>@test_int_x86_avx512_mask_cvtt_ps2dq_256(<8 x float> %x0, <8 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2dq_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2dq_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvttps2dq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x5b,0xc8]		; CHECK-NEXT: vcvttps2dq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x29,0x5b,0xc8]
; CHECK-NEXT: vcvttps2dq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7e,0x28,0x5b,0xc0]		; CHECK-NEXT: vcvttps2dq %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfe,0x5b,0xc0]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.cvttps2dq.256(<8 x float> %x0, <8 x i32> %x1, i8 %x2)		%res = call <8 x i32> @llvm.x86.avx512.mask.cvttps2dq.256(<8 x float> %x0, <8 x i32> %x1, i8 %x2)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.cvttps2dq.256(<8 x float> %x0, <8 x i32> %x1, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.cvttps2dq.256(<8 x float> %x0, <8 x i32> %x1, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <4 x i32> @llvm.x86.avx512.mask.cvttps2udq.128(<4 x float>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.cvttps2udq.128(<4 x float>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_cvtt_ps2udq_128(<4 x float> %x0, <4 x i32> %x1, i8 %x2) {		define <4 x i32>@test_int_x86_avx512_mask_cvtt_ps2udq_128(<4 x float> %x0, <4 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2udq_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2udq_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvttps2udq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x78,0xc8]		; CHECK-NEXT: vcvttps2udq %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x09,0x78,0xc8]
; CHECK-NEXT: vcvttps2udq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x78,0xc0]		; CHECK-NEXT: vcvttps2udq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x78,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.cvttps2udq.128(<4 x float> %x0, <4 x i32> %x1, i8 %x2)		%res = call <4 x i32> @llvm.x86.avx512.mask.cvttps2udq.128(<4 x float> %x0, <4 x i32> %x1, i8 %x2)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvttps2udq.128(<4 x float> %x0, <4 x i32> %x1, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.cvttps2udq.128(<4 x float> %x0, <4 x i32> %x1, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.cvttps2udq.256(<8 x float>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.cvttps2udq.256(<8 x float>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_cvtt_ps2udq_256(<8 x float> %x0, <8 x i32> %x1, i8 %x2) {		define <8 x i32>@test_int_x86_avx512_mask_cvtt_ps2udq_256(<8 x float> %x0, <8 x i32> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2udq_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvtt_ps2udq_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvttps2udq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x78,0xc8]		; CHECK-NEXT: vcvttps2udq %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0x78,0xc8]
; CHECK-NEXT: vcvttps2udq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x78,0xc0]		; CHECK-NEXT: vcvttps2udq %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x78,0xc0]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.cvttps2udq.256(<8 x float> %x0, <8 x i32> %x1, i8 %x2)		%res = call <8 x i32> @llvm.x86.avx512.mask.cvttps2udq.256(<8 x float> %x0, <8 x i32> %x1, i8 %x2)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.cvttps2udq.256(<8 x float> %x0, <8 x i32> %x1, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.cvttps2udq.256(<8 x float> %x0, <8 x i32> %x1, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask.cvtudq2ps.128(<4 x i32>, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.cvtudq2ps.128(<4 x i32>, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_cvt_udq2ps_128(<4 x i32> %x0, <4 x float> %x1, i8 %x2) {		define <4 x float>@test_int_x86_avx512_mask_cvt_udq2ps_128(<4 x i32> %x0, <4 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_udq2ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_udq2ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtudq2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7f,0x09,0x7a,0xc8]		; CHECK-NEXT: vcvtudq2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7f,0x09,0x7a,0xc8]
; CHECK-NEXT: vcvtudq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7f,0x08,0x7a,0xc0]		; CHECK-NEXT: vcvtudq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7f,0x08,0x7a,0xc0]
; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.cvtudq2ps.128(<4 x i32> %x0, <4 x float> %x1, i8 %x2)		%res = call <4 x float> @llvm.x86.avx512.mask.cvtudq2ps.128(<4 x i32> %x0, <4 x float> %x1, i8 %x2)
%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtudq2ps.128(<4 x i32> %x0, <4 x float> %x1, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.cvtudq2ps.128(<4 x i32> %x0, <4 x float> %x1, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask.cvtudq2ps.256(<8 x i32>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.cvtudq2ps.256(<8 x i32>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_cvt_udq2ps_256(<8 x i32> %x0, <8 x float> %x1, i8 %x2) {		define <8 x float>@test_int_x86_avx512_mask_cvt_udq2ps_256(<8 x i32> %x0, <8 x float> %x1, i8 %x2) {
; CHECK-LABEL: test_int_x86_avx512_mask_cvt_udq2ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_cvt_udq2ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtudq2ps %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7f,0x29,0x7a,0xc8]		; CHECK-NEXT: vcvtudq2ps %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x7f,0x29,0x7a,0xc8]
; CHECK-NEXT: vcvtudq2ps %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7f,0x28,0x7a,0xc0]		; CHECK-NEXT: vcvtudq2ps %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7f,0x28,0x7a,0xc0]
; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.cvtudq2ps.256(<8 x i32> %x0, <8 x float> %x1, i8 %x2)		%res = call <8 x float> @llvm.x86.avx512.mask.cvtudq2ps.256(<8 x i32> %x0, <8 x float> %x1, i8 %x2)
%res1 = call <8 x float> @llvm.x86.avx512.mask.cvtudq2ps.256(<8 x i32> %x0, <8 x float> %x1, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.cvtudq2ps.256(<8 x i32> %x0, <8 x float> %x1, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <2 x double> @llvm.x86.avx512.mask.rndscale.pd.128(<2 x double>, i32, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.rndscale.pd.128(<2 x double>, i32, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_rndscale_pd_128(<2 x double> %x0, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask_rndscale_pd_128(<2 x double> %x0, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_rndscale_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_rndscale_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vrndscalepd $4, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x09,0x09,0xc8,0x04]		; CHECK-NEXT: vrndscalepd $4, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x09,0x09,0xc8,0x04]
; CHECK-NEXT: vrndscalepd $88, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0xfd,0x08,0x09,0xc0,0x58]		; CHECK-NEXT: vrndscalepd $88, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0xfd,0x08,0x09,0xc0,0x58]
; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]		; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.rndscale.pd.128(<2 x double> %x0, i32 4, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask.rndscale.pd.128(<2 x double> %x0, i32 4, <2 x double> %x2, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask.rndscale.pd.128(<2 x double> %x0, i32 88, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.rndscale.pd.128(<2 x double> %x0, i32 88, <2 x double> %x2, i8 -1)
%res2 = fadd <2 x double> %res, %res1		%res2 = fadd <2 x double> %res, %res1
ret <2 x double> %res2		ret <2 x double> %res2
}		}

declare <4 x double> @llvm.x86.avx512.mask.rndscale.pd.256(<4 x double>, i32, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.rndscale.pd.256(<4 x double>, i32, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_rndscale_pd_256(<4 x double> %x0, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_rndscale_pd_256(<4 x double> %x0, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_rndscale_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_rndscale_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vrndscalepd $4, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x09,0xc8,0x04]		; CHECK-NEXT: vrndscalepd $4, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x09,0xc8,0x04]
; CHECK-NEXT: vrndscalepd $88, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x09,0xc0,0x58]		; CHECK-NEXT: vrndscalepd $88, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x09,0xc0,0x58]
; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.rndscale.pd.256(<4 x double> %x0, i32 4, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.rndscale.pd.256(<4 x double> %x0, i32 4, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.rndscale.pd.256(<4 x double> %x0, i32 88, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.rndscale.pd.256(<4 x double> %x0, i32 88, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask.rndscale.ps.128(<4 x float>, i32, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.rndscale.ps.128(<4 x float>, i32, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_rndscale_ps_128(<4 x float> %x0, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_rndscale_ps_128(<4 x float> %x0, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_rndscale_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_rndscale_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vrndscaleps $88, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x08,0xc8,0x58]		; CHECK-NEXT: vrndscaleps $88, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x08,0xc8,0x58]
; CHECK-NEXT: vrndscaleps $4, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x08,0xc0,0x04]		; CHECK-NEXT: vrndscaleps $4, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x08,0xc0,0x04]
; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.rndscale.ps.128(<4 x float> %x0, i32 88, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.rndscale.ps.128(<4 x float> %x0, i32 88, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.rndscale.ps.128(<4 x float> %x0, i32 4, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.rndscale.ps.128(<4 x float> %x0, i32 4, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask.rndscale.ps.256(<8 x float>, i32, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.rndscale.ps.256(<8 x float>, i32, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_rndscale_ps_256(<8 x float> %x0, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_rndscale_ps_256(<8 x float> %x0, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_rndscale_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_rndscale_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vrndscaleps $5, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x08,0xc8,0x05]		; CHECK-NEXT: vrndscaleps $5, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x08,0xc8,0x05]
; CHECK-NEXT: vrndscaleps $66, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x08,0xc0,0x42]		; CHECK-NEXT: vrndscaleps $66, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x08,0xc0,0x42]
; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.rndscale.ps.256(<8 x float> %x0, i32 5, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.rndscale.ps.256(<8 x float> %x0, i32 5, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.rndscale.ps.256(<8 x float> %x0, i32 66, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.rndscale.ps.256(<8 x float> %x0, i32 66, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask.shuf.f32x4.256(<8 x float>, <8 x float>, i32, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.shuf.f32x4.256(<8 x float>, <8 x float>, i32, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_shuf_f32x4_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x3, i8 %x4) {		define <8 x float>@test_int_x86_avx512_mask_shuf_f32x4_256(<8 x float> %x0, <8 x float> %x1, <8 x float> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_shuf_f32x4_256:		; CHECK-LABEL: test_int_x86_avx512_mask_shuf_f32x4_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vshuff32x4 $22, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x23,0xd1,0x16]		; CHECK-NEXT: vshuff32x4 $22, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x23,0xd1,0x16]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0,1,2,3],ymm1[4,5,6,7]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0,1,2,3],ymm1[4,5,6,7]
; CHECK-NEXT: vshuff32x4 $22, %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x23,0xd9,0x16]		; CHECK-NEXT: vshuff32x4 $22, %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x23,0xd9,0x16]
; CHECK-NEXT: ## ymm3 {%k1} {z} = ymm0[0,1,2,3],ymm1[4,5,6,7]		; CHECK-NEXT: ## ymm3 {%k1} {z} = ymm0[0,1,2,3],ymm1[4,5,6,7]
; CHECK-NEXT: vshuff32x4 $22, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x23,0xc1,0x16]		; CHECK-NEXT: vshuff32x4 $22, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x23,0xc1,0x16]
; CHECK-NEXT: ## ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]		; CHECK-NEXT: ## ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]
; CHECK-NEXT: vaddps %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6c,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xec,0x58,0xc0]
; CHECK-NEXT: vaddps %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.shuf.f32x4.256(<8 x float> %x0, <8 x float> %x1, i32 22, <8 x float> %x3, i8 %x4)		%res = call <8 x float> @llvm.x86.avx512.mask.shuf.f32x4.256(<8 x float> %x0, <8 x float> %x1, i32 22, <8 x float> %x3, i8 %x4)
%res1 = call <8 x float> @llvm.x86.avx512.mask.shuf.f32x4.256(<8 x float> %x0, <8 x float> %x1, i32 22, <8 x float> %x3, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.shuf.f32x4.256(<8 x float> %x0, <8 x float> %x1, i32 22, <8 x float> %x3, i8 -1)
%res2 = call <8 x float> @llvm.x86.avx512.mask.shuf.f32x4.256(<8 x float> %x0, <8 x float> %x1, i32 22, <8 x float> zeroinitializer, i8 %x4)		%res2 = call <8 x float> @llvm.x86.avx512.mask.shuf.f32x4.256(<8 x float> %x0, <8 x float> %x1, i32 22, <8 x float> zeroinitializer, i8 %x4)
%res3 = fadd <8 x float> %res, %res1		%res3 = fadd <8 x float> %res, %res1
%res4 = fadd <8 x float> %res2, %res3		%res4 = fadd <8 x float> %res2, %res3
ret <8 x float> %res4		ret <8 x float> %res4
}		}

declare <4 x double> @llvm.x86.avx512.mask.shuf.f64x2.256(<4 x double>, <4 x double>, i32, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.shuf.f64x2.256(<4 x double>, <4 x double>, i32, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_shuf_f64x2_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x3, i8 %x4) {		define <4 x double>@test_int_x86_avx512_mask_shuf_f64x2_256(<4 x double> %x0, <4 x double> %x1, <4 x double> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_shuf_f64x2_256:		; CHECK-LABEL: test_int_x86_avx512_mask_shuf_f64x2_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vshuff64x2 $22, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x23,0xd1,0x16]		; CHECK-NEXT: vshuff64x2 $22, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x23,0xd1,0x16]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0,1],ymm1[2,3]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0,1],ymm1[2,3]
; CHECK-NEXT: vshuff64x2 $22, %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x23,0xd9,0x16]		; CHECK-NEXT: vshuff64x2 $22, %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0xa9,0x23,0xd9,0x16]
; CHECK-NEXT: ## ymm3 {%k1} {z} = ymm0[0,1],ymm1[2,3]		; CHECK-NEXT: ## ymm3 {%k1} {z} = ymm0[0,1],ymm1[2,3]
; CHECK-NEXT: vshuff64x2 $22, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x23,0xc1,0x16]		; CHECK-NEXT: vshuff64x2 $22, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x23,0xc1,0x16]
; CHECK-NEXT: ## ymm0 = ymm0[0,1],ymm1[2,3]		; CHECK-NEXT: ## ymm0 = ymm0[0,1],ymm1[2,3]
; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x58,0xc0]
; CHECK-NEXT: vaddpd %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.shuf.f64x2.256(<4 x double> %x0, <4 x double> %x1, i32 22, <4 x double> %x3, i8 %x4)		%res = call <4 x double> @llvm.x86.avx512.mask.shuf.f64x2.256(<4 x double> %x0, <4 x double> %x1, i32 22, <4 x double> %x3, i8 %x4)
%res1 = call <4 x double> @llvm.x86.avx512.mask.shuf.f64x2.256(<4 x double> %x0, <4 x double> %x1, i32 22, <4 x double> %x3, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.shuf.f64x2.256(<4 x double> %x0, <4 x double> %x1, i32 22, <4 x double> %x3, i8 -1)
%res2 = call <4 x double> @llvm.x86.avx512.mask.shuf.f64x2.256(<4 x double> %x0, <4 x double> %x1, i32 22, <4 x double> zeroinitializer, i8 %x4)		%res2 = call <4 x double> @llvm.x86.avx512.mask.shuf.f64x2.256(<4 x double> %x0, <4 x double> %x1, i32 22, <4 x double> zeroinitializer, i8 %x4)
%res3 = fadd <4 x double> %res, %res1		%res3 = fadd <4 x double> %res, %res1
%res4 = fadd <4 x double> %res2, %res3		%res4 = fadd <4 x double> %res2, %res3
ret <4 x double> %res4		ret <4 x double> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.shuf.i32x4.256(<8 x i32>, <8 x i32>, i32, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.shuf.i32x4.256(<8 x i32>, <8 x i32>, i32, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_shuf_i32x4_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x3, i8 %x4) {		define <8 x i32>@test_int_x86_avx512_mask_shuf_i32x4_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_shuf_i32x4_256:		; CHECK-LABEL: test_int_x86_avx512_mask_shuf_i32x4_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vshufi32x4 $22, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x43,0xd1,0x16]		; CHECK-NEXT: vshufi32x4 $22, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x43,0xd1,0x16]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0,1,2,3],ymm1[4,5,6,7]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0,1,2,3],ymm1[4,5,6,7]
; CHECK-NEXT: vshufi32x4 $22, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x43,0xc1,0x16]		; CHECK-NEXT: vshufi32x4 $22, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x43,0xc1,0x16]
; CHECK-NEXT: ## ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]		; CHECK-NEXT: ## ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]
; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.shuf.i32x4.256(<8 x i32> %x0, <8 x i32> %x1, i32 22, <8 x i32> %x3, i8 %x4)		%res = call <8 x i32> @llvm.x86.avx512.mask.shuf.i32x4.256(<8 x i32> %x0, <8 x i32> %x1, i32 22, <8 x i32> %x3, i8 %x4)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.shuf.i32x4.256(<8 x i32> %x0, <8 x i32> %x1, i32 22, <8 x i32> %x3, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.shuf.i32x4.256(<8 x i32> %x0, <8 x i32> %x1, i32 22, <8 x i32> %x3, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <4 x i64> @llvm.x86.avx512.mask.shuf.i64x2.256(<4 x i64>, <4 x i64>, i32, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.shuf.i64x2.256(<4 x i64>, <4 x i64>, i32, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_shuf_i64x2_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x3, i8 %x4) {		define <4 x i64>@test_int_x86_avx512_mask_shuf_i64x2_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_shuf_i64x2_256:		; CHECK-LABEL: test_int_x86_avx512_mask_shuf_i64x2_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vshufi64x2 $22, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x43,0xd1,0x16]		; CHECK-NEXT: vshufi64x2 $22, %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x43,0xd1,0x16]
; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0,1],ymm1[2,3]		; CHECK-NEXT: ## ymm2 {%k1} = ymm0[0,1],ymm1[2,3]
; CHECK-NEXT: vshufi64x2 $22, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x43,0xc1,0x16]		; CHECK-NEXT: vshufi64x2 $22, %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x43,0xc1,0x16]
; CHECK-NEXT: ## ymm0 = ymm0[0,1],ymm1[2,3]		; CHECK-NEXT: ## ymm0 = ymm0[0,1],ymm1[2,3]
; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.shuf.i64x2.256(<4 x i64> %x0, <4 x i64> %x1, i32 22, <4 x i64> %x3, i8 %x4)		%res = call <4 x i64> @llvm.x86.avx512.mask.shuf.i64x2.256(<4 x i64> %x0, <4 x i64> %x1, i32 22, <4 x i64> %x3, i8 %x4)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.shuf.i64x2.256(<4 x i64> %x0, <4 x i64> %x1, i32 22, <4 x i64> %x3, i8 -1)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.shuf.i64x2.256(<4 x i64> %x0, <4 x i64> %x1, i32 22, <4 x i64> %x3, i8 -1)
%res2 = add <4 x i64> %res, %res1		%res2 = add <4 x i64> %res, %res1
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask.vextractf32x4.256(<8 x float>, i32, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.vextractf32x4.256(<8 x float>, i32, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_vextractf32x4_256(<8 x float> %x0, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_vextractf32x4_256(<8 x float> %x0, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_vextractf32x4_256:		; CHECK-LABEL: test_int_x86_avx512_mask_vextractf32x4_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vextractf32x4 $1, %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x19,0xc1,0x01]		; CHECK-NEXT: vextractf32x4 $1, %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x19,0xc1,0x01]
; CHECK-NEXT: vextractf32x4 $1, %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x19,0xc2,0x01]		; CHECK-NEXT: vextractf32x4 $1, %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x19,0xc2,0x01]
; CHECK-NEXT: vextractf32x4 $1, %ymm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x19,0xc0,0x01]		; CHECK-NEXT: vextractf32x4 $1, %ymm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x19,0xc0,0x01]
; CHECK-NEXT: vaddps %xmm2, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xca]		; CHECK-NEXT: vaddps %xmm2, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xca]
; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x58,0xc1]		; CHECK-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x58,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vextractf32x4.256(<8 x float> %x0, i32 1, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.vextractf32x4.256(<8 x float> %x0, i32 1, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.vextractf32x4.256(<8 x float> %x0, i32 1, <4 x float> zeroinitializer, i8 %x3)		%res1 = call <4 x float> @llvm.x86.avx512.mask.vextractf32x4.256(<8 x float> %x0, i32 1, <4 x float> zeroinitializer, i8 %x3)
%res2 = call <4 x float> @llvm.x86.avx512.mask.vextractf32x4.256(<8 x float> %x0, i32 1, <4 x float> zeroinitializer, i8 -1)		%res2 = call <4 x float> @llvm.x86.avx512.mask.vextractf32x4.256(<8 x float> %x0, i32 1, <4 x float> zeroinitializer, i8 -1)
%res3 = fadd <4 x float> %res, %res1		%res3 = fadd <4 x float> %res, %res1
%res4 = fadd <4 x float> %res2, %res3		%res4 = fadd <4 x float> %res2, %res3
ret <4 x float> %res4		ret <4 x float> %res4
}		}

declare <2 x double> @llvm.x86.avx512.mask.getmant.pd.128(<2 x double>, i32, <2 x double>, i8)		declare <2 x double> @llvm.x86.avx512.mask.getmant.pd.128(<2 x double>, i32, <2 x double>, i8)

define <2 x double>@test_int_x86_avx512_mask_getmant_pd_128(<2 x double> %x0, <2 x double> %x2, i8 %x3) {		define <2 x double>@test_int_x86_avx512_mask_getmant_pd_128(<2 x double> %x0, <2 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_getmant_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_getmant_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vgetmantpd $11, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x09,0x26,0xc8,0x0b]		; CHECK-NEXT: vgetmantpd $11, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x09,0x26,0xc8,0x0b]
; CHECK-NEXT: vgetmantpd $11, %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0x89,0x26,0xd0,0x0b]		; CHECK-NEXT: vgetmantpd $11, %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf3,0xfd,0x89,0x26,0xd0,0x0b]
; CHECK-NEXT: vgetmantpd $11, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0xfd,0x08,0x26,0xc0,0x0b]		; CHECK-NEXT: vgetmantpd $11, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0xfd,0x08,0x26,0xc0,0x0b]
; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]		; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc0]
; CHECK-NEXT: vaddpd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0x58,0xc0]		; CHECK-NEXT: vaddpd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.getmant.pd.128(<2 x double> %x0, i32 11, <2 x double> %x2, i8 %x3)		%res = call <2 x double> @llvm.x86.avx512.mask.getmant.pd.128(<2 x double> %x0, i32 11, <2 x double> %x2, i8 %x3)
%res2 = call <2 x double> @llvm.x86.avx512.mask.getmant.pd.128(<2 x double> %x0, i32 11, <2 x double> zeroinitializer, i8 %x3)		%res2 = call <2 x double> @llvm.x86.avx512.mask.getmant.pd.128(<2 x double> %x0, i32 11, <2 x double> zeroinitializer, i8 %x3)
%res1 = call <2 x double> @llvm.x86.avx512.mask.getmant.pd.128(<2 x double> %x0, i32 11, <2 x double> %x2, i8 -1)		%res1 = call <2 x double> @llvm.x86.avx512.mask.getmant.pd.128(<2 x double> %x0, i32 11, <2 x double> %x2, i8 -1)
%res3 = fadd <2 x double> %res, %res1		%res3 = fadd <2 x double> %res, %res1
%res4 = fadd <2 x double> %res2, %res3		%res4 = fadd <2 x double> %res2, %res3
ret <2 x double> %res4		ret <2 x double> %res4
}		}

declare <4 x double> @llvm.x86.avx512.mask.getmant.pd.256(<4 x double>, i32, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.getmant.pd.256(<4 x double>, i32, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_getmant_pd_256(<4 x double> %x0, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_getmant_pd_256(<4 x double> %x0, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_getmant_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_getmant_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vgetmantpd $11, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x26,0xc8,0x0b]		; CHECK-NEXT: vgetmantpd $11, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0xfd,0x29,0x26,0xc8,0x0b]
; CHECK-NEXT: vgetmantpd $11, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x26,0xc0,0x0b]		; CHECK-NEXT: vgetmantpd $11, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0xfd,0x28,0x26,0xc0,0x0b]
; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.getmant.pd.256(<4 x double> %x0, i32 11, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.getmant.pd.256(<4 x double> %x0, i32 11, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.getmant.pd.256(<4 x double> %x0, i32 11, <4 x double> %x2, i8 -1)		%res1 = call <4 x double> @llvm.x86.avx512.mask.getmant.pd.256(<4 x double> %x0, i32 11, <4 x double> %x2, i8 -1)
%res2 = fadd <4 x double> %res, %res1		%res2 = fadd <4 x double> %res, %res1
ret <4 x double> %res2		ret <4 x double> %res2
}		}

declare <4 x float> @llvm.x86.avx512.mask.getmant.ps.128(<4 x float>, i32, <4 x float>, i8)		declare <4 x float> @llvm.x86.avx512.mask.getmant.ps.128(<4 x float>, i32, <4 x float>, i8)

define <4 x float>@test_int_x86_avx512_mask_getmant_ps_128(<4 x float> %x0, <4 x float> %x2, i8 %x3) {		define <4 x float>@test_int_x86_avx512_mask_getmant_ps_128(<4 x float> %x0, <4 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_getmant_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_getmant_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vgetmantps $11, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x26,0xc8,0x0b]		; CHECK-NEXT: vgetmantps $11, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x26,0xc8,0x0b]
; CHECK-NEXT: vgetmantps $11, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x26,0xc0,0x0b]		; CHECK-NEXT: vgetmantps $11, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x26,0xc0,0x0b]
; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x74,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf0,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.getmant.ps.128(<4 x float> %x0, i32 11, <4 x float> %x2, i8 %x3)		%res = call <4 x float> @llvm.x86.avx512.mask.getmant.ps.128(<4 x float> %x0, i32 11, <4 x float> %x2, i8 %x3)
%res1 = call <4 x float> @llvm.x86.avx512.mask.getmant.ps.128(<4 x float> %x0, i32 11, <4 x float> %x2, i8 -1)		%res1 = call <4 x float> @llvm.x86.avx512.mask.getmant.ps.128(<4 x float> %x0, i32 11, <4 x float> %x2, i8 -1)
%res2 = fadd <4 x float> %res, %res1		%res2 = fadd <4 x float> %res, %res1
ret <4 x float> %res2		ret <4 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask.getmant.ps.256(<8 x float>, i32, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.getmant.ps.256(<8 x float>, i32, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_getmant_ps_256(<8 x float> %x0, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_getmant_ps_256(<8 x float> %x0, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_getmant_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_getmant_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vgetmantps $11, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x26,0xc8,0x0b]		; CHECK-NEXT: vgetmantps $11, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x26,0xc8,0x0b]
; CHECK-NEXT: vgetmantps $11, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x26,0xc0,0x0b]		; CHECK-NEXT: vgetmantps $11, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x26,0xc0,0x0b]
; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.getmant.ps.256(<8 x float> %x0, i32 11, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.getmant.ps.256(<8 x float> %x0, i32 11, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.getmant.ps.256(<8 x float> %x0, i32 11, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.getmant.ps.256(<8 x float> %x0, i32 11, <8 x float> %x2, i8 -1)
%res2 = fadd <8 x float> %res, %res1		%res2 = fadd <8 x float> %res, %res1
ret <8 x float> %res2		ret <8 x float> %res2
}		}

declare <8 x float> @llvm.x86.avx512.mask.insertf32x4.256(<8 x float>, <4 x float>, i32, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.insertf32x4.256(<8 x float>, <4 x float>, i32, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_insertf32x4_256(<8 x float> %x0, <4 x float> %x1, <8 x float> %x3, i8 %x4) {		define <8 x float>@test_int_x86_avx512_mask_insertf32x4_256(<8 x float> %x0, <4 x float> %x1, <8 x float> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_insertf32x4_256:		; CHECK-LABEL: test_int_x86_avx512_mask_insertf32x4_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vinsertf32x4 $1, %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x18,0xd1,0x01]		; CHECK-NEXT: vinsertf32x4 $1, %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x18,0xd1,0x01]
; CHECK-NEXT: vinsertf32x4 $1, %xmm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x18,0xd9,0x01]		; CHECK-NEXT: vinsertf32x4 $1, %xmm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x18,0xd9,0x01]
; CHECK-NEXT: vinsertf32x4 $1, %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x18,0xc1,0x01]		; CHECK-NEXT: vinsertf32x4 $1, %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x18,0xc1,0x01]
; CHECK-NEXT: vaddps %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6c,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xec,0x58,0xc0]
; CHECK-NEXT: vaddps %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.insertf32x4.256(<8 x float> %x0, <4 x float> %x1, i32 1, <8 x float> %x3, i8 %x4)		%res = call <8 x float> @llvm.x86.avx512.mask.insertf32x4.256(<8 x float> %x0, <4 x float> %x1, i32 1, <8 x float> %x3, i8 %x4)
%res1 = call <8 x float> @llvm.x86.avx512.mask.insertf32x4.256(<8 x float> %x0, <4 x float> %x1, i32 1, <8 x float> %x3, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.insertf32x4.256(<8 x float> %x0, <4 x float> %x1, i32 1, <8 x float> %x3, i8 -1)
%res2 = call <8 x float> @llvm.x86.avx512.mask.insertf32x4.256(<8 x float> %x0, <4 x float> %x1, i32 1, <8 x float> zeroinitializer, i8 %x4)		%res2 = call <8 x float> @llvm.x86.avx512.mask.insertf32x4.256(<8 x float> %x0, <4 x float> %x1, i32 1, <8 x float> zeroinitializer, i8 %x4)
%res3 = fadd <8 x float> %res, %res1		%res3 = fadd <8 x float> %res, %res1
%res4 = fadd <8 x float> %res2, %res3		%res4 = fadd <8 x float> %res2, %res3
ret <8 x float> %res4		ret <8 x float> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.inserti32x4.256(<8 x i32>, <4 x i32>, i32, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.inserti32x4.256(<8 x i32>, <4 x i32>, i32, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_inserti32x4_256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x3, i8 %x4) {		define <8 x i32>@test_int_x86_avx512_mask_inserti32x4_256(<8 x i32> %x0, <4 x i32> %x1, <8 x i32> %x3, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_inserti32x4_256:		; CHECK-LABEL: test_int_x86_avx512_mask_inserti32x4_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vinserti32x4 $1, %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x38,0xd1,0x01]		; CHECK-NEXT: vinserti32x4 $1, %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x38,0xd1,0x01]
; CHECK-NEXT: vinserti32x4 $1, %xmm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x38,0xd9,0x01]		; CHECK-NEXT: vinserti32x4 $1, %xmm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x38,0xd9,0x01]
; CHECK-NEXT: vinserti32x4 $1, %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x38,0xc1,0x01]		; CHECK-NEXT: vinserti32x4 $1, %xmm1, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x38,0xc1,0x01]
; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x65,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]

%res = call <8 x i32> @llvm.x86.avx512.mask.inserti32x4.256(<8 x i32> %x0, <4 x i32> %x1, i32 1, <8 x i32> %x3, i8 %x4)		%res = call <8 x i32> @llvm.x86.avx512.mask.inserti32x4.256(<8 x i32> %x0, <4 x i32> %x1, i32 1, <8 x i32> %x3, i8 %x4)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.inserti32x4.256(<8 x i32> %x0, <4 x i32> %x1, i32 1, <8 x i32> %x3, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.inserti32x4.256(<8 x i32> %x0, <4 x i32> %x1, i32 1, <8 x i32> %x3, i8 -1)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.inserti32x4.256(<8 x i32> %x0, <4 x i32> %x1, i32 1, <8 x i32> zeroinitializer, i8 %x4)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.inserti32x4.256(<8 x i32> %x0, <4 x i32> %x1, i32 1, <8 x i32> zeroinitializer, i8 %x4)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res2, %res3		%res4 = add <8 x i32> %res2, %res3
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pternlog.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i32, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pternlog.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i32, i8)

define <4 x i32>@test_int_x86_avx512_mask_pternlog_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x4) {		define <4 x i32>@test_int_x86_avx512_mask_pternlog_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_pternlog_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pternlog_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xd8]		; CHECK-NEXT: vmovdqa %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xd8]
; CHECK-NEXT: vpternlogd $33, %xmm2, %xmm1, %xmm3 {%k1} ## encoding: [0x62,0xf3,0x75,0x09,0x25,0xda,0x21]		; CHECK-NEXT: vpternlogd $33, %xmm2, %xmm1, %xmm3 {%k1} ## encoding: [0x62,0xf3,0x75,0x09,0x25,0xda,0x21]
; CHECK-NEXT: vpternlogd $33, %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf3,0x75,0x08,0x25,0xc2,0x21]		; CHECK-NEXT: vpternlogd $33, %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf3,0x75,0x08,0x25,0xc2,0x21]
; CHECK-NEXT: vpaddd %xmm0, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x65,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pternlog.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i32 33, i8 %x4)		%res = call <4 x i32> @llvm.x86.avx512.mask.pternlog.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i32 33, i8 %x4)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pternlog.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i32 33, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pternlog.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i32 33, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}
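
Reviewer note: in the test above, the masked `vpternlogd` keeps its EVEX encoding while `vmovdqa64` and `vpaddd` are compressed. The prefix-level part of that decision can be sketched in Python as below. The helper name `evex_only_features` is hypothetical, and the check is necessary but not sufficient: the unmasked `vpternlogd` also passes it, yet stays EVEX because there is no VEX form of that instruction, which is why the patch relies on an opcode table.

```python
def evex_only_features(evex: bytes) -> list:
    """List the EVEX-only features visible in an instruction's prefix bytes.

    Hypothetical helper for illustration. An empty list means the prefix
    alone does not rule out a VEX encoding; it does not prove that a VEX
    form of the opcode exists.
    """
    if len(evex) < 4 or evex[0] != 0x62:
        raise ValueError("not an EVEX-encoded instruction")
    p0, p2 = evex[1], evex[3]

    reasons = []
    if p2 & 0x07:                      # aaa field selects k1..k7
        reasons.append("mask register k%d" % (p2 & 0x07))
    if p2 & 0x80:                      # z bit
        reasons.append("zeroing {z}")
    if p2 & 0x10:                      # b bit
        reasons.append("broadcast or rounding control")
    if ((p2 >> 5) & 0x03) > 1:         # L'L field
        reasons.append("512-bit vector length")
    # R' and V' are stored inverted: a cleared bit selects registers 16-31.
    # (The X bit can also extend a register-form r/m operand; not checked here.)
    if not (p0 & 0x10) or not (p2 & 0x08):
        reasons.append("register index 16-31")
    return reasons
```

For the `{%k1}` form (`62 f3 75 09 ...`) this reports the mask register, and for the `{%k1} {z}` form (`62 f3 75 89 ...`) it reports the mask register and zeroing. For `vmovdqa64 %xmm0, %xmm3` (`62 f1 fd 08 ...`) the list is empty, consistent with the compressed `vmovdqa` in the right-hand column.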

declare <4 x i32> @llvm.x86.avx512.maskz.pternlog.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i32, i8)		declare <4 x i32> @llvm.x86.avx512.maskz.pternlog.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i32, i8)

define <4 x i32>@test_int_x86_avx512_maskz_pternlog_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x4) {		define <4 x i32>@test_int_x86_avx512_maskz_pternlog_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_maskz_pternlog_d_128:		; CHECK-LABEL: test_int_x86_avx512_maskz_pternlog_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xd8]		; CHECK-NEXT: vmovdqa %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xd8]
; CHECK-NEXT: vpternlogd $33, %xmm2, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf3,0x75,0x89,0x25,0xda,0x21]		; CHECK-NEXT: vpternlogd $33, %xmm2, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf3,0x75,0x89,0x25,0xda,0x21]
; CHECK-NEXT: vpternlogd $33, %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf3,0x75,0x08,0x25,0xc2,0x21]		; CHECK-NEXT: vpternlogd $33, %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf3,0x75,0x08,0x25,0xc2,0x21]
; CHECK-NEXT: vpaddd %xmm0, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x65,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.maskz.pternlog.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i32 33, i8 %x4)		%res = call <4 x i32> @llvm.x86.avx512.maskz.pternlog.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i32 33, i8 %x4)
%res1 = call <4 x i32> @llvm.x86.avx512.maskz.pternlog.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i32 33, i8 -1)		%res1 = call <4 x i32> @llvm.x86.avx512.maskz.pternlog.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i32 33, i8 -1)
%res2 = add <4 x i32> %res, %res1		%res2 = add <4 x i32> %res, %res1
ret <4 x i32> %res2		ret <4 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pternlog.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i32, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pternlog.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i32, i8)

define <8 x i32>@test_int_x86_avx512_mask_pternlog_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x4) {		define <8 x i32>@test_int_x86_avx512_mask_pternlog_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_pternlog_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pternlog_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xd8]		; CHECK-NEXT: vmovdqa %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xd8]
; CHECK-NEXT: vpternlogd $33, %ymm2, %ymm1, %ymm3 {%k1} ## encoding: [0x62,0xf3,0x75,0x29,0x25,0xda,0x21]		; CHECK-NEXT: vpternlogd $33, %ymm2, %ymm1, %ymm3 {%k1} ## encoding: [0x62,0xf3,0x75,0x29,0x25,0xda,0x21]
; CHECK-NEXT: vpternlogd $33, %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0x75,0x28,0x25,0xc2,0x21]		; CHECK-NEXT: vpternlogd $33, %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0x75,0x28,0x25,0xc2,0x21]
; CHECK-NEXT: vpaddd %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x65,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pternlog.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i32 33, i8 %x4)		%res = call <8 x i32> @llvm.x86.avx512.mask.pternlog.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i32 33, i8 %x4)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pternlog.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i32 33, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pternlog.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i32 33, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <8 x i32> @llvm.x86.avx512.maskz.pternlog.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i32, i8)		declare <8 x i32> @llvm.x86.avx512.maskz.pternlog.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i32, i8)

define <8 x i32>@test_int_x86_avx512_maskz_pternlog_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x4) {		define <8 x i32>@test_int_x86_avx512_maskz_pternlog_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_maskz_pternlog_d_256:		; CHECK-LABEL: test_int_x86_avx512_maskz_pternlog_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xd8]		; CHECK-NEXT: vmovdqa %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xd8]
; CHECK-NEXT: vpternlogd $33, %ymm2, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0x75,0xa9,0x25,0xda,0x21]		; CHECK-NEXT: vpternlogd $33, %ymm2, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0x75,0xa9,0x25,0xda,0x21]
; CHECK-NEXT: vpternlogd $33, %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0x75,0x28,0x25,0xc2,0x21]		; CHECK-NEXT: vpternlogd $33, %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0x75,0x28,0x25,0xc2,0x21]
; CHECK-NEXT: vpaddd %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x65,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.maskz.pternlog.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i32 33, i8 %x4)		%res = call <8 x i32> @llvm.x86.avx512.maskz.pternlog.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i32 33, i8 %x4)
%res1 = call <8 x i32> @llvm.x86.avx512.maskz.pternlog.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i32 33, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.maskz.pternlog.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i32 33, i8 -1)
%res2 = add <8 x i32> %res, %res1		%res2 = add <8 x i32> %res, %res1
ret <8 x i32> %res2		ret <8 x i32> %res2
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pternlog.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i32, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pternlog.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i32, i8)

define <2 x i64>@test_int_x86_avx512_mask_pternlog_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x4) {		define <2 x i64>@test_int_x86_avx512_mask_pternlog_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_pternlog_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pternlog_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xd8]		; CHECK-NEXT: vmovdqa %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xd8]
; CHECK-NEXT: vpternlogq $33, %xmm2, %xmm1, %xmm3 {%k1} ## encoding: [0x62,0xf3,0xf5,0x09,0x25,0xda,0x21]		; CHECK-NEXT: vpternlogq $33, %xmm2, %xmm1, %xmm3 {%k1} ## encoding: [0x62,0xf3,0xf5,0x09,0x25,0xda,0x21]
; CHECK-NEXT: vpternlogq $33, %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf3,0xf5,0x08,0x25,0xc2,0x21]		; CHECK-NEXT: vpternlogq $33, %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf3,0xf5,0x08,0x25,0xc2,0x21]
; CHECK-NEXT: vpaddq %xmm0, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pternlog.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i32 33, i8 %x4)		%res = call <2 x i64> @llvm.x86.avx512.mask.pternlog.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i32 33, i8 %x4)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pternlog.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i32 33, i8 -1)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pternlog.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i32 33, i8 -1)
%res2 = add <2 x i64> %res, %res1		%res2 = add <2 x i64> %res, %res1
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}

declare <2 x i64> @llvm.x86.avx512.maskz.pternlog.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i32, i8)		declare <2 x i64> @llvm.x86.avx512.maskz.pternlog.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i32, i8)

define <2 x i64>@test_int_x86_avx512_maskz_pternlog_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x4) {		define <2 x i64>@test_int_x86_avx512_maskz_pternlog_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_maskz_pternlog_q_128:		; CHECK-LABEL: test_int_x86_avx512_maskz_pternlog_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xd8]		; CHECK-NEXT: vmovdqa %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xd8]
; CHECK-NEXT: vpternlogq $33, %xmm2, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0x89,0x25,0xda,0x21]		; CHECK-NEXT: vpternlogq $33, %xmm2, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0x89,0x25,0xda,0x21]
; CHECK-NEXT: vpternlogq $33, %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf3,0xf5,0x08,0x25,0xc2,0x21]		; CHECK-NEXT: vpternlogq $33, %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf3,0xf5,0x08,0x25,0xc2,0x21]
; CHECK-NEXT: vpaddq %xmm0, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.maskz.pternlog.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i32 33, i8 %x4)		%res = call <2 x i64> @llvm.x86.avx512.maskz.pternlog.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i32 33, i8 %x4)
%res1 = call <2 x i64> @llvm.x86.avx512.maskz.pternlog.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i32 33, i8 -1)		%res1 = call <2 x i64> @llvm.x86.avx512.maskz.pternlog.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i32 33, i8 -1)
%res2 = add <2 x i64> %res, %res1		%res2 = add <2 x i64> %res, %res1
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pternlog.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i32, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pternlog.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i32, i8)

define <4 x i64>@test_int_x86_avx512_mask_pternlog_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x4) {		define <4 x i64>@test_int_x86_avx512_mask_pternlog_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_pternlog_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pternlog_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xd8]		; CHECK-NEXT: vmovdqa %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xd8]
; CHECK-NEXT: vpternlogq $33, %ymm2, %ymm1, %ymm3 {%k1} ## encoding: [0x62,0xf3,0xf5,0x29,0x25,0xda,0x21]		; CHECK-NEXT: vpternlogq $33, %ymm2, %ymm1, %ymm3 {%k1} ## encoding: [0x62,0xf3,0xf5,0x29,0x25,0xda,0x21]
; CHECK-NEXT: vpternlogq $33, %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0xf5,0x28,0x25,0xc2,0x21]		; CHECK-NEXT: vpternlogq $33, %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0xf5,0x28,0x25,0xc2,0x21]
; CHECK-NEXT: vpaddq %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pternlog.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i32 33, i8 %x4)		%res = call <4 x i64> @llvm.x86.avx512.mask.pternlog.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i32 33, i8 %x4)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pternlog.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i32 33, i8 -1)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pternlog.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i32 33, i8 -1)
%res2 = add <4 x i64> %res, %res1		%res2 = add <4 x i64> %res, %res1
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}

declare <4 x i64> @llvm.x86.avx512.maskz.pternlog.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i32, i8)		declare <4 x i64> @llvm.x86.avx512.maskz.pternlog.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i32, i8)

define <4 x i64>@test_int_x86_avx512_maskz_pternlog_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x4) {		define <4 x i64>@test_int_x86_avx512_maskz_pternlog_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_maskz_pternlog_q_256:		; CHECK-LABEL: test_int_x86_avx512_maskz_pternlog_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovdqa64 %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xd8]		; CHECK-NEXT: vmovdqa %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xd8]
; CHECK-NEXT: vpternlogq $33, %ymm2, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0xa9,0x25,0xda,0x21]		; CHECK-NEXT: vpternlogq $33, %ymm2, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0xa9,0x25,0xda,0x21]
; CHECK-NEXT: vpternlogq $33, %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0xf5,0x28,0x25,0xc2,0x21]		; CHECK-NEXT: vpternlogq $33, %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0xf5,0x28,0x25,0xc2,0x21]
; CHECK-NEXT: vpaddq %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0xe5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.maskz.pternlog.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i32 33, i8 %x4)		%res = call <4 x i64> @llvm.x86.avx512.maskz.pternlog.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i32 33, i8 %x4)
%res1 = call <4 x i64> @llvm.x86.avx512.maskz.pternlog.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i32 33, i8 -1)		%res1 = call <4 x i64> @llvm.x86.avx512.maskz.pternlog.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i32 33, i8 -1)
%res2 = add <4 x i64> %res, %res1		%res2 = add <4 x i64> %res, %res1
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}

define <4 x float> @test_x86_vcvtph2ps_128(<8 x i16> %a0) {		define <4 x float> @test_x86_vcvtph2ps_128(<8 x i16> %a0) {
; CHECK-LABEL: test_x86_vcvtph2ps_128:		; CHECK-LABEL: test_x86_vcvtph2ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vcvtph2ps %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x13,0xc0]		; CHECK-NEXT: vcvtph2ps %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x13,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vcvtph2ps.128(<8 x i16> %a0, <4 x float> zeroinitializer, i8 -1)		%res = call <4 x float> @llvm.x86.avx512.mask.vcvtph2ps.128(<8 x i16> %a0, <4 x float> zeroinitializer, i8 -1)
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_x86_vcvtph2ps_128_rrk(<8 x i16> %a0,<4 x float> %a1, i8 %mask) {		define <4 x float> @test_x86_vcvtph2ps_128_rrk(<8 x i16> %a0,<4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_x86_vcvtph2ps_128_rrk:		; CHECK-LABEL: test_x86_vcvtph2ps_128_rrk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtph2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x13,0xc8]		; CHECK-NEXT: vcvtph2ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x13,0xc8]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vcvtph2ps.128(<8 x i16> %a0, <4 x float> %a1, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.vcvtph2ps.128(<8 x i16> %a0, <4 x float> %a1, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}


define <4 x float> @test_x86_vcvtph2ps_128_rrkz(<8 x i16> %a0, i8 %mask) {		define <4 x float> @test_x86_vcvtph2ps_128_rrkz(<8 x i16> %a0, i8 %mask) {
; CHECK-LABEL: test_x86_vcvtph2ps_128_rrkz:		; CHECK-LABEL: test_x86_vcvtph2ps_128_rrkz:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtph2ps %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x13,0xc0]		; CHECK-NEXT: vcvtph2ps %xmm0, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x13,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.vcvtph2ps.128(<8 x i16> %a0, <4 x float> zeroinitializer, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.mask.vcvtph2ps.128(<8 x i16> %a0, <4 x float> zeroinitializer, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

declare <4 x float> @llvm.x86.avx512.mask.vcvtph2ps.128(<8 x i16>, <4 x float>, i8) nounwind readonly		declare <4 x float> @llvm.x86.avx512.mask.vcvtph2ps.128(<8 x i16>, <4 x float>, i8) nounwind readonly

define <8 x float> @test_x86_vcvtph2ps_256(<8 x i16> %a0) {		define <8 x float> @test_x86_vcvtph2ps_256(<8 x i16> %a0) {
; CHECK-LABEL: test_x86_vcvtph2ps_256:		; CHECK-LABEL: test_x86_vcvtph2ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vcvtph2ps %xmm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x13,0xc0]		; CHECK-NEXT: vcvtph2ps %xmm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x7d,0x13,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.vcvtph2ps.256(<8 x i16> %a0, <8 x float> zeroinitializer, i8 -1)		%res = call <8 x float> @llvm.x86.avx512.mask.vcvtph2ps.256(<8 x i16> %a0, <8 x float> zeroinitializer, i8 -1)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_x86_vcvtph2ps_256_rrk(<8 x i16> %a0,<8 x float> %a1, i8 %mask) {		define <8 x float> @test_x86_vcvtph2ps_256_rrk(<8 x i16> %a0,<8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_x86_vcvtph2ps_256_rrk:		; CHECK-LABEL: test_x86_vcvtph2ps_256_rrk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtph2ps %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x13,0xc8]		; CHECK-NEXT: vcvtph2ps %xmm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x13,0xc8]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.vcvtph2ps.256(<8 x i16> %a0, <8 x float> %a1, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.vcvtph2ps.256(<8 x i16> %a0, <8 x float> %a1, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_x86_vcvtph2ps_256_rrkz(<8 x i16> %a0, i8 %mask) {		define <8 x float> @test_x86_vcvtph2ps_256_rrkz(<8 x i16> %a0, i8 %mask) {
; CHECK-LABEL: test_x86_vcvtph2ps_256_rrkz:		; CHECK-LABEL: test_x86_vcvtph2ps_256_rrkz:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtph2ps %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x13,0xc0]		; CHECK-NEXT: vcvtph2ps %xmm0, %ymm0 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x13,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.vcvtph2ps.256(<8 x i16> %a0, <8 x float> zeroinitializer, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.mask.vcvtph2ps.256(<8 x i16> %a0, <8 x float> zeroinitializer, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

declare <8 x float> @llvm.x86.avx512.mask.vcvtph2ps.256(<8 x i16>, <8 x float>, i8) nounwind readonly		declare <8 x float> @llvm.x86.avx512.mask.vcvtph2ps.256(<8 x i16>, <8 x float>, i8) nounwind readonly

define <8 x i16> @test_x86_vcvtps2ph_128(<4 x float> %a0, i8 %mask, <8 x i16> %src) {		define <8 x i16> @test_x86_vcvtps2ph_128(<4 x float> %a0, i8 %mask, <8 x i16> %src) {
; CHECK-LABEL: test_x86_vcvtps2ph_128:		; CHECK-LABEL: test_x86_vcvtps2ph_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtps2ph $2, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x1d,0xc1,0x02]		; CHECK-NEXT: vcvtps2ph $2, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x09,0x1d,0xc1,0x02]
; CHECK-NEXT: vcvtps2ph $2, %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0x89,0x1d,0xc2,0x02]		; CHECK-NEXT: vcvtps2ph $2, %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0x89,0x1d,0xc2,0x02]
; CHECK-NEXT: vcvtps2ph $2, %xmm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x08,0x1d,0xc0,0x02]		; CHECK-NEXT: vcvtps2ph $2, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x79,0x1d,0xc0,0x02]
; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xfd,0xc2]		; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xfd,0xc2]
; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0xc5,0xf1,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0xc5,0xf1,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res1 = call <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128(<4 x float> %a0, i32 2, <8 x i16> zeroinitializer, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128(<4 x float> %a0, i32 2, <8 x i16> zeroinitializer, i8 -1)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128(<4 x float> %a0, i32 2, <8 x i16> zeroinitializer, i8 %mask)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128(<4 x float> %a0, i32 2, <8 x i16> zeroinitializer, i8 %mask)
%res3 = call <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128(<4 x float> %a0, i32 2, <8 x i16> %src, i8 %mask)		%res3 = call <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128(<4 x float> %a0, i32 2, <8 x i16> %src, i8 %mask)
%res0 = add <8 x i16> %res1, %res2		%res0 = add <8 x i16> %res1, %res2
%res = add <8 x i16> %res3, %res0		%res = add <8 x i16> %res3, %res0
ret <8 x i16> %res		ret <8 x i16> %res
}		}

declare <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128(<4 x float>, i32, <8 x i16>, i8) nounwind readonly		declare <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128(<4 x float>, i32, <8 x i16>, i8) nounwind readonly

define <8 x i16> @test_x86_vcvtps2ph_256(<8 x float> %a0, i8 %mask, <8 x i16> %src) {		define <8 x i16> @test_x86_vcvtps2ph_256(<8 x float> %a0, i8 %mask, <8 x i16> %src) {
; CHECK-LABEL: test_x86_vcvtps2ph_256:		; CHECK-LABEL: test_x86_vcvtps2ph_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vcvtps2ph $2, %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x1d,0xc1,0x02]		; CHECK-NEXT: vcvtps2ph $2, %ymm0, %xmm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x1d,0xc1,0x02]
; CHECK-NEXT: vcvtps2ph $2, %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x1d,0xc2,0x02]		; CHECK-NEXT: vcvtps2ph $2, %ymm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x1d,0xc2,0x02]
; CHECK-NEXT: vcvtps2ph $2, %ymm0, %xmm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x1d,0xc0,0x02]		; CHECK-NEXT: vcvtps2ph $2, %ymm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe3,0x7d,0x1d,0xc0,0x02]
; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xfd,0xc2]		; CHECK-NEXT: vpaddw %xmm2, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xfd,0xc2]
; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0xc5,0xf1,0xfd,0xc0]		; CHECK-NEXT: vpaddw %xmm0, %xmm1, %xmm0 ## encoding: [0xc5,0xf1,0xfd,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res1 = call <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.256(<8 x float> %a0, i32 2, <8 x i16> zeroinitializer, i8 -1)		%res1 = call <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.256(<8 x float> %a0, i32 2, <8 x i16> zeroinitializer, i8 -1)
%res2 = call <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.256(<8 x float> %a0, i32 2, <8 x i16> zeroinitializer, i8 %mask)		%res2 = call <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.256(<8 x float> %a0, i32 2, <8 x i16> zeroinitializer, i8 %mask)
%res3 = call <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.256(<8 x float> %a0, i32 2, <8 x i16> %src, i8 %mask)		%res3 = call <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.256(<8 x float> %a0, i32 2, <8 x i16> %src, i8 %mask)
%res0 = add <8 x i16> %res1, %res2		%res0 = add <8 x i16> %res1, %res2
%res = add <8 x i16> %res3, %res0		%res = add <8 x i16> %res3, %res0
[21 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_rsqrt_ps_256_rrk(<8 x float> %a0, <8 x float> %a1, i8 %mask) {		define <8 x float> @test_rsqrt_ps_256_rrk(<8 x float> %a0, <8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_rsqrt_ps_256_rrk:		; CHECK-LABEL: test_rsqrt_ps_256_rrk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vrsqrt14ps %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x4e,0xc8]		; CHECK-NEXT: vrsqrt14ps %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x4e,0xc8]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.rsqrt14.ps.256(<8 x float> %a0, <8 x float> %a1, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.rsqrt14.ps.256(<8 x float> %a0, <8 x float> %a1, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}
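The byte changes in the hunks above follow directly from the prefix layouts. As an illustration only (this is not the pass added by the patch, and `evex_to_vex` and `keep_w` are hypothetical names), here is a minimal Python sketch that rewrites the prefix of a register-register EVEX instruction into the 2-byte or 3-byte VEX form. It assumes the instruction has a VEX twin with the same opcode and operand bytes, which the real pass looks up in a table, and it does not handle memory operands, where EVEX scales the 8-bit displacement.

```python
def evex_to_vex(code: bytes, keep_w: bool = False):
    """Return the VEX bytes for a reg-reg EVEX instruction, or None if the
    prefix uses features that VEX cannot express.

    Assumption: the caller already knows a VEX form with the same opcode
    exists. keep_w=False models VEX forms that ignore the W bit.
    """
    if len(code) < 6 or code[0] != 0x62:
        raise ValueError("not an EVEX-encoded instruction")
    p0, p1, p2 = code[1], code[2], code[3]
    rest = code[4:]  # opcode, ModRM, optional imm8
    if rest[1] >> 6 != 0b11:
        raise ValueError("memory operands (scaled disp8) are not handled")

    # P0: R X B R' 0 0 m m   (register-extension bits are stored inverted)
    r, x, b, r_hi = (p0 >> 7) & 1, (p0 >> 6) & 1, (p0 >> 5) & 1, (p0 >> 4) & 1
    mm = p0 & 0b11
    # P1: W vvvv 1 pp
    w, vvvv, pp = (p1 >> 7) & 1, (p1 >> 3) & 0xF, p1 & 0b11
    # P2: z L'L b V' aaa
    z, ll, bcast = (p2 >> 7) & 1, (p2 >> 5) & 0b11, (p2 >> 4) & 1
    v_hi, aaa = (p2 >> 3) & 1, p2 & 0b111

    if z or bcast or aaa:
        return None  # zeroing, broadcast/rounding, or a mask register
    if ll > 1:
        return None  # 512-bit vector length
    if not (r_hi and v_hi and x):
        return None  # an inverted bit of 0 selects a register in 16..31

    w = w if keep_w else 0
    low = (vvvv << 3) | (ll << 2) | pp
    if mm == 1 and b == 1 and w == 0:
        # 2-byte VEX: implies the 0F opcode map, W=0, no X/B extension
        return bytes([0xC5, (r << 7) | low]) + rest
    # 3-byte VEX: R X B mmmmm, then W vvvv L pp
    return bytes([0xC4, (r << 7) | (x << 6) | (b << 5) | mm, (w << 7) | low]) + rest


# vmovaps %ymm1, %ymm0 from the test above
print(evex_to_vex(bytes([0x62, 0xF1, 0x7C, 0x28, 0x28, 0xC1])).hex())
```

Passing this bit-level check is necessary but not sufficient. An unmasked `vprorvd` also passes it, yet it stays EVEX-encoded in this diff because the instruction has no VEX form, which is why the pass relies on an opcode table.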

define <4 x float> @test_rsqrt_ps_128_rr(<4 x float> %a0) {		define <4 x float> @test_rsqrt_ps_128_rr(<4 x float> %a0) {
; CHECK-LABEL: test_rsqrt_ps_128_rr:		; CHECK-LABEL: test_rsqrt_ps_128_rr:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
[13 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_rsqrt_ps_128_rrk(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_rsqrt_ps_128_rrk(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_rsqrt_ps_128_rrk:		; CHECK-LABEL: test_rsqrt_ps_128_rrk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vrsqrt14ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x4e,0xc8]		; CHECK-NEXT: vrsqrt14ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x4e,0xc8]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.rsqrt14.ps.128(<4 x float> %a0, <4 x float> %a1, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.rsqrt14.ps.128(<4 x float> %a0, <4 x float> %a1, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

declare <8 x float> @llvm.x86.avx512.rsqrt14.ps.256(<8 x float>, <8 x float>, i8) nounwind readnone		declare <8 x float> @llvm.x86.avx512.rsqrt14.ps.256(<8 x float>, <8 x float>, i8) nounwind readnone
declare <4 x float> @llvm.x86.avx512.rsqrt14.ps.128(<4 x float>, <4 x float>, i8) nounwind readnone		declare <4 x float> @llvm.x86.avx512.rsqrt14.ps.128(<4 x float>, <4 x float>, i8) nounwind readnone

[16 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <8 x float> %res		ret <8 x float> %res
}		}

define <8 x float> @test_rcp_ps_256_rrk(<8 x float> %a0, <8 x float> %a1, i8 %mask) {		define <8 x float> @test_rcp_ps_256_rrk(<8 x float> %a0, <8 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_rcp_ps_256_rrk:		; CHECK-LABEL: test_rcp_ps_256_rrk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vrcp14ps %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x4c,0xc8]		; CHECK-NEXT: vrcp14ps %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x4c,0xc8]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.rcp14.ps.256(<8 x float> %a0, <8 x float> %a1, i8 %mask)		%res = call <8 x float> @llvm.x86.avx512.rcp14.ps.256(<8 x float> %a0, <8 x float> %a1, i8 %mask)
ret <8 x float> %res		ret <8 x float> %res
}		}

define <4 x float> @test_rcp_ps_128_rr(<4 x float> %a0) {		define <4 x float> @test_rcp_ps_128_rr(<4 x float> %a0) {
; CHECK-LABEL: test_rcp_ps_128_rr:		; CHECK-LABEL: test_rcp_ps_128_rr:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
[13 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x float> %res		ret <4 x float> %res
}		}

define <4 x float> @test_rcp_ps_128_rrk(<4 x float> %a0, <4 x float> %a1, i8 %mask) {		define <4 x float> @test_rcp_ps_128_rrk(<4 x float> %a0, <4 x float> %a1, i8 %mask) {
; CHECK-LABEL: test_rcp_ps_128_rrk:		; CHECK-LABEL: test_rcp_ps_128_rrk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vrcp14ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x4c,0xc8]		; CHECK-NEXT: vrcp14ps %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x4c,0xc8]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.rcp14.ps.128(<4 x float> %a0, <4 x float> %a1, i8 %mask)		%res = call <4 x float> @llvm.x86.avx512.rcp14.ps.128(<4 x float> %a0, <4 x float> %a1, i8 %mask)
ret <4 x float> %res		ret <4 x float> %res
}		}

declare <8 x float> @llvm.x86.avx512.rcp14.ps.256(<8 x float>, <8 x float>, i8) nounwind readnone		declare <8 x float> @llvm.x86.avx512.rcp14.ps.256(<8 x float>, <8 x float>, i8) nounwind readnone
declare <4 x float> @llvm.x86.avx512.rcp14.ps.128(<4 x float>, <4 x float>, i8) nounwind readnone		declare <4 x float> @llvm.x86.avx512.rcp14.ps.128(<4 x float>, <4 x float>, i8) nounwind readnone

[16 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x double> %res		ret <4 x double> %res
}		}

define <4 x double> @test_rsqrt_pd_256_rrk(<4 x double> %a0, <4 x double> %a1, i8 %mask) {		define <4 x double> @test_rsqrt_pd_256_rrk(<4 x double> %a0, <4 x double> %a1, i8 %mask) {
; CHECK-LABEL: test_rsqrt_pd_256_rrk:		; CHECK-LABEL: test_rsqrt_pd_256_rrk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vrsqrt14pd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x4e,0xc8]		; CHECK-NEXT: vrsqrt14pd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x4e,0xc8]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.rsqrt14.pd.256(<4 x double> %a0, <4 x double> %a1, i8 %mask)		%res = call <4 x double> @llvm.x86.avx512.rsqrt14.pd.256(<4 x double> %a0, <4 x double> %a1, i8 %mask)
ret <4 x double> %res		ret <4 x double> %res
}		}

define <2 x double> @test_rsqrt_pd_128_rr(<2 x double> %a0) {		define <2 x double> @test_rsqrt_pd_128_rr(<2 x double> %a0) {
; CHECK-LABEL: test_rsqrt_pd_128_rr:		; CHECK-LABEL: test_rsqrt_pd_128_rr:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
[13 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <2 x double> %res		ret <2 x double> %res
}		}

define <2 x double> @test_rsqrt_pd_128_rrk(<2 x double> %a0, <2 x double> %a1, i8 %mask) {		define <2 x double> @test_rsqrt_pd_128_rrk(<2 x double> %a0, <2 x double> %a1, i8 %mask) {
; CHECK-LABEL: test_rsqrt_pd_128_rrk:		; CHECK-LABEL: test_rsqrt_pd_128_rrk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vrsqrt14pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x4e,0xc8]		; CHECK-NEXT: vrsqrt14pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x4e,0xc8]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.rsqrt14.pd.128(<2 x double> %a0, <2 x double> %a1, i8 %mask)		%res = call <2 x double> @llvm.x86.avx512.rsqrt14.pd.128(<2 x double> %a0, <2 x double> %a1, i8 %mask)
ret <2 x double> %res		ret <2 x double> %res
}		}

declare <4 x double> @llvm.x86.avx512.rsqrt14.pd.256(<4 x double>, <4 x double>, i8) nounwind readnone		declare <4 x double> @llvm.x86.avx512.rsqrt14.pd.256(<4 x double>, <4 x double>, i8) nounwind readnone
declare <2 x double> @llvm.x86.avx512.rsqrt14.pd.128(<2 x double>, <2 x double>, i8) nounwind readnone		declare <2 x double> @llvm.x86.avx512.rsqrt14.pd.128(<2 x double>, <2 x double>, i8) nounwind readnone

[16 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <4 x double> %res		ret <4 x double> %res
}		}

define <4 x double> @test_rcp_pd_256_rrk(<4 x double> %a0, <4 x double> %a1, i8 %mask) {		define <4 x double> @test_rcp_pd_256_rrk(<4 x double> %a0, <4 x double> %a1, i8 %mask) {
; CHECK-LABEL: test_rcp_pd_256_rrk:		; CHECK-LABEL: test_rcp_pd_256_rrk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vrcp14pd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x4c,0xc8]		; CHECK-NEXT: vrcp14pd %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x4c,0xc8]
; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xc1]		; CHECK-NEXT: vmovaps %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.rcp14.pd.256(<4 x double> %a0, <4 x double> %a1, i8 %mask)		%res = call <4 x double> @llvm.x86.avx512.rcp14.pd.256(<4 x double> %a0, <4 x double> %a1, i8 %mask)
ret <4 x double> %res		ret <4 x double> %res
}		}

define <2 x double> @test_rcp_pd_128_rr(<2 x double> %a0) {		define <2 x double> @test_rcp_pd_128_rr(<2 x double> %a0) {
; CHECK-LABEL: test_rcp_pd_128_rr:		; CHECK-LABEL: test_rcp_pd_128_rr:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
[13 lines not shown]
; CHECK-NEXT: retq ## encoding: [0xc3]
ret <2 x double> %res		ret <2 x double> %res
}		}

define <2 x double> @test_rcp_pd_128_rrk(<2 x double> %a0, <2 x double> %a1, i8 %mask) {		define <2 x double> @test_rcp_pd_128_rrk(<2 x double> %a0, <2 x double> %a1, i8 %mask) {
; CHECK-LABEL: test_rcp_pd_128_rrk:		; CHECK-LABEL: test_rcp_pd_128_rrk:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vrcp14pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x4c,0xc8]		; CHECK-NEXT: vrcp14pd %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x4c,0xc8]
; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xc1]		; CHECK-NEXT: vmovaps %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.rcp14.pd.128(<2 x double> %a0, <2 x double> %a1, i8 %mask)		%res = call <2 x double> @llvm.x86.avx512.rcp14.pd.128(<2 x double> %a0, <2 x double> %a1, i8 %mask)
ret <2 x double> %res		ret <2 x double> %res
}		}

declare <4 x double> @llvm.x86.avx512.rcp14.pd.256(<4 x double>, <4 x double>, i8) nounwind readnone		declare <4 x double> @llvm.x86.avx512.rcp14.pd.256(<4 x double>, <4 x double>, i8) nounwind readnone
declare <2 x double> @llvm.x86.avx512.rcp14.pd.128(<2 x double>, <2 x double>, i8) nounwind readnone		declare <2 x double> @llvm.x86.avx512.rcp14.pd.128(<2 x double>, <2 x double>, i8) nounwind readnone

declare <8 x float> @llvm.x86.avx512.mask.broadcastf32x4.256(<4 x float>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.broadcastf32x4.256(<4 x float>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_broadcastf32x4_256(<4 x float> %x0, <8 x float> %x2, i8 %mask) {		define <8 x float>@test_int_x86_avx512_mask_broadcastf32x4_256(<4 x float> %x0, <8 x float> %x2, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_broadcastf32x4_256:		; CHECK-LABEL: test_int_x86_avx512_mask_broadcastf32x4_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: ## kill: %XMM0<def> %XMM0<kill> %YMM0<def>		; CHECK-NEXT: ## kill: %XMM0<def> %XMM0<kill> %YMM0<def>
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vshuff32x4 $0, %ymm0, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x23,0xd0,0x00]		; CHECK-NEXT: vshuff32x4 $0, %ymm0, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x23,0xd0,0x00]
; CHECK-NEXT: ## ymm2 {%k1} {z} = ymm0[0,1,2,3,0,1,2,3]		; CHECK-NEXT: ## ymm2 {%k1} {z} = ymm0[0,1,2,3,0,1,2,3]
; CHECK-NEXT: vshuff32x4 $0, %ymm0, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x23,0xc8,0x00]		; CHECK-NEXT: vshuff32x4 $0, %ymm0, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x23,0xc8,0x00]
; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,1,2,3,0,1,2,3]		; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,1,2,3,0,1,2,3]
; CHECK-NEXT: vshuff32x4 $0, %ymm0, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x23,0xc0,0x00]		; CHECK-NEXT: vshuff32x4 $0, %ymm0, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x23,0xc0,0x00]
; CHECK-NEXT: ## ymm0 = ymm0[0,1,2,3,0,1,2,3]		; CHECK-NEXT: ## ymm0 = ymm0[0,1,2,3,0,1,2,3]
; CHECK-NEXT: vaddps %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x58,0xc1]		; CHECK-NEXT: vaddps %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x58,0xc1]
; CHECK-NEXT: vaddps %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6c,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xec,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res1 = call <8 x float> @llvm.x86.avx512.mask.broadcastf32x4.256(<4 x float> %x0, <8 x float> %x2, i8 -1)		%res1 = call <8 x float> @llvm.x86.avx512.mask.broadcastf32x4.256(<4 x float> %x0, <8 x float> %x2, i8 -1)
%res2 = call <8 x float> @llvm.x86.avx512.mask.broadcastf32x4.256(<4 x float> %x0, <8 x float> %x2, i8 %mask)		%res2 = call <8 x float> @llvm.x86.avx512.mask.broadcastf32x4.256(<4 x float> %x0, <8 x float> %x2, i8 %mask)
%res3 = call <8 x float> @llvm.x86.avx512.mask.broadcastf32x4.256(<4 x float> %x0, <8 x float> zeroinitializer, i8 %mask)		%res3 = call <8 x float> @llvm.x86.avx512.mask.broadcastf32x4.256(<4 x float> %x0, <8 x float> zeroinitializer, i8 %mask)
%res4 = fadd <8 x float> %res1, %res2		%res4 = fadd <8 x float> %res1, %res2
%res5 = fadd <8 x float> %res3, %res4		%res5 = fadd <8 x float> %res3, %res4
ret <8 x float> %res5		ret <8 x float> %res5
}		}

declare <8 x i32> @llvm.x86.avx512.mask.broadcasti32x4.256(<4 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.broadcasti32x4.256(<4 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_broadcasti32x4_256(<4 x i32> %x0, <8 x i32> %x2, i8 %mask) {		define <8 x i32>@test_int_x86_avx512_mask_broadcasti32x4_256(<4 x i32> %x0, <8 x i32> %x2, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_broadcasti32x4_256:		; CHECK-LABEL: test_int_x86_avx512_mask_broadcasti32x4_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: ## kill: %XMM0<def> %XMM0<kill> %YMM0<def>		; CHECK-NEXT: ## kill: %XMM0<def> %XMM0<kill> %YMM0<def>
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vshufi32x4 $0, %ymm0, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x43,0xd0,0x00]		; CHECK-NEXT: vshufi32x4 $0, %ymm0, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf3,0x7d,0xa9,0x43,0xd0,0x00]
; CHECK-NEXT: ## ymm2 {%k1} {z} = ymm0[0,1,2,3,0,1,2,3]		; CHECK-NEXT: ## ymm2 {%k1} {z} = ymm0[0,1,2,3,0,1,2,3]
; CHECK-NEXT: vshufi32x4 $0, %ymm0, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x43,0xc8,0x00]		; CHECK-NEXT: vshufi32x4 $0, %ymm0, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf3,0x7d,0x29,0x43,0xc8,0x00]
; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,1,2,3,0,1,2,3]		; CHECK-NEXT: ## ymm1 {%k1} = ymm0[0,1,2,3,0,1,2,3]
; CHECK-NEXT: vshufi32x4 $0, %ymm0, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x43,0xc0,0x00]		; CHECK-NEXT: vshufi32x4 $0, %ymm0, %ymm0, %ymm0 ## encoding: [0x62,0xf3,0x7d,0x28,0x43,0xc0,0x00]
; CHECK-NEXT: ## ymm0 = ymm0[0,1,2,3,0,1,2,3]		; CHECK-NEXT: ## ymm0 = ymm0[0,1,2,3,0,1,2,3]
; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0xfe,0xc1]		; CHECK-NEXT: vpaddd %ymm1, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0xfe,0xc1]
; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res1 = call <8 x i32> @llvm.x86.avx512.mask.broadcasti32x4.256(<4 x i32> %x0, <8 x i32> %x2, i8 -1)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.broadcasti32x4.256(<4 x i32> %x0, <8 x i32> %x2, i8 -1)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.broadcasti32x4.256(<4 x i32> %x0, <8 x i32> %x2, i8 %mask)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.broadcasti32x4.256(<4 x i32> %x0, <8 x i32> %x2, i8 %mask)
%res3 = call <8 x i32> @llvm.x86.avx512.mask.broadcasti32x4.256(<4 x i32> %x0, <8 x i32> zeroinitializer, i8 %mask)		%res3 = call <8 x i32> @llvm.x86.avx512.mask.broadcasti32x4.256(<4 x i32> %x0, <8 x i32> zeroinitializer, i8 %mask)
%res4 = add <8 x i32> %res1, %res2		%res4 = add <8 x i32> %res1, %res2
%res5 = add <8 x i32> %res3, %res4		%res5 = add <8 x i32> %res3, %res4
ret <8 x i32> %res5		ret <8 x i32> %res5
}		}

declare <4 x i32> @llvm.x86.avx512.mask.prorv.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.prorv.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_prorv_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_prorv_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_prorv_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_prorv_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vprorvd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x14,0xd1]		; CHECK-NEXT: vprorvd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x14,0xd1]
; CHECK-NEXT: vprorvd %xmm1, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x14,0xd9]		; CHECK-NEXT: vprorvd %xmm1, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x14,0xd9]
; CHECK-NEXT: vprorvd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x14,0xc1]		; CHECK-NEXT: vprorvd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x14,0xc1]
; CHECK-NEXT: vpaddd %xmm3, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xcb]		; CHECK-NEXT: vpaddd %xmm3, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xcb]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.prorv.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.prorv.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.prorv.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.prorv.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.prorv.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.prorv.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}
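In the test above, the two `vpaddd` lines gain the "EVEX TO VEX Compression" marker while all three `vprorvd` lines keep their EVEX bytes, including the unmasked one. A rough way to reason about which lines of this diff should change is sketched below in Python. It is an assumption-laden illustration, not the patch's implementation: `HAS_VEX_FORM` is a tiny hypothetical stand-in for the patch's opcode table, keyed by mnemonic for readability, and `can_compress` is an invented helper name.

```python
import re

# Hypothetical, deliberately tiny stand-in for the EVEX-to-VEX opcode table.
# Only mnemonics seen with the compression marker in this diff are listed.
HAS_VEX_FORM = {"vmovaps", "vaddps", "vpaddd", "vpaddq", "vcvtps2ph"}


def can_compress(asm: str) -> bool:
    """Decide from the printed AT&T-syntax line whether VEX could encode it."""
    mnemonic = asm.split()[0]
    if mnemonic not in HAS_VEX_FORM:
        return False  # AVX-512-only instruction such as vprorvd
    if "{" in asm:
        return False  # {%k1}, {z}, broadcast or rounding decorations
    for kind, index in re.findall(r"%([xyz])mm(\d+)", asm):
        if kind == "z" or int(index) > 15:
            return False  # zmm registers and xmm16..31 / ymm16..31 need EVEX
    return True


for line in ("vprorvd %xmm1, %xmm0, %xmm0",
             "vprorvd %xmm1, %xmm0, %xmm2 {%k1}",
             "vpaddd %xmm3, %xmm2, %xmm1"):
    print(line, "->", can_compress(line))
```

Checking the printed assembly keeps the sketch short. The pass itself runs on machine instructions at the pre-emit stage, as the summary describes.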

declare <8 x i32> @llvm.x86.avx512.mask.prorv.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.prorv.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_prorv_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_prorv_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_prorv_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_prorv_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vprorvd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x14,0xd1]		; CHECK-NEXT: vprorvd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x14,0xd1]
; CHECK-NEXT: vprorvd %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x14,0xd9]		; CHECK-NEXT: vprorvd %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x14,0xd9]
; CHECK-NEXT: vprorvd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x14,0xc1]		; CHECK-NEXT: vprorvd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x14,0xc1]
; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xcb]		; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xcb]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.prorv.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.prorv.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.prorv.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.prorv.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.prorv.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.prorv.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.prorv.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.prorv.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_prorv_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_prorv_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_prorv_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_prorv_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vprorvq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x14,0xd1]		; CHECK-NEXT: vprorvq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x14,0xd1]
; CHECK-NEXT: vprorvq %xmm1, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x14,0xd9]		; CHECK-NEXT: vprorvq %xmm1, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x14,0xd9]
; CHECK-NEXT: vprorvq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x14,0xc1]		; CHECK-NEXT: vprorvq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x14,0xc1]
; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xcb]		; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xcb]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.prorv.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.prorv.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.prorv.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.prorv.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.prorv.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.prorv.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.prorv.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.prorv.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_prorv_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_prorv_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_prorv_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_prorv_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vprorvq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x14,0xd1]		; CHECK-NEXT: vprorvq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x14,0xd1]
; CHECK-NEXT: vprorvq %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x14,0xd9]		; CHECK-NEXT: vprorvq %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x14,0xd9]
; CHECK-NEXT: vprorvq %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x14,0xc1]		; CHECK-NEXT: vprorvq %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x14,0xc1]
; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xcb]		; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xcb]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.prorv.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.prorv.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.prorv.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.prorv.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.prorv.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.prorv.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.prol.d.128(<4 x i32>, i32, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.prol.d.128(<4 x i32>, i32, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_prol_d_128(<4 x i32> %x0, i32 %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_prol_d_128(<4 x i32> %x0, i32 %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_prol_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_prol_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vprold $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x72,0xc8,0x03]		; CHECK-NEXT: vprold $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x72,0xc8,0x03]
; CHECK-NEXT: vprold $3, %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf1,0x6d,0x89,0x72,0xc8,0x03]		; CHECK-NEXT: vprold $3, %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf1,0x6d,0x89,0x72,0xc8,0x03]
; CHECK-NEXT: vprold $3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x72,0xc8,0x03]		; CHECK-NEXT: vprold $3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x72,0xc8,0x03]
; CHECK-NEXT: vpaddd %xmm2, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xca]		; CHECK-NEXT: vpaddd %xmm2, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xca]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.prol.d.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.prol.d.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.prol.d.128(<4 x i32> %x0, i32 3, <4 x i32> zeroinitializer, i8 %x3)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.prol.d.128(<4 x i32> %x0, i32 3, <4 x i32> zeroinitializer, i8 %x3)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.prol.d.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.prol.d.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}
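
The unmasked `vpaddd` lines in the test above shrink from six bytes to four. As a reading aid, here is a minimal Python sketch of how the 2-byte VEX prefix can be derived from the EVEX prefix fields for register-register forms. It is not the pass from this patch: the function name `evex_to_vex2` is hypothetical, and the sketch checks only the prefix bits. Whether a VEX form exists for the opcode at all is assumed to be known by the caller (the patch uses an opcode table for that), which is consistent with the unmasked `vprold` line keeping its EVEX encoding.

```python
def evex_to_vex2(enc):
    """Return the 2-byte-VEX bytes for a register-register EVEX encoding,
    or None if the prefix fields rule compression out.

    Assumption: the caller already knows the opcode has a VEX twin and that
    dropping EVEX.W is harmless for it (hypothetical helper, not LLVM code).
    """
    if len(enc) < 6 or enc[0] != 0x62:
        return None
    p0, p1, p2 = enc[1], enc[2], enc[3]
    rest = list(enc[4:])            # opcode, ModRM, optional immediate

    # P0: R X B R' 0 0 m m  (R, X, B, R' are stored inverted)
    r = (p0 >> 7) & 1
    x = (p0 >> 6) & 1
    b = (p0 >> 5) & 1
    r_hi = (p0 >> 4) & 1
    opcode_map = p0 & 0x03

    # P1: W vvvv 1 pp
    vvvv = (p1 >> 3) & 0x0F
    pp = p1 & 0x03

    # P2: z L'L b V' aaa
    zeroing = (p2 >> 7) & 1
    vec_len = (p2 >> 5) & 0x03
    broadcast = (p2 >> 4) & 1
    v_hi = (p2 >> 3) & 1
    mask = p2 & 0x07

    modrm = rest[1]

    if opcode_map != 1 or (p0 & 0x0C):
        return None                 # 2-byte VEX implies the 0F opcode map
    if x != 1 or b != 1:
        return None                 # 2-byte VEX cannot encode X or B
    if r_hi != 1 or v_hi != 1:
        return None                 # registers 16-31 need EVEX
    if zeroing or broadcast or mask:
        return None                 # masking and broadcast need EVEX
    if vec_len > 1:
        return None                 # zmm (512-bit) needs EVEX
    if (modrm >> 6) != 3:
        return None                 # memory forms (disp8 scaling) not handled here

    # VEX byte 1: R vvvv L pp
    vex1 = (r << 7) | (vvvv << 3) | (vec_len << 2) | pp
    return [0xC5, vex1] + rest


# Encodings copied from the CHECK lines of the surrounding tests.
print(evex_to_vex2([0x62, 0xF1, 0x75, 0x08, 0xFE, 0xCA]))  # vpaddd xmm
print(evex_to_vex2([0x62, 0xF1, 0x75, 0x28, 0xFE, 0xCA]))  # vpaddd ymm
```

Masked forms such as `vprold $3, %xmm0, %xmm1 {%k1}` have a nonzero `aaa` field, so the sketch returns `None` for them, matching the lines in the diff that are unchanged.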

declare <8 x i32> @llvm.x86.avx512.mask.prol.d.256(<8 x i32>, i32, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.prol.d.256(<8 x i32>, i32, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_prol_d_256(<8 x i32> %x0, i32 %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_prol_d_256(<8 x i32> %x0, i32 %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_prol_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_prol_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vprold $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x72,0xc8,0x03]		; CHECK-NEXT: vprold $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x72,0xc8,0x03]
; CHECK-NEXT: vprold $3, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf1,0x6d,0xa9,0x72,0xc8,0x03]		; CHECK-NEXT: vprold $3, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf1,0x6d,0xa9,0x72,0xc8,0x03]
; CHECK-NEXT: vprold $3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x72,0xc8,0x03]		; CHECK-NEXT: vprold $3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x72,0xc8,0x03]
; CHECK-NEXT: vpaddd %ymm2, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xca]		; CHECK-NEXT: vpaddd %ymm2, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xca]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.prol.d.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.prol.d.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.prol.d.256(<8 x i32> %x0, i32 3, <8 x i32> zeroinitializer, i8 %x3)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.prol.d.256(<8 x i32> %x0, i32 3, <8 x i32> zeroinitializer, i8 %x3)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.prol.d.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.prol.d.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.prol.q.128(<2 x i64>, i32, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.prol.q.128(<2 x i64>, i32, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_prol_q_128(<2 x i64> %x0, i32 %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_prol_q_128(<2 x i64> %x0, i32 %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_prol_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_prol_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vprolq $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x09,0x72,0xc8,0x03]		; CHECK-NEXT: vprolq $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x09,0x72,0xc8,0x03]
; CHECK-NEXT: vprolq $3, %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf1,0xed,0x89,0x72,0xc8,0x03]		; CHECK-NEXT: vprolq $3, %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf1,0xed,0x89,0x72,0xc8,0x03]
; CHECK-NEXT: vprolq $3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x72,0xc8,0x03]		; CHECK-NEXT: vprolq $3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x72,0xc8,0x03]
; CHECK-NEXT: vpaddq %xmm2, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xca]		; CHECK-NEXT: vpaddq %xmm2, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xca]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.prol.q.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.prol.q.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.prol.q.128(<2 x i64> %x0, i32 3, <2 x i64> zeroinitializer, i8 %x3)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.prol.q.128(<2 x i64> %x0, i32 3, <2 x i64> zeroinitializer, i8 %x3)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.prol.q.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.prol.q.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.prol.q.256(<4 x i64>, i32, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.prol.q.256(<4 x i64>, i32, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_prol_q_256(<4 x i64> %x0, i32 %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_prol_q_256(<4 x i64> %x0, i32 %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_prol_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_prol_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vprolq $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x29,0x72,0xc8,0x03]		; CHECK-NEXT: vprolq $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x29,0x72,0xc8,0x03]
; CHECK-NEXT: vprolq $3, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf1,0xed,0xa9,0x72,0xc8,0x03]		; CHECK-NEXT: vprolq $3, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf1,0xed,0xa9,0x72,0xc8,0x03]
; CHECK-NEXT: vprolq $3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x72,0xc8,0x03]		; CHECK-NEXT: vprolq $3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x72,0xc8,0x03]
; CHECK-NEXT: vpaddq %ymm2, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xca]		; CHECK-NEXT: vpaddq %ymm2, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xca]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.prol.q.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.prol.q.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.prol.q.256(<4 x i64> %x0, i32 3, <4 x i64> zeroinitializer, i8 %x3)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.prol.q.256(<4 x i64> %x0, i32 3, <4 x i64> zeroinitializer, i8 %x3)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.prol.q.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.prol.q.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.prolv.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.prolv.d.128(<4 x i32>, <4 x i32>, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_prolv_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_prolv_d_128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_prolv_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_prolv_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vprolvd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x15,0xd1]		; CHECK-NEXT: vprolvd %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x15,0xd1]
; CHECK-NEXT: vprolvd %xmm1, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x15,0xd9]		; CHECK-NEXT: vprolvd %xmm1, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x15,0xd9]
; CHECK-NEXT: vprolvd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x15,0xc1]		; CHECK-NEXT: vprolvd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x15,0xc1]
; CHECK-NEXT: vpaddd %xmm3, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xcb]		; CHECK-NEXT: vpaddd %xmm3, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xcb]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.prolv.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.prolv.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.prolv.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.prolv.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> zeroinitializer, i8 %x3)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.prolv.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.prolv.d.128(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.prolv.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.prolv.d.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_prolv_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_prolv_d_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_prolv_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_prolv_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vprolvd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x15,0xd1]		; CHECK-NEXT: vprolvd %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x15,0xd1]
; CHECK-NEXT: vprolvd %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x15,0xd9]		; CHECK-NEXT: vprolvd %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x15,0xd9]
; CHECK-NEXT: vprolvd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x15,0xc1]		; CHECK-NEXT: vprolvd %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0x7d,0x28,0x15,0xc1]
; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xcb]		; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xcb]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.prolv.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.prolv.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.prolv.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.prolv.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.prolv.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.prolv.d.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.prolv.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.prolv.q.128(<2 x i64>, <2 x i64>, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_prolv_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_prolv_q_128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_prolv_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_prolv_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vprolvq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x15,0xd1]		; CHECK-NEXT: vprolvq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x15,0xd1]
; CHECK-NEXT: vprolvq %xmm1, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x15,0xd9]		; CHECK-NEXT: vprolvq %xmm1, %xmm0, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x15,0xd9]
; CHECK-NEXT: vprolvq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x15,0xc1]		; CHECK-NEXT: vprolvq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x15,0xc1]
; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm1 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xcb]		; CHECK-NEXT: vpaddq %xmm3, %xmm2, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xcb]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.prolv.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.prolv.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.prolv.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.prolv.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> zeroinitializer, i8 %x3)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.prolv.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.prolv.q.128(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.prolv.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.prolv.q.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_prolv_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_prolv_q_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_prolv_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_prolv_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vprolvq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x15,0xd1]		; CHECK-NEXT: vprolvq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x15,0xd1]
; CHECK-NEXT: vprolvq %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x15,0xd9]		; CHECK-NEXT: vprolvq %ymm1, %ymm0, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x15,0xd9]
; CHECK-NEXT: vprolvq %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x15,0xc1]		; CHECK-NEXT: vprolvq %ymm1, %ymm0, %ymm0 ## encoding: [0x62,0xf2,0xfd,0x28,0x15,0xc1]
; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xcb]		; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xcb]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.prolv.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.prolv.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.prolv.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.prolv.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.prolv.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.prolv.q.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pror.d.128(<4 x i32>, i32, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pror.d.128(<4 x i32>, i32, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pror_d_128(<4 x i32> %x0, i32 %x1, <4 x i32> %x2, i8 %x3) {		define <4 x i32>@test_int_x86_avx512_mask_pror_d_128(<4 x i32> %x0, i32 %x1, <4 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pror_d_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pror_d_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vprord $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x72,0xc0,0x03]		; CHECK-NEXT: vprord $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x09,0x72,0xc0,0x03]
; CHECK-NEXT: vprord $3, %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf1,0x6d,0x89,0x72,0xc0,0x03]		; CHECK-NEXT: vprord $3, %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf1,0x6d,0x89,0x72,0xc0,0x03]
; CHECK-NEXT: vprord $3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x72,0xc0,0x03]		; CHECK-NEXT: vprord $3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x72,0xc0,0x03]
; CHECK-NEXT: vpaddd %xmm2, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xca]		; CHECK-NEXT: vpaddd %xmm2, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xca]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pror.d.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 %x3)		%res = call <4 x i32> @llvm.x86.avx512.mask.pror.d.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 %x3)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pror.d.128(<4 x i32> %x0, i32 3, <4 x i32> zeroinitializer, i8 %x3)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pror.d.128(<4 x i32> %x0, i32 3, <4 x i32> zeroinitializer, i8 %x3)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pror.d.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 -1)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pror.d.128(<4 x i32> %x0, i32 3, <4 x i32> %x2, i8 -1)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res3, %res2		%res4 = add <4 x i32> %res3, %res2
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.pror.d.256(<8 x i32>, i32, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.pror.d.256(<8 x i32>, i32, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_pror_d_256(<8 x i32> %x0, i32 %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_pror_d_256(<8 x i32> %x0, i32 %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pror_d_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pror_d_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vprord $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x72,0xc0,0x03]		; CHECK-NEXT: vprord $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0x75,0x29,0x72,0xc0,0x03]
; CHECK-NEXT: vprord $3, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf1,0x6d,0xa9,0x72,0xc0,0x03]		; CHECK-NEXT: vprord $3, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf1,0x6d,0xa9,0x72,0xc0,0x03]
; CHECK-NEXT: vprord $3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x72,0xc0,0x03]		; CHECK-NEXT: vprord $3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7d,0x28,0x72,0xc0,0x03]
; CHECK-NEXT: vpaddd %ymm2, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xca]		; CHECK-NEXT: vpaddd %ymm2, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xca]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pror.d.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.pror.d.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pror.d.256(<8 x i32> %x0, i32 3, <8 x i32> zeroinitializer, i8 %x3)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pror.d.256(<8 x i32> %x0, i32 3, <8 x i32> zeroinitializer, i8 %x3)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.pror.d.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.pror.d.256(<8 x i32> %x0, i32 3, <8 x i32> %x2, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pror.q.128(<2 x i64>, i32, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pror.q.128(<2 x i64>, i32, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pror_q_128(<2 x i64> %x0, i32 %x1, <2 x i64> %x2, i8 %x3) {		define <2 x i64>@test_int_x86_avx512_mask_pror_q_128(<2 x i64> %x0, i32 %x1, <2 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pror_q_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pror_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vprorq $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x09,0x72,0xc0,0x03]		; CHECK-NEXT: vprorq $3, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x09,0x72,0xc0,0x03]
; CHECK-NEXT: vprorq $3, %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf1,0xed,0x89,0x72,0xc0,0x03]		; CHECK-NEXT: vprorq $3, %xmm0, %xmm2 {%k1} {z} ## encoding: [0x62,0xf1,0xed,0x89,0x72,0xc0,0x03]
; CHECK-NEXT: vprorq $3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x72,0xc0,0x03]		; CHECK-NEXT: vprorq $3, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x72,0xc0,0x03]
; CHECK-NEXT: vpaddq %xmm2, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xca]		; CHECK-NEXT: vpaddq %xmm2, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xca]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pror.q.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 %x3)		%res = call <2 x i64> @llvm.x86.avx512.mask.pror.q.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 %x3)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pror.q.128(<2 x i64> %x0, i32 3, <2 x i64> zeroinitializer, i8 %x3)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pror.q.128(<2 x i64> %x0, i32 3, <2 x i64> zeroinitializer, i8 %x3)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.pror.q.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 -1)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.pror.q.128(<2 x i64> %x0, i32 3, <2 x i64> %x2, i8 -1)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res3, %res2		%res4 = add <2 x i64> %res3, %res2
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pror.q.256(<4 x i64>, i32, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pror.q.256(<4 x i64>, i32, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pror_q_256(<4 x i64> %x0, i32 %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_pror_q_256(<4 x i64> %x0, i32 %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_pror_q_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pror_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vprorq $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x29,0x72,0xc0,0x03]		; CHECK-NEXT: vprorq $3, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x29,0x72,0xc0,0x03]
; CHECK-NEXT: vprorq $3, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf1,0xed,0xa9,0x72,0xc0,0x03]		; CHECK-NEXT: vprorq $3, %ymm0, %ymm2 {%k1} {z} ## encoding: [0x62,0xf1,0xed,0xa9,0x72,0xc0,0x03]
; CHECK-NEXT: vprorq $3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x72,0xc0,0x03]		; CHECK-NEXT: vprorq $3, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x72,0xc0,0x03]
; CHECK-NEXT: vpaddq %ymm2, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xca]		; CHECK-NEXT: vpaddq %ymm2, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xca]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pror.q.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.pror.q.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pror.q.256(<4 x i64> %x0, i32 3, <4 x i64> zeroinitializer, i8 %x3)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pror.q.256(<4 x i64> %x0, i32 3, <4 x i64> zeroinitializer, i8 %x3)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.pror.q.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.pror.q.256(<4 x i64> %x0, i32 3, <4 x i64> %x2, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <4 x double> @llvm.x86.avx512.mask.permvar.df.256(<4 x double>, <4 x i64>, <4 x double>, i8)		declare <4 x double> @llvm.x86.avx512.mask.permvar.df.256(<4 x double>, <4 x i64>, <4 x double>, i8)

define <4 x double>@test_int_x86_avx512_mask_permvar_df_256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 %x3) {		define <4 x double>@test_int_x86_avx512_mask_permvar_df_256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_permvar_df_256:		; CHECK-LABEL: test_int_x86_avx512_mask_permvar_df_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermpd %ymm0, %ymm1, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xf5,0x29,0x16,0xd0]		; CHECK-NEXT: vpermpd %ymm0, %ymm1, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xf5,0x29,0x16,0xd0]
; CHECK-NEXT: vpermpd %ymm0, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xf5,0xa9,0x16,0xd8]		; CHECK-NEXT: vpermpd %ymm0, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xf5,0xa9,0x16,0xd8]
; CHECK-NEXT: vpermpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0xf5,0x28,0x16,0xc0]		; CHECK-NEXT: vpermpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0xf5,0x28,0x16,0xc0]
; CHECK-NEXT: vaddpd %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0xed,0x28,0x58,0xcb]		; CHECK-NEXT: vaddpd %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0x58,0xcb]
; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.permvar.df.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 %x3)		%res = call <4 x double> @llvm.x86.avx512.mask.permvar.df.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 %x3)
%res1 = call <4 x double> @llvm.x86.avx512.mask.permvar.df.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> zeroinitializer, i8 %x3)		%res1 = call <4 x double> @llvm.x86.avx512.mask.permvar.df.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> zeroinitializer, i8 %x3)
%res2 = call <4 x double> @llvm.x86.avx512.mask.permvar.df.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 -1)		%res2 = call <4 x double> @llvm.x86.avx512.mask.permvar.df.256(<4 x double> %x0, <4 x i64> %x1, <4 x double> %x2, i8 -1)
%res3 = fadd <4 x double> %res, %res1		%res3 = fadd <4 x double> %res, %res1
%res4 = fadd <4 x double> %res3, %res2		%res4 = fadd <4 x double> %res3, %res2
ret <4 x double> %res4		ret <4 x double> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.permvar.di.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.permvar.di.256(<4 x i64>, <4 x i64>, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_permvar_di_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {		define <4 x i64>@test_int_x86_avx512_mask_permvar_di_256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_permvar_di_256:		; CHECK-LABEL: test_int_x86_avx512_mask_permvar_di_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermq %ymm0, %ymm1, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xf5,0x29,0x36,0xd0]		; CHECK-NEXT: vpermq %ymm0, %ymm1, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xf5,0x29,0x36,0xd0]
; CHECK-NEXT: vpermq %ymm0, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xf5,0xa9,0x36,0xd8]		; CHECK-NEXT: vpermq %ymm0, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0xf5,0xa9,0x36,0xd8]
; CHECK-NEXT: vpermq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0xf5,0x28,0x36,0xc0]		; CHECK-NEXT: vpermq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0xf5,0x28,0x36,0xc0]
; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xcb]		; CHECK-NEXT: vpaddq %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xcb]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.permvar.di.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)		%res = call <4 x i64> @llvm.x86.avx512.mask.permvar.di.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 %x3)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.permvar.di.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.permvar.di.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> zeroinitializer, i8 %x3)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.permvar.di.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.permvar.di.256(<4 x i64> %x0, <4 x i64> %x1, <4 x i64> %x2, i8 -1)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res3, %res2		%res4 = add <4 x i64> %res3, %res2
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <8 x float> @llvm.x86.avx512.mask.permvar.sf.256(<8 x float>, <8 x i32>, <8 x float>, i8)		declare <8 x float> @llvm.x86.avx512.mask.permvar.sf.256(<8 x float>, <8 x i32>, <8 x float>, i8)

define <8 x float>@test_int_x86_avx512_mask_permvar_sf_256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 %x3) {		define <8 x float>@test_int_x86_avx512_mask_permvar_sf_256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_permvar_sf_256:		; CHECK-LABEL: test_int_x86_avx512_mask_permvar_sf_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermps %ymm0, %ymm1, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x75,0x29,0x16,0xd0]		; CHECK-NEXT: vpermps %ymm0, %ymm1, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x75,0x29,0x16,0xd0]
; CHECK-NEXT: vpermps %ymm0, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x75,0xa9,0x16,0xd8]		; CHECK-NEXT: vpermps %ymm0, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x75,0xa9,0x16,0xd8]
; CHECK-NEXT: vpermps %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0x75,0x28,0x16,0xc0]		; CHECK-NEXT: vpermps %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x75,0x16,0xc0]
; CHECK-NEXT: vaddps %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0x6c,0x28,0x58,0xcb]		; CHECK-NEXT: vaddps %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xec,0x58,0xcb]
; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x74,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf4,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.permvar.sf.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 %x3)		%res = call <8 x float> @llvm.x86.avx512.mask.permvar.sf.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 %x3)
%res1 = call <8 x float> @llvm.x86.avx512.mask.permvar.sf.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> zeroinitializer, i8 %x3)		%res1 = call <8 x float> @llvm.x86.avx512.mask.permvar.sf.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> zeroinitializer, i8 %x3)
%res2 = call <8 x float> @llvm.x86.avx512.mask.permvar.sf.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 -1)		%res2 = call <8 x float> @llvm.x86.avx512.mask.permvar.sf.256(<8 x float> %x0, <8 x i32> %x1, <8 x float> %x2, i8 -1)
%res3 = fadd <8 x float> %res, %res1		%res3 = fadd <8 x float> %res, %res1
%res4 = fadd <8 x float> %res3, %res2		%res4 = fadd <8 x float> %res3, %res2
ret <8 x float> %res4		ret <8 x float> %res4
}		}

declare <8 x i32> @llvm.x86.avx512.mask.permvar.si.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)		declare <8 x i32> @llvm.x86.avx512.mask.permvar.si.256(<8 x i32>, <8 x i32>, <8 x i32>, i8)

define <8 x i32>@test_int_x86_avx512_mask_permvar_si_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {		define <8 x i32>@test_int_x86_avx512_mask_permvar_si_256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3) {
; CHECK-LABEL: test_int_x86_avx512_mask_permvar_si_256:		; CHECK-LABEL: test_int_x86_avx512_mask_permvar_si_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpermd %ymm0, %ymm1, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x75,0x29,0x36,0xd0]		; CHECK-NEXT: vpermd %ymm0, %ymm1, %ymm2 {%k1} ## encoding: [0x62,0xf2,0x75,0x29,0x36,0xd0]
; CHECK-NEXT: vpermd %ymm0, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x75,0xa9,0x36,0xd8]		; CHECK-NEXT: vpermd %ymm0, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf2,0x75,0xa9,0x36,0xd8]
; CHECK-NEXT: vpermd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf2,0x75,0x28,0x36,0xc0]		; CHECK-NEXT: vpermd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x75,0x36,0xc0]
; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm1 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xcb]		; CHECK-NEXT: vpaddd %ymm3, %ymm2, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xcb]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.permvar.si.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)		%res = call <8 x i32> @llvm.x86.avx512.mask.permvar.si.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 %x3)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.permvar.si.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.permvar.si.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> zeroinitializer, i8 %x3)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.permvar.si.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.permvar.si.256(<8 x i32> %x0, <8 x i32> %x1, <8 x i32> %x2, i8 -1)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res3, %res2		%res4 = add <8 x i32> %res3, %res2
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <2 x double> @llvm.x86.avx512.mask.fixupimm.pd.128(<2 x double>, <2 x double>, <2 x i64>, i32, i8)		declare <2 x double> @llvm.x86.avx512.mask.fixupimm.pd.128(<2 x double>, <2 x double>, <2 x i64>, i32, i8)

define <2 x double>@test_int_x86_avx512_mask_fixupimm_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x i64> %x2, i8 %x4) {		define <2 x double>@test_int_x86_avx512_mask_fixupimm_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x i64> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_fixupimm_pd_128:		; CHECK-LABEL: test_int_x86_avx512_mask_fixupimm_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xd8]		; CHECK-NEXT: vmovapd %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xd8]
; CHECK-NEXT: vfixupimmpd $5, %xmm2, %xmm1, %xmm3 {%k1} ## encoding: [0x62,0xf3,0xf5,0x09,0x54,0xda,0x05]		; CHECK-NEXT: vfixupimmpd $5, %xmm2, %xmm1, %xmm3 {%k1} ## encoding: [0x62,0xf3,0xf5,0x09,0x54,0xda,0x05]
; CHECK-NEXT: vpxord %xmm4, %xmm4, %xmm4 ## encoding: [0x62,0xf1,0x5d,0x08,0xef,0xe4]		; CHECK-NEXT: vpxor %xmm4, %xmm4, %xmm4 ## EVEX TO VEX Compression encoding: [0xc5,0xd9,0xef,0xe4]
; CHECK-NEXT: vfixupimmpd $4, %xmm2, %xmm1, %xmm4 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0x89,0x54,0xe2,0x04]		; CHECK-NEXT: vfixupimmpd $4, %xmm2, %xmm1, %xmm4 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0x89,0x54,0xe2,0x04]
; CHECK-NEXT: vfixupimmpd $3, %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf3,0xf5,0x08,0x54,0xc2,0x03]		; CHECK-NEXT: vfixupimmpd $3, %xmm2, %xmm1, %xmm0 ## encoding: [0x62,0xf3,0xf5,0x08,0x54,0xc2,0x03]
; CHECK-NEXT: vaddpd %xmm4, %xmm3, %xmm1 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xcc]		; CHECK-NEXT: vaddpd %xmm4, %xmm3, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xcc]
; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0x58,0xc0]		; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.mask.fixupimm.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x i64> %x2, i32 5, i8 %x4)		%res = call <2 x double> @llvm.x86.avx512.mask.fixupimm.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x i64> %x2, i32 5, i8 %x4)
%res1 = call <2 x double> @llvm.x86.avx512.mask.fixupimm.pd.128(<2 x double> zeroinitializer, <2 x double> %x1, <2 x i64> %x2, i32 4, i8 %x4)		%res1 = call <2 x double> @llvm.x86.avx512.mask.fixupimm.pd.128(<2 x double> zeroinitializer, <2 x double> %x1, <2 x i64> %x2, i32 4, i8 %x4)
%res2 = call <2 x double> @llvm.x86.avx512.mask.fixupimm.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x i64> %x2, i32 3, i8 -1)		%res2 = call <2 x double> @llvm.x86.avx512.mask.fixupimm.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x i64> %x2, i32 3, i8 -1)
%res3 = fadd <2 x double> %res, %res1		%res3 = fadd <2 x double> %res, %res1
%res4 = fadd <2 x double> %res3, %res2		%res4 = fadd <2 x double> %res3, %res2
ret <2 x double> %res4		ret <2 x double> %res4
}		}

declare <2 x double> @llvm.x86.avx512.maskz.fixupimm.pd.128(<2 x double>, <2 x double>, <2 x i64>, i32, i8)		declare <2 x double> @llvm.x86.avx512.maskz.fixupimm.pd.128(<2 x double>, <2 x double>, <2 x i64>, i32, i8)

define <2 x double>@test_int_x86_avx512_maskz_fixupimm_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x i64> %x2, i8 %x4) {		define <2 x double>@test_int_x86_avx512_maskz_fixupimm_pd_128(<2 x double> %x0, <2 x double> %x1, <2 x i64> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_maskz_fixupimm_pd_128:		; CHECK-LABEL: test_int_x86_avx512_maskz_fixupimm_pd_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %xmm0, %xmm3 ## encoding: [0x62,0xf1,0xfd,0x08,0x28,0xd8]		; CHECK-NEXT: vmovapd %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x28,0xd8]
; CHECK-NEXT: vfixupimmpd $5, %xmm2, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0x89,0x54,0xda,0x05]		; CHECK-NEXT: vfixupimmpd $5, %xmm2, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0x89,0x54,0xda,0x05]
; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]		; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
; CHECK-NEXT: vfixupimmpd $3, %xmm2, %xmm1, %xmm0 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0x89,0x54,0xc2,0x03]		; CHECK-NEXT: vfixupimmpd $3, %xmm2, %xmm1, %xmm0 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0x89,0x54,0xc2,0x03]
; CHECK-NEXT: vaddpd %xmm0, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0xe5,0x08,0x58,0xc0]		; CHECK-NEXT: vaddpd %xmm0, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe1,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x double> @llvm.x86.avx512.maskz.fixupimm.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x i64> %x2, i32 5, i8 %x4)		%res = call <2 x double> @llvm.x86.avx512.maskz.fixupimm.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x i64> %x2, i32 5, i8 %x4)
%res1 = call <2 x double> @llvm.x86.avx512.maskz.fixupimm.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x i64> zeroinitializer, i32 3, i8 %x4)		%res1 = call <2 x double> @llvm.x86.avx512.maskz.fixupimm.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x i64> zeroinitializer, i32 3, i8 %x4)
;%res2 = call <2 x double> @llvm.x86.avx512.maskz.fixupimm.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x i64> %x2, i32 4, i8 -1)		;%res2 = call <2 x double> @llvm.x86.avx512.maskz.fixupimm.pd.128(<2 x double> %x0, <2 x double> %x1, <2 x i64> %x2, i32 4, i8 -1)
%res3 = fadd <2 x double> %res, %res1		%res3 = fadd <2 x double> %res, %res1
;%res4 = fadd <2 x double> %res3, %res2		;%res4 = fadd <2 x double> %res3, %res2
ret <2 x double> %res3		ret <2 x double> %res3
}		}

declare <4 x double> @llvm.x86.avx512.mask.fixupimm.pd.256(<4 x double>, <4 x double>, <4 x i64>, i32, i8)		declare <4 x double> @llvm.x86.avx512.mask.fixupimm.pd.256(<4 x double>, <4 x double>, <4 x i64>, i32, i8)

define <4 x double>@test_int_x86_avx512_mask_fixupimm_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x i64> %x2, i8 %x4) {		define <4 x double>@test_int_x86_avx512_mask_fixupimm_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x i64> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_fixupimm_pd_256:		; CHECK-LABEL: test_int_x86_avx512_mask_fixupimm_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xd8]		; CHECK-NEXT: vmovapd %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xd8]
; CHECK-NEXT: vfixupimmpd $4, %ymm2, %ymm1, %ymm3 {%k1} ## encoding: [0x62,0xf3,0xf5,0x29,0x54,0xda,0x04]		; CHECK-NEXT: vfixupimmpd $4, %ymm2, %ymm1, %ymm3 {%k1} ## encoding: [0x62,0xf3,0xf5,0x29,0x54,0xda,0x04]
; CHECK-NEXT: vpxord %ymm4, %ymm4, %ymm4 ## encoding: [0x62,0xf1,0x5d,0x28,0xef,0xe4]		; CHECK-NEXT: vpxor %ymm4, %ymm4, %ymm4 ## EVEX TO VEX Compression encoding: [0xc5,0xdd,0xef,0xe4]
; CHECK-NEXT: vfixupimmpd $5, %ymm2, %ymm1, %ymm4 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0xa9,0x54,0xe2,0x05]		; CHECK-NEXT: vfixupimmpd $5, %ymm2, %ymm1, %ymm4 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0xa9,0x54,0xe2,0x05]
; CHECK-NEXT: vfixupimmpd $3, %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0xf5,0x28,0x54,0xc2,0x03]		; CHECK-NEXT: vfixupimmpd $3, %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0xf5,0x28,0x54,0xc2,0x03]
; CHECK-NEXT: vaddpd %ymm4, %ymm3, %ymm1 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xcc]		; CHECK-NEXT: vaddpd %ymm4, %ymm3, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xcc]
; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.mask.fixupimm.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x i64> %x2, i32 4, i8 %x4)		%res = call <4 x double> @llvm.x86.avx512.mask.fixupimm.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x i64> %x2, i32 4, i8 %x4)
%res1 = call <4 x double> @llvm.x86.avx512.mask.fixupimm.pd.256(<4 x double> zeroinitializer, <4 x double> %x1, <4 x i64> %x2, i32 5, i8 %x4)		%res1 = call <4 x double> @llvm.x86.avx512.mask.fixupimm.pd.256(<4 x double> zeroinitializer, <4 x double> %x1, <4 x i64> %x2, i32 5, i8 %x4)
%res2 = call <4 x double> @llvm.x86.avx512.mask.fixupimm.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x i64> %x2, i32 3, i8 -1)		%res2 = call <4 x double> @llvm.x86.avx512.mask.fixupimm.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x i64> %x2, i32 3, i8 -1)
%res3 = fadd <4 x double> %res, %res1		%res3 = fadd <4 x double> %res, %res1
%res4 = fadd <4 x double> %res3, %res2		%res4 = fadd <4 x double> %res3, %res2
ret <4 x double> %res4		ret <4 x double> %res4
}		}

declare <4 x double> @llvm.x86.avx512.maskz.fixupimm.pd.256(<4 x double>, <4 x double>, <4 x i64>, i32, i8)		declare <4 x double> @llvm.x86.avx512.maskz.fixupimm.pd.256(<4 x double>, <4 x double>, <4 x i64>, i32, i8)

define <4 x double>@test_int_x86_avx512_maskz_fixupimm_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x i64> %x2, i8 %x4) {		define <4 x double>@test_int_x86_avx512_maskz_fixupimm_pd_256(<4 x double> %x0, <4 x double> %x1, <4 x i64> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_maskz_fixupimm_pd_256:		; CHECK-LABEL: test_int_x86_avx512_maskz_fixupimm_pd_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovapd %ymm0, %ymm3 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xd8]		; CHECK-NEXT: vmovapd %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xd8]
; CHECK-NEXT: vfixupimmpd $5, %ymm2, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0xa9,0x54,0xda,0x05]		; CHECK-NEXT: vfixupimmpd $5, %ymm2, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0xa9,0x54,0xda,0x05]
; CHECK-NEXT: vpxord %ymm4, %ymm4, %ymm4 ## encoding: [0x62,0xf1,0x5d,0x28,0xef,0xe4]		; CHECK-NEXT: vpxor %ymm4, %ymm4, %ymm4 ## EVEX TO VEX Compression encoding: [0xc5,0xdd,0xef,0xe4]
; CHECK-NEXT: vmovapd %ymm0, %ymm5 ## encoding: [0x62,0xf1,0xfd,0x28,0x28,0xe8]		; CHECK-NEXT: vmovapd %ymm0, %ymm5 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x28,0xe8]
; CHECK-NEXT: vfixupimmpd $4, %ymm4, %ymm1, %ymm5 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0xa9,0x54,0xec,0x04]		; CHECK-NEXT: vfixupimmpd $4, %ymm4, %ymm1, %ymm5 {%k1} {z} ## encoding: [0x62,0xf3,0xf5,0xa9,0x54,0xec,0x04]
; CHECK-NEXT: vfixupimmpd $3, %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0xf5,0x28,0x54,0xc2,0x03]		; CHECK-NEXT: vfixupimmpd $3, %ymm2, %ymm1, %ymm0 ## encoding: [0x62,0xf3,0xf5,0x28,0x54,0xc2,0x03]
; CHECK-NEXT: vaddpd %ymm5, %ymm3, %ymm1 ## encoding: [0x62,0xf1,0xe5,0x28,0x58,0xcd]		; CHECK-NEXT: vaddpd %ymm5, %ymm3, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xe5,0x58,0xcd]
; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0x58,0xc0]		; CHECK-NEXT: vaddpd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0x58,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x double> @llvm.x86.avx512.maskz.fixupimm.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x i64> %x2, i32 5, i8 %x4)		%res = call <4 x double> @llvm.x86.avx512.maskz.fixupimm.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x i64> %x2, i32 5, i8 %x4)
%res1 = call <4 x double> @llvm.x86.avx512.maskz.fixupimm.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x i64> zeroinitializer, i32 4, i8 %x4)		%res1 = call <4 x double> @llvm.x86.avx512.maskz.fixupimm.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x i64> zeroinitializer, i32 4, i8 %x4)
%res2 = call <4 x double> @llvm.x86.avx512.maskz.fixupimm.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x i64> %x2, i32 3, i8 -1)		%res2 = call <4 x double> @llvm.x86.avx512.maskz.fixupimm.pd.256(<4 x double> %x0, <4 x double> %x1, <4 x i64> %x2, i32 3, i8 -1)
%res3 = fadd <4 x double> %res, %res1		%res3 = fadd <4 x double> %res, %res1
%res4 = fadd <4 x double> %res3, %res2		%res4 = fadd <4 x double> %res3, %res2
ret <4 x double> %res4		ret <4 x double> %res4
}		}

declare <4 x float> @llvm.x86.avx512.mask.fixupimm.ps.128(<4 x float>, <4 x float>, <4 x i32>, i32, i8)		declare <4 x float> @llvm.x86.avx512.mask.fixupimm.ps.128(<4 x float>, <4 x float>, <4 x i32>, i32, i8)

define <4 x float>@test_int_x86_avx512_mask_fixupimm_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x i32> %x2, i8 %x4) {		define <4 x float>@test_int_x86_avx512_mask_fixupimm_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x i32> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_fixupimm_ps_128:		; CHECK-LABEL: test_int_x86_avx512_mask_fixupimm_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xd8]		; CHECK-NEXT: vmovaps %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xd8]
; CHECK-NEXT: vfixupimmps $5, %xmm2, %xmm1, %xmm3 {%k1} ## encoding: [0x62,0xf3,0x75,0x09,0x54,0xda,0x05]		; CHECK-NEXT: vfixupimmps $5, %xmm2, %xmm1, %xmm3 {%k1} ## encoding: [0x62,0xf3,0x75,0x09,0x54,0xda,0x05]
; CHECK-NEXT: vmovaps %xmm0, %xmm4 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xe0]		; CHECK-NEXT: vmovaps %xmm0, %xmm4 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xe0]
; CHECK-NEXT: vfixupimmps $5, %xmm2, %xmm1, %xmm4 ## encoding: [0x62,0xf3,0x75,0x08,0x54,0xe2,0x05]		; CHECK-NEXT: vfixupimmps $5, %xmm2, %xmm1, %xmm4 ## encoding: [0x62,0xf3,0x75,0x08,0x54,0xe2,0x05]
; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]		; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
; CHECK-NEXT: vfixupimmps $5, %xmm2, %xmm1, %xmm0 {%k1} ## encoding: [0x62,0xf3,0x75,0x09,0x54,0xc2,0x05]		; CHECK-NEXT: vfixupimmps $5, %xmm2, %xmm1, %xmm0 {%k1} ## encoding: [0x62,0xf3,0x75,0x09,0x54,0xc2,0x05]
; CHECK-NEXT: vaddps %xmm0, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc0]
; CHECK-NEXT: vaddps %xmm4, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x58,0xc4]		; CHECK-NEXT: vaddps %xmm4, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x58,0xc4]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.mask.fixupimm.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x i32> %x2, i32 5, i8 %x4)		%res = call <4 x float> @llvm.x86.avx512.mask.fixupimm.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x i32> %x2, i32 5, i8 %x4)
%res1 = call <4 x float> @llvm.x86.avx512.mask.fixupimm.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x i32> zeroinitializer, i32 5, i8 %x4)		%res1 = call <4 x float> @llvm.x86.avx512.mask.fixupimm.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x i32> zeroinitializer, i32 5, i8 %x4)
%res2 = call <4 x float> @llvm.x86.avx512.mask.fixupimm.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x i32> %x2, i32 5, i8 -1)		%res2 = call <4 x float> @llvm.x86.avx512.mask.fixupimm.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x i32> %x2, i32 5, i8 -1)
%res3 = fadd <4 x float> %res, %res1		%res3 = fadd <4 x float> %res, %res1
%res4 = fadd <4 x float> %res3, %res2		%res4 = fadd <4 x float> %res3, %res2
ret <4 x float> %res4		ret <4 x float> %res4
}		}

declare <4 x float> @llvm.x86.avx512.maskz.fixupimm.ps.128(<4 x float>, <4 x float>, <4 x i32>, i32, i8)		declare <4 x float> @llvm.x86.avx512.maskz.fixupimm.ps.128(<4 x float>, <4 x float>, <4 x i32>, i32, i8)

define <4 x float>@test_int_x86_avx512_maskz_fixupimm_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x i32> %x2, i8 %x4) {		define <4 x float>@test_int_x86_avx512_maskz_fixupimm_ps_128(<4 x float> %x0, <4 x float> %x1, <4 x i32> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_maskz_fixupimm_ps_128:		; CHECK-LABEL: test_int_x86_avx512_maskz_fixupimm_ps_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %xmm0, %xmm3 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xd8]		; CHECK-NEXT: vmovaps %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xd8]
; CHECK-NEXT: vfixupimmps $5, %xmm2, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf3,0x75,0x89,0x54,0xda,0x05]		; CHECK-NEXT: vfixupimmps $5, %xmm2, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf3,0x75,0x89,0x54,0xda,0x05]
; CHECK-NEXT: vmovaps %xmm0, %xmm4 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0xe0]		; CHECK-NEXT: vmovaps %xmm0, %xmm4 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0xe0]
; CHECK-NEXT: vfixupimmps $5, %xmm2, %xmm1, %xmm4 ## encoding: [0x62,0xf3,0x75,0x08,0x54,0xe2,0x05]		; CHECK-NEXT: vfixupimmps $5, %xmm2, %xmm1, %xmm4 ## encoding: [0x62,0xf3,0x75,0x08,0x54,0xe2,0x05]
; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]		; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
; CHECK-NEXT: vfixupimmps $5, %xmm2, %xmm1, %xmm0 {%k1} {z} ## encoding: [0x62,0xf3,0x75,0x89,0x54,0xc2,0x05]		; CHECK-NEXT: vfixupimmps $5, %xmm2, %xmm1, %xmm0 {%k1} {z} ## encoding: [0x62,0xf3,0x75,0x89,0x54,0xc2,0x05]
; CHECK-NEXT: vaddps %xmm0, %xmm3, %xmm0 ## encoding: [0x62,0xf1,0x64,0x08,0x58,0xc0]		; CHECK-NEXT: vaddps %xmm0, %xmm3, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe0,0x58,0xc0]
; CHECK-NEXT: vaddps %xmm4, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x58,0xc4]		; CHECK-NEXT: vaddps %xmm4, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x58,0xc4]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x float> @llvm.x86.avx512.maskz.fixupimm.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x i32> %x2, i32 5, i8 %x4)		%res = call <4 x float> @llvm.x86.avx512.maskz.fixupimm.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x i32> %x2, i32 5, i8 %x4)
%res1 = call <4 x float> @llvm.x86.avx512.maskz.fixupimm.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x i32> zeroinitializer, i32 5, i8 %x4)		%res1 = call <4 x float> @llvm.x86.avx512.maskz.fixupimm.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x i32> zeroinitializer, i32 5, i8 %x4)
%res2 = call <4 x float> @llvm.x86.avx512.maskz.fixupimm.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x i32> %x2, i32 5, i8 -1)		%res2 = call <4 x float> @llvm.x86.avx512.maskz.fixupimm.ps.128(<4 x float> %x0, <4 x float> %x1, <4 x i32> %x2, i32 5, i8 -1)
%res3 = fadd <4 x float> %res, %res1		%res3 = fadd <4 x float> %res, %res1
%res4 = fadd <4 x float> %res3, %res2		%res4 = fadd <4 x float> %res3, %res2
ret <4 x float> %res4		ret <4 x float> %res4
}		}

declare <8 x float> @llvm.x86.avx512.mask.fixupimm.ps.256(<8 x float>, <8 x float>, <8 x i32>, i32, i8)		declare <8 x float> @llvm.x86.avx512.mask.fixupimm.ps.256(<8 x float>, <8 x float>, <8 x i32>, i32, i8)

define <8 x float>@test_int_x86_avx512_mask_fixupimm_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x i32> %x2, i8 %x4) {		define <8 x float>@test_int_x86_avx512_mask_fixupimm_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x i32> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_mask_fixupimm_ps_256:		; CHECK-LABEL: test_int_x86_avx512_mask_fixupimm_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xd8]		; CHECK-NEXT: vmovaps %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xd8]
; CHECK-NEXT: vfixupimmps $5, %ymm2, %ymm1, %ymm3 {%k1} ## encoding: [0x62,0xf3,0x75,0x29,0x54,0xda,0x05]		; CHECK-NEXT: vfixupimmps $5, %ymm2, %ymm1, %ymm3 {%k1} ## encoding: [0x62,0xf3,0x75,0x29,0x54,0xda,0x05]
; CHECK-NEXT: vmovaps %ymm0, %ymm4 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xe0]		; CHECK-NEXT: vmovaps %ymm0, %ymm4 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xe0]
; CHECK-NEXT: vfixupimmps $5, %ymm2, %ymm1, %ymm4 ## encoding: [0x62,0xf3,0x75,0x28,0x54,0xe2,0x05]		; CHECK-NEXT: vfixupimmps $5, %ymm2, %ymm1, %ymm4 ## encoding: [0x62,0xf3,0x75,0x28,0x54,0xe2,0x05]
; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0xef,0xd2]		; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xef,0xd2]
; CHECK-NEXT: vfixupimmps $5, %ymm2, %ymm1, %ymm0 {%k1} ## encoding: [0x62,0xf3,0x75,0x29,0x54,0xc2,0x05]		; CHECK-NEXT: vfixupimmps $5, %ymm2, %ymm1, %ymm0 {%k1} ## encoding: [0x62,0xf3,0x75,0x29,0x54,0xc2,0x05]
; CHECK-NEXT: vaddps %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc0]
; CHECK-NEXT: vaddps %ymm4, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x58,0xc4]		; CHECK-NEXT: vaddps %ymm4, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x58,0xc4]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.mask.fixupimm.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x i32> %x2, i32 5, i8 %x4)		%res = call <8 x float> @llvm.x86.avx512.mask.fixupimm.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x i32> %x2, i32 5, i8 %x4)
%res1 = call <8 x float> @llvm.x86.avx512.mask.fixupimm.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x i32> zeroinitializer, i32 5, i8 %x4)		%res1 = call <8 x float> @llvm.x86.avx512.mask.fixupimm.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x i32> zeroinitializer, i32 5, i8 %x4)
%res2 = call <8 x float> @llvm.x86.avx512.mask.fixupimm.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x i32> %x2, i32 5, i8 -1)		%res2 = call <8 x float> @llvm.x86.avx512.mask.fixupimm.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x i32> %x2, i32 5, i8 -1)
%res3 = fadd <8 x float> %res, %res1		%res3 = fadd <8 x float> %res, %res1
%res4 = fadd <8 x float> %res3, %res2		%res4 = fadd <8 x float> %res3, %res2
ret <8 x float> %res4		ret <8 x float> %res4
}		}

declare <8 x float> @llvm.x86.avx512.maskz.fixupimm.ps.256(<8 x float>, <8 x float>, <8 x i32>, i32, i8)		declare <8 x float> @llvm.x86.avx512.maskz.fixupimm.ps.256(<8 x float>, <8 x float>, <8 x i32>, i32, i8)

define <8 x float>@test_int_x86_avx512_maskz_fixupimm_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x i32> %x2, i8 %x4) {		define <8 x float>@test_int_x86_avx512_maskz_fixupimm_ps_256(<8 x float> %x0, <8 x float> %x1, <8 x i32> %x2, i8 %x4) {
; CHECK-LABEL: test_int_x86_avx512_maskz_fixupimm_ps_256:		; CHECK-LABEL: test_int_x86_avx512_maskz_fixupimm_ps_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vmovaps %ymm0, %ymm3 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xd8]		; CHECK-NEXT: vmovaps %ymm0, %ymm3 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xd8]
; CHECK-NEXT: vfixupimmps $5, %ymm2, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0x75,0xa9,0x54,0xda,0x05]		; CHECK-NEXT: vfixupimmps $5, %ymm2, %ymm1, %ymm3 {%k1} {z} ## encoding: [0x62,0xf3,0x75,0xa9,0x54,0xda,0x05]
; CHECK-NEXT: vmovaps %ymm0, %ymm4 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0xe0]		; CHECK-NEXT: vmovaps %ymm0, %ymm4 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0xe0]
; CHECK-NEXT: vfixupimmps $5, %ymm2, %ymm1, %ymm4 ## encoding: [0x62,0xf3,0x75,0x28,0x54,0xe2,0x05]		; CHECK-NEXT: vfixupimmps $5, %ymm2, %ymm1, %ymm4 ## encoding: [0x62,0xf3,0x75,0x28,0x54,0xe2,0x05]
; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0xef,0xd2]		; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xef,0xd2]
; CHECK-NEXT: vfixupimmps $5, %ymm2, %ymm1, %ymm0 {%k1} {z} ## encoding: [0x62,0xf3,0x75,0xa9,0x54,0xc2,0x05]		; CHECK-NEXT: vfixupimmps $5, %ymm2, %ymm1, %ymm0 {%k1} {z} ## encoding: [0x62,0xf3,0x75,0xa9,0x54,0xc2,0x05]
; CHECK-NEXT: vaddps %ymm0, %ymm3, %ymm0 ## encoding: [0x62,0xf1,0x64,0x28,0x58,0xc0]		; CHECK-NEXT: vaddps %ymm0, %ymm3, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe4,0x58,0xc0]
; CHECK-NEXT: vaddps %ymm4, %ymm0, %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x58,0xc4]		; CHECK-NEXT: vaddps %ymm4, %ymm0, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x58,0xc4]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x float> @llvm.x86.avx512.maskz.fixupimm.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x i32> %x2, i32 5, i8 %x4)		%res = call <8 x float> @llvm.x86.avx512.maskz.fixupimm.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x i32> %x2, i32 5, i8 %x4)
%res1 = call <8 x float> @llvm.x86.avx512.maskz.fixupimm.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x i32> zeroinitializer, i32 5, i8 %x4)		%res1 = call <8 x float> @llvm.x86.avx512.maskz.fixupimm.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x i32> zeroinitializer, i32 5, i8 %x4)
%res2 = call <8 x float> @llvm.x86.avx512.maskz.fixupimm.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x i32> %x2, i32 5, i8 -1)		%res2 = call <8 x float> @llvm.x86.avx512.maskz.fixupimm.ps.256(<8 x float> %x0, <8 x float> %x1, <8 x i32> %x2, i32 5, i8 -1)
%res3 = fadd <8 x float> %res, %res1		%res3 = fadd <8 x float> %res, %res1
%res4 = fadd <8 x float> %res3, %res2		%res4 = fadd <8 x float> %res3, %res2
ret <8 x float> %res4		ret <8 x float> %res4
}		}
[146 lines not shown]

define <8 x i32>@test_int_x86_avx512_mask_pbroadcast_d_gpr_256(i32 %x0, <8 x i32> %x1, i8 %mask) {		define <8 x i32>@test_int_x86_avx512_mask_pbroadcast_d_gpr_256(i32 %x0, <8 x i32> %x1, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_d_gpr_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_d_gpr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpbroadcastd %edi, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x7c,0xc7]		; CHECK-NEXT: vpbroadcastd %edi, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x7c,0xc7]
; CHECK-NEXT: vpbroadcastd %edi, %ymm1 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x7c,0xcf]		; CHECK-NEXT: vpbroadcastd %edi, %ymm1 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0xa9,0x7c,0xcf]
; CHECK-NEXT: vpbroadcastd %edi, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x7c,0xd7]		; CHECK-NEXT: vpbroadcastd %edi, %ymm2 ## encoding: [0x62,0xf2,0x7d,0x28,0x7c,0xd7]
; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0x6d,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xfe,0xc0]
; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0x75,0x28,0xfe,0xc0]		; CHECK-NEXT: vpaddd %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <8 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.256(i32 %x0, <8 x i32> %x1, i8 -1)		%res = call <8 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.256(i32 %x0, <8 x i32> %x1, i8 -1)
%res1 = call <8 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.256(i32 %x0, <8 x i32> %x1, i8 %mask)		%res1 = call <8 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.256(i32 %x0, <8 x i32> %x1, i8 %mask)
%res2 = call <8 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.256(i32 %x0, <8 x i32> zeroinitializer, i8 %mask)		%res2 = call <8 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.256(i32 %x0, <8 x i32> zeroinitializer, i8 %mask)
%res3 = add <8 x i32> %res, %res1		%res3 = add <8 x i32> %res, %res1
%res4 = add <8 x i32> %res2, %res3		%res4 = add <8 x i32> %res2, %res3
ret <8 x i32> %res4		ret <8 x i32> %res4
}		}

declare <4 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.128(i32, <4 x i32>, i8)		declare <4 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.128(i32, <4 x i32>, i8)

define <4 x i32>@test_int_x86_avx512_mask_pbroadcast_d_gpr_128(i32 %x0, <4 x i32> %x1, i8 %mask) {		define <4 x i32>@test_int_x86_avx512_mask_pbroadcast_d_gpr_128(i32 %x0, <4 x i32> %x1, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_d_gpr_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_d_gpr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpbroadcastd %edi, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x7c,0xc7]		; CHECK-NEXT: vpbroadcastd %edi, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x7c,0xc7]
; CHECK-NEXT: vpbroadcastd %edi, %xmm1 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x7c,0xcf]		; CHECK-NEXT: vpbroadcastd %edi, %xmm1 {%k1} {z} ## encoding: [0x62,0xf2,0x7d,0x89,0x7c,0xcf]
; CHECK-NEXT: vpbroadcastd %edi, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x7c,0xd7]		; CHECK-NEXT: vpbroadcastd %edi, %xmm2 ## encoding: [0x62,0xf2,0x7d,0x08,0x7c,0xd7]
; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0x6d,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xfe,0xc0]
; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x75,0x08,0xfe,0xc0]		; CHECK-NEXT: vpaddd %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xfe,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.128(i32 %x0, <4 x i32> %x1, i8 -1)		%res = call <4 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.128(i32 %x0, <4 x i32> %x1, i8 -1)
%res1 = call <4 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.128(i32 %x0, <4 x i32> %x1, i8 %mask)		%res1 = call <4 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.128(i32 %x0, <4 x i32> %x1, i8 %mask)
%res2 = call <4 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.128(i32 %x0, <4 x i32> zeroinitializer, i8 %mask)		%res2 = call <4 x i32> @llvm.x86.avx512.mask.pbroadcast.d.gpr.128(i32 %x0, <4 x i32> zeroinitializer, i8 %mask)
%res3 = add <4 x i32> %res, %res1		%res3 = add <4 x i32> %res, %res1
%res4 = add <4 x i32> %res2, %res3		%res4 = add <4 x i32> %res2, %res3
ret <4 x i32> %res4		ret <4 x i32> %res4
}		}

declare <4 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.256(i64, <4 x i64>, i8)		declare <4 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.256(i64, <4 x i64>, i8)

define <4 x i64>@test_int_x86_avx512_mask_pbroadcast_q_gpr_256(i64 %x0, <4 x i64> %x1, i8 %mask) {		define <4 x i64>@test_int_x86_avx512_mask_pbroadcast_q_gpr_256(i64 %x0, <4 x i64> %x1, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_q_gpr_256:		; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_q_gpr_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpbroadcastq %rdi, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x7c,0xc7]		; CHECK-NEXT: vpbroadcastq %rdi, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x7c,0xc7]
; CHECK-NEXT: vpbroadcastq %rdi, %ymm1 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x7c,0xcf]		; CHECK-NEXT: vpbroadcastq %rdi, %ymm1 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0xa9,0x7c,0xcf]
; CHECK-NEXT: vpbroadcastq %rdi, %ymm2 ## encoding: [0x62,0xf2,0xfd,0x28,0x7c,0xd7]		; CHECK-NEXT: vpbroadcastq %rdi, %ymm2 ## encoding: [0x62,0xf2,0xfd,0x28,0x7c,0xd7]
; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xed,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xd4,0xc0]
; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xf5,0x28,0xd4,0xc0]		; CHECK-NEXT: vpaddq %ymm0, %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.256(i64 %x0, <4 x i64> %x1,i8 -1)		%res = call <4 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.256(i64 %x0, <4 x i64> %x1,i8 -1)
%res1 = call <4 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.256(i64 %x0, <4 x i64> %x1,i8 %mask)		%res1 = call <4 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.256(i64 %x0, <4 x i64> %x1,i8 %mask)
%res2 = call <4 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.256(i64 %x0, <4 x i64> zeroinitializer,i8 %mask)		%res2 = call <4 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.256(i64 %x0, <4 x i64> zeroinitializer,i8 %mask)
%res3 = add <4 x i64> %res, %res1		%res3 = add <4 x i64> %res, %res1
%res4 = add <4 x i64> %res2, %res3		%res4 = add <4 x i64> %res2, %res3
ret <4 x i64> %res4		ret <4 x i64> %res4
}		}

declare <2 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.128(i64, <2 x i64>, i8)		declare <2 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.128(i64, <2 x i64>, i8)

define <2 x i64>@test_int_x86_avx512_mask_pbroadcast_q_gpr_128(i64 %x0, <2 x i64> %x1, i8 %mask) {		define <2 x i64>@test_int_x86_avx512_mask_pbroadcast_q_gpr_128(i64 %x0, <2 x i64> %x1, i8 %mask) {
; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_q_gpr_128:		; CHECK-LABEL: test_int_x86_avx512_mask_pbroadcast_q_gpr_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]		; CHECK-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]
; CHECK-NEXT: vpbroadcastq %rdi, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x7c,0xc7]		; CHECK-NEXT: vpbroadcastq %rdi, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x7c,0xc7]
; CHECK-NEXT: vpbroadcastq %rdi, %xmm1 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x7c,0xcf]		; CHECK-NEXT: vpbroadcastq %rdi, %xmm1 {%k1} {z} ## encoding: [0x62,0xf2,0xfd,0x89,0x7c,0xcf]
; CHECK-NEXT: vpbroadcastq %rdi, %xmm2 ## encoding: [0x62,0xf2,0xfd,0x08,0x7c,0xd7]		; CHECK-NEXT: vpbroadcastq %rdi, %xmm2 ## encoding: [0x62,0xf2,0xfd,0x08,0x7c,0xd7]
; CHECK-NEXT: vpaddq %xmm0, %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xed,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xd4,0xc0]
; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xf5,0x08,0xd4,0xc0]		; CHECK-NEXT: vpaddq %xmm0, %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xd4,0xc0]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.128(i64 %x0, <2 x i64> %x1,i8 -1)		%res = call <2 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.128(i64 %x0, <2 x i64> %x1,i8 -1)
%res1 = call <2 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.128(i64 %x0, <2 x i64> %x1,i8 %mask)		%res1 = call <2 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.128(i64 %x0, <2 x i64> %x1,i8 %mask)
%res2 = call <2 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.128(i64 %x0, <2 x i64> zeroinitializer,i8 %mask)		%res2 = call <2 x i64> @llvm.x86.avx512.mask.pbroadcast.q.gpr.128(i64 %x0, <2 x i64> zeroinitializer,i8 %mask)
%res3 = add <2 x i64> %res, %res1		%res3 = add <2 x i64> %res, %res1
%res4 = add <2 x i64> %res2, %res3		%res4 = add <2 x i64> %res2, %res3
ret <2 x i64> %res4		ret <2 x i64> %res4
}		}


define <2 x i64> @test_x86_avx512_psra_q_128(<2 x i64> %a0, <2 x i64> %a1) {		define <2 x i64> @test_x86_avx512_psra_q_128(<2 x i64> %a0, <2 x i64> %a1) {
; CHECK-LABEL: test_x86_avx512_psra_q_128:		; CHECK-LABEL: test_x86_avx512_psra_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: vpsraq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xe2,0xc1]		; CHECK-NEXT: vpsraq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xe2,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.psra.q.128(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]		%res = call <2 x i64> @llvm.x86.avx512.psra.q.128(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]
ret <2 x i64> %res		ret <2 x i64> %res
}		}
define <2 x i64> @test_x86_avx512_mask_psra_q_128(<2 x i64> %a0, <2 x i64> %a1, <2 x i64> %passthru, i8 %mask) {		define <2 x i64> @test_x86_avx512_mask_psra_q_128(<2 x i64> %a0, <2 x i64> %a1, <2 x i64> %passthru, i8 %mask) {
; CHECK-LABEL: test_x86_avx512_mask_psra_q_128:		; CHECK-LABEL: test_x86_avx512_mask_psra_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsraq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xe2,0xd1]		; CHECK-NEXT: vpsraq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x09,0xe2,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.psra.q.128(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]		%res = call <2 x i64> @llvm.x86.avx512.psra.q.128(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <2 x i32> <i32 0, i32 1>		%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <2 x i32> <i32 0, i32 1>
%res2 = select <2 x i1> %mask.extract, <2 x i64> %res, <2 x i64> %passthru		%res2 = select <2 x i1> %mask.extract, <2 x i64> %res, <2 x i64> %passthru
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}
define <2 x i64> @test_x86_avx512_maskz_psra_q_128(<2 x i64> %a0, <2 x i64> %a1, i8 %mask) {		define <2 x i64> @test_x86_avx512_maskz_psra_q_128(<2 x i64> %a0, <2 x i64> %a1, i8 %mask) {
[19 lines not shown]
%res = call <4 x i64> @llvm.x86.avx512.psra.q.256(<4 x i64> %a0, <2 x i64> %a1) ; <<4 x i64>> [#uses=1]		%res = call <4 x i64> @llvm.x86.avx512.psra.q.256(<4 x i64> %a0, <2 x i64> %a1) ; <<4 x i64>> [#uses=1]
ret <4 x i64> %res		ret <4 x i64> %res
}		}
define <4 x i64> @test_x86_avx512_mask_psra_q_256(<4 x i64> %a0, <2 x i64> %a1, <4 x i64> %passthru, i8 %mask) {		define <4 x i64> @test_x86_avx512_mask_psra_q_256(<4 x i64> %a0, <2 x i64> %a1, <4 x i64> %passthru, i8 %mask) {
; CHECK-LABEL: test_x86_avx512_mask_psra_q_256:		; CHECK-LABEL: test_x86_avx512_mask_psra_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsraq %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xe2,0xd1]		; CHECK-NEXT: vpsraq %xmm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf1,0xfd,0x29,0xe2,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.psra.q.256(<4 x i64> %a0, <2 x i64> %a1) ; <<4 x i64>> [#uses=1]		%res = call <4 x i64> @llvm.x86.avx512.psra.q.256(<4 x i64> %a0, <2 x i64> %a1) ; <<4 x i64>> [#uses=1]
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>		%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%res2 = select <4 x i1> %mask.extract, <4 x i64> %res, <4 x i64> %passthru		%res2 = select <4 x i1> %mask.extract, <4 x i64> %res, <4 x i64> %passthru
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}
define <4 x i64> @test_x86_avx512_maskz_psra_q_256(<4 x i64> %a0, <2 x i64> %a1, <4 x i64> %passthru, i8 %mask) {		define <4 x i64> @test_x86_avx512_maskz_psra_q_256(<4 x i64> %a0, <2 x i64> %a1, <4 x i64> %passthru, i8 %mask) {
[19 lines not shown]
%res = call <2 x i64> @llvm.x86.avx512.psrai.q.128(<2 x i64> %a0, i32 7) ; <<2 x i64>> [#uses=1]		%res = call <2 x i64> @llvm.x86.avx512.psrai.q.128(<2 x i64> %a0, i32 7) ; <<2 x i64>> [#uses=1]
ret <2 x i64> %res		ret <2 x i64> %res
}		}
define <2 x i64> @test_x86_avx512_mask_psrai_q_128(<2 x i64> %a0, <2 x i64> %passthru, i8 %mask) {		define <2 x i64> @test_x86_avx512_mask_psrai_q_128(<2 x i64> %a0, <2 x i64> %passthru, i8 %mask) {
; CHECK-LABEL: test_x86_avx512_mask_psrai_q_128:		; CHECK-LABEL: test_x86_avx512_mask_psrai_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsraq $7, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x09,0x72,0xe0,0x07]		; CHECK-NEXT: vpsraq $7, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x09,0x72,0xe0,0x07]
; CHECK-NEXT: vmovdqa64 %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.psrai.q.128(<2 x i64> %a0, i32 7) ; <<2 x i64>> [#uses=1]		%res = call <2 x i64> @llvm.x86.avx512.psrai.q.128(<2 x i64> %a0, i32 7) ; <<2 x i64>> [#uses=1]
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <2 x i32> <i32 0, i32 1>		%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <2 x i32> <i32 0, i32 1>
%res2 = select <2 x i1> %mask.extract, <2 x i64> %res, <2 x i64> %passthru		%res2 = select <2 x i1> %mask.extract, <2 x i64> %res, <2 x i64> %passthru
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}
define <2 x i64> @test_x86_avx512_maskz_psrai_q_128(<2 x i64> %a0, i8 %mask) {		define <2 x i64> @test_x86_avx512_maskz_psrai_q_128(<2 x i64> %a0, i8 %mask) {
[19 lines not shown]
%res = call <4 x i64> @llvm.x86.avx512.psrai.q.256(<4 x i64> %a0, i32 7) ; <<4 x i64>> [#uses=1]		%res = call <4 x i64> @llvm.x86.avx512.psrai.q.256(<4 x i64> %a0, i32 7) ; <<4 x i64>> [#uses=1]
ret <4 x i64> %res		ret <4 x i64> %res
}		}
define <4 x i64> @test_x86_avx512_mask_psrai_q_256(<4 x i64> %a0, <4 x i64> %passthru, i8 %mask) {		define <4 x i64> @test_x86_avx512_mask_psrai_q_256(<4 x i64> %a0, <4 x i64> %passthru, i8 %mask) {
; CHECK-LABEL: test_x86_avx512_mask_psrai_q_256:		; CHECK-LABEL: test_x86_avx512_mask_psrai_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsraq $7, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x29,0x72,0xe0,0x07]		; CHECK-NEXT: vpsraq $7, %ymm0, %ymm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x29,0x72,0xe0,0x07]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc1]		; CHECK-NEXT: vmovdqa %ymm1, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc1]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.psrai.q.256(<4 x i64> %a0, i32 7) ; <<4 x i64>> [#uses=1]		%res = call <4 x i64> @llvm.x86.avx512.psrai.q.256(<4 x i64> %a0, i32 7) ; <<4 x i64>> [#uses=1]
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>		%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%res2 = select <4 x i1> %mask.extract, <4 x i64> %res, <4 x i64> %passthru		%res2 = select <4 x i1> %mask.extract, <4 x i64> %res, <4 x i64> %passthru
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}
define <4 x i64> @test_x86_avx512_maskz_psrai_q_256(<4 x i64> %a0, i8 %mask) {		define <4 x i64> @test_x86_avx512_maskz_psrai_q_256(<4 x i64> %a0, i8 %mask) {
[19 lines not shown]
ret <2 x i64> %res		ret <2 x i64> %res
}		}

define <2 x i64> @test_x86_avx512_mask_psrav_q_128(<2 x i64> %a0, <2 x i64> %a1, <2 x i64> %a2, i8 %mask) {		define <2 x i64> @test_x86_avx512_mask_psrav_q_128(<2 x i64> %a0, <2 x i64> %a1, <2 x i64> %a2, i8 %mask) {
; CHECK-LABEL: test_x86_avx512_mask_psrav_q_128:		; CHECK-LABEL: test_x86_avx512_mask_psrav_q_128:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsravq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x46,0xd1]		; CHECK-NEXT: vpsravq %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x46,0xd1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %xmm2, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <2 x i64> @llvm.x86.avx512.psrav.q.128(<2 x i64> %a0, <2 x i64> %a1)		%res = call <2 x i64> @llvm.x86.avx512.psrav.q.128(<2 x i64> %a0, <2 x i64> %a1)
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <2 x i32> <i32 0, i32 1>		%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <2 x i32> <i32 0, i32 1>
%res2 = select <2 x i1> %mask.extract, <2 x i64> %res, <2 x i64> %a2		%res2 = select <2 x i1> %mask.extract, <2 x i64> %res, <2 x i64> %a2
ret <2 x i64> %res2		ret <2 x i64> %res2
}		}

[21 lines not shown]
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <4 x i64> @test_x86_avx512_mask_psrav_q_256(<4 x i64> %a0, <4 x i64> %a1, <4 x i64> %a2, i8 %mask) {		define <4 x i64> @test_x86_avx512_mask_psrav_q_256(<4 x i64> %a0, <4 x i64> %a1, <4 x i64> %a2, i8 %mask) {
; CHECK-LABEL: test_x86_avx512_mask_psrav_q_256:		; CHECK-LABEL: test_x86_avx512_mask_psrav_q_256:
; CHECK: ## BB#0:		; CHECK: ## BB#0:
; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]		; CHECK-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]
; CHECK-NEXT: vpsravq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x46,0xd1]		; CHECK-NEXT: vpsravq %ymm1, %ymm0, %ymm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x46,0xd1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0 ## encoding: [0x62,0xf1,0xfd,0x28,0x6f,0xc2]		; CHECK-NEXT: vmovdqa %ymm2, %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfd,0x6f,0xc2]
; CHECK-NEXT: retq ## encoding: [0xc3]		; CHECK-NEXT: retq ## encoding: [0xc3]
%res = call <4 x i64> @llvm.x86.avx512.psrav.q.256(<4 x i64> %a0, <4 x i64> %a1)		%res = call <4 x i64> @llvm.x86.avx512.psrav.q.256(<4 x i64> %a0, <4 x i64> %a1)
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>		%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%res2 = select <4 x i1> %mask.extract, <4 x i64> %res, <4 x i64> %a2		%res2 = select <4 x i1> %mask.extract, <4 x i64> %res, <4 x i64> %a2
ret <4 x i64> %res2		ret <4 x i64> %res2
}		}

[14 lines not shown]

llvm/trunk/test/CodeGen/X86/avx512vl-logic.ll

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512vl | FileCheck %s --check-prefix=CHECK --check-prefix=KNL		; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512vl | FileCheck %s --check-prefix=CHECK --check-prefix=KNL
; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=skx -mattr=+avx512vl | FileCheck %s --check-prefix=CHECK --check-prefix=SKX		; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=skx -mattr=+avx512vl | FileCheck %s --check-prefix=CHECK --check-prefix=SKX

; 256-bit		; 256-bit

define <8 x i32> @vpandd256(<8 x i32> %a, <8 x i32> %b) nounwind uwtable readnone ssp {		define <8 x i32> @vpandd256(<8 x i32> %a, <8 x i32> %b) nounwind uwtable readnone ssp {
; CHECK-LABEL: vpandd256:		; CHECK-LABEL: vpandd256:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: vpaddd {{.*}}(%rip){1to8}, %ymm0, %ymm0		; CHECK-NEXT: vpaddd {{.*}}(%rip){1to8}, %ymm0, %ymm0
; CHECK-NEXT: vpandd %ymm1, %ymm0, %ymm0		; CHECK-NEXT: vpand %ymm1, %ymm0, %ymm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
entry:		entry:
; Force the execution domain with an add.		; Force the execution domain with an add.
%a2 = add <8 x i32> %a, <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>		%a2 = add <8 x i32> %a, <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
%x = and <8 x i32> %a2, %b		%x = and <8 x i32> %a2, %b
ret <8 x i32> %x		ret <8 x i32> %x
}		}

[10 lines not shown]
%x = and <8 x i32> %a2, %b2		%x = and <8 x i32> %a2, %b2
ret <8 x i32> %x		ret <8 x i32> %x
}		}

define <8 x i32> @vpord256(<8 x i32> %a, <8 x i32> %b) nounwind uwtable readnone ssp {		define <8 x i32> @vpord256(<8 x i32> %a, <8 x i32> %b) nounwind uwtable readnone ssp {
; CHECK-LABEL: vpord256:		; CHECK-LABEL: vpord256:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: vpaddd {{.*}}(%rip){1to8}, %ymm0, %ymm0		; CHECK-NEXT: vpaddd {{.*}}(%rip){1to8}, %ymm0, %ymm0
; CHECK-NEXT: vpord %ymm1, %ymm0, %ymm0		; CHECK-NEXT: vpor %ymm1, %ymm0, %ymm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
entry:		entry:
; Force the execution domain with an add.		; Force the execution domain with an add.
%a2 = add <8 x i32> %a, <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>		%a2 = add <8 x i32> %a, <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
%x = or <8 x i32> %a2, %b		%x = or <8 x i32> %a2, %b
ret <8 x i32> %x		ret <8 x i32> %x
}		}

define <8 x i32> @vpxord256(<8 x i32> %a, <8 x i32> %b) nounwind uwtable readnone ssp {		define <8 x i32> @vpxord256(<8 x i32> %a, <8 x i32> %b) nounwind uwtable readnone ssp {
; CHECK-LABEL: vpxord256:		; CHECK-LABEL: vpxord256:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: vpaddd {{.*}}(%rip){1to8}, %ymm0, %ymm0		; CHECK-NEXT: vpaddd {{.*}}(%rip){1to8}, %ymm0, %ymm0
; CHECK-NEXT: vpxord %ymm1, %ymm0, %ymm0		; CHECK-NEXT: vpxor %ymm1, %ymm0, %ymm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
entry:		entry:
; Force the execution domain with an add.		; Force the execution domain with an add.
%a2 = add <8 x i32> %a, <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>		%a2 = add <8 x i32> %a, <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
%x = xor <8 x i32> %a2, %b		%x = xor <8 x i32> %a2, %b
ret <8 x i32> %x		ret <8 x i32> %x
}		}

define <4 x i64> @vpandq256(<4 x i64> %a, <4 x i64> %b) nounwind uwtable readnone ssp {		define <4 x i64> @vpandq256(<4 x i64> %a, <4 x i64> %b) nounwind uwtable readnone ssp {
; CHECK-LABEL: vpandq256:		; CHECK-LABEL: vpandq256:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: vpaddq {{.*}}(%rip){1to4}, %ymm0, %ymm0		; CHECK-NEXT: vpaddq {{.*}}(%rip){1to4}, %ymm0, %ymm0
; CHECK-NEXT: vpandq %ymm1, %ymm0, %ymm0		; CHECK-NEXT: vpand %ymm1, %ymm0, %ymm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
entry:		entry:
; Force the execution domain with an add.		; Force the execution domain with an add.
%a2 = add <4 x i64> %a, <i64 1, i64 1, i64 1, i64 1>		%a2 = add <4 x i64> %a, <i64 1, i64 1, i64 1, i64 1>
%x = and <4 x i64> %a2, %b		%x = and <4 x i64> %a2, %b
ret <4 x i64> %x		ret <4 x i64> %x
}		}

[10 lines not shown]
%x = and <4 x i64> %a2, %b2		%x = and <4 x i64> %a2, %b2
ret <4 x i64> %x		ret <4 x i64> %x
}		}

define <4 x i64> @vporq256(<4 x i64> %a, <4 x i64> %b) nounwind uwtable readnone ssp {		define <4 x i64> @vporq256(<4 x i64> %a, <4 x i64> %b) nounwind uwtable readnone ssp {
; CHECK-LABEL: vporq256:		; CHECK-LABEL: vporq256:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: vpaddq {{.*}}(%rip){1to4}, %ymm0, %ymm0		; CHECK-NEXT: vpaddq {{.*}}(%rip){1to4}, %ymm0, %ymm0
; CHECK-NEXT: vporq %ymm1, %ymm0, %ymm0		; CHECK-NEXT: vpor %ymm1, %ymm0, %ymm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
entry:		entry:
; Force the execution domain with an add.		; Force the execution domain with an add.
%a2 = add <4 x i64> %a, <i64 1, i64 1, i64 1, i64 1>		%a2 = add <4 x i64> %a, <i64 1, i64 1, i64 1, i64 1>
%x = or <4 x i64> %a2, %b		%x = or <4 x i64> %a2, %b
ret <4 x i64> %x		ret <4 x i64> %x
}		}

define <4 x i64> @vpxorq256(<4 x i64> %a, <4 x i64> %b) nounwind uwtable readnone ssp {		define <4 x i64> @vpxorq256(<4 x i64> %a, <4 x i64> %b) nounwind uwtable readnone ssp {
; CHECK-LABEL: vpxorq256:		; CHECK-LABEL: vpxorq256:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: vpaddq {{.*}}(%rip){1to4}, %ymm0, %ymm0		; CHECK-NEXT: vpaddq {{.*}}(%rip){1to4}, %ymm0, %ymm0
; CHECK-NEXT: vpxorq %ymm1, %ymm0, %ymm0		; CHECK-NEXT: vpxor %ymm1, %ymm0, %ymm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
entry:		entry:
; Force the execution domain with an add.		; Force the execution domain with an add.
%a2 = add <4 x i64> %a, <i64 1, i64 1, i64 1, i64 1>		%a2 = add <4 x i64> %a, <i64 1, i64 1, i64 1, i64 1>
%x = xor <4 x i64> %a2, %b		%x = xor <4 x i64> %a2, %b
ret <4 x i64> %x		ret <4 x i64> %x
}		}

; 128-bit		; 128-bit

define <4 x i32> @vpandd128(<4 x i32> %a, <4 x i32> %b) nounwind uwtable readnone ssp {		define <4 x i32> @vpandd128(<4 x i32> %a, <4 x i32> %b) nounwind uwtable readnone ssp {
; CHECK-LABEL: vpandd128:		; CHECK-LABEL: vpandd128:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: vpaddd {{.*}}(%rip){1to4}, %xmm0, %xmm0		; CHECK-NEXT: vpaddd {{.*}}(%rip){1to4}, %xmm0, %xmm0
; CHECK-NEXT: vpandd %xmm1, %xmm0, %xmm0		; CHECK-NEXT: vpand %xmm1, %xmm0, %xmm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
entry:		entry:
; Force the execution domain with an add.		; Force the execution domain with an add.
%a2 = add <4 x i32> %a, <i32 1, i32 1, i32 1, i32 1>		%a2 = add <4 x i32> %a, <i32 1, i32 1, i32 1, i32 1>
%x = and <4 x i32> %a2, %b		%x = and <4 x i32> %a2, %b
ret <4 x i32> %x		ret <4 x i32> %x
}		}

[10 lines not shown]
%x = and <4 x i32> %a2, %b2		%x = and <4 x i32> %a2, %b2
ret <4 x i32> %x		ret <4 x i32> %x
}		}

define <4 x i32> @vpord128(<4 x i32> %a, <4 x i32> %b) nounwind uwtable readnone ssp {		define <4 x i32> @vpord128(<4 x i32> %a, <4 x i32> %b) nounwind uwtable readnone ssp {
; CHECK-LABEL: vpord128:		; CHECK-LABEL: vpord128:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: vpaddd {{.*}}(%rip){1to4}, %xmm0, %xmm0		; CHECK-NEXT: vpaddd {{.*}}(%rip){1to4}, %xmm0, %xmm0
; CHECK-NEXT: vpord %xmm1, %xmm0, %xmm0		; CHECK-NEXT: vpor %xmm1, %xmm0, %xmm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
entry:		entry:
; Force the execution domain with an add.		; Force the execution domain with an add.
%a2 = add <4 x i32> %a, <i32 1, i32 1, i32 1, i32 1>		%a2 = add <4 x i32> %a, <i32 1, i32 1, i32 1, i32 1>
%x = or <4 x i32> %a2, %b		%x = or <4 x i32> %a2, %b
ret <4 x i32> %x		ret <4 x i32> %x
}		}

define <4 x i32> @vpxord128(<4 x i32> %a, <4 x i32> %b) nounwind uwtable readnone ssp {		define <4 x i32> @vpxord128(<4 x i32> %a, <4 x i32> %b) nounwind uwtable readnone ssp {
; CHECK-LABEL: vpxord128:		; CHECK-LABEL: vpxord128:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: vpaddd {{.*}}(%rip){1to4}, %xmm0, %xmm0		; CHECK-NEXT: vpaddd {{.*}}(%rip){1to4}, %xmm0, %xmm0
; CHECK-NEXT: vpxord %xmm1, %xmm0, %xmm0		; CHECK-NEXT: vpxor %xmm1, %xmm0, %xmm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
entry:		entry:
; Force the execution domain with an add.		; Force the execution domain with an add.
%a2 = add <4 x i32> %a, <i32 1, i32 1, i32 1, i32 1>		%a2 = add <4 x i32> %a, <i32 1, i32 1, i32 1, i32 1>
%x = xor <4 x i32> %a2, %b		%x = xor <4 x i32> %a2, %b
ret <4 x i32> %x		ret <4 x i32> %x
}		}

define <2 x i64> @vpandq128(<2 x i64> %a, <2 x i64> %b) nounwind uwtable readnone ssp {		define <2 x i64> @vpandq128(<2 x i64> %a, <2 x i64> %b) nounwind uwtable readnone ssp {
; CHECK-LABEL: vpandq128:		; CHECK-LABEL: vpandq128:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: vpaddq {{.*}}(%rip), %xmm0, %xmm0		; CHECK-NEXT: vpaddq {{.*}}(%rip), %xmm0, %xmm0
; CHECK-NEXT: vpandq %xmm1, %xmm0, %xmm0		; CHECK-NEXT: vpand %xmm1, %xmm0, %xmm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
entry:		entry:
; Force the execution domain with an add.		; Force the execution domain with an add.
%a2 = add <2 x i64> %a, <i64 1, i64 1>		%a2 = add <2 x i64> %a, <i64 1, i64 1>
%x = and <2 x i64> %a2, %b		%x = and <2 x i64> %a2, %b
ret <2 x i64> %x		ret <2 x i64> %x
}		}

Show All 10 Lines	entry:
%x = and <2 x i64> %a2, %b2		%x = and <2 x i64> %a2, %b2
ret <2 x i64> %x		ret <2 x i64> %x
}		}

define <2 x i64> @vporq128(<2 x i64> %a, <2 x i64> %b) nounwind uwtable readnone ssp {		define <2 x i64> @vporq128(<2 x i64> %a, <2 x i64> %b) nounwind uwtable readnone ssp {
; CHECK-LABEL: vporq128:		; CHECK-LABEL: vporq128:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: vpaddq {{.*}}(%rip), %xmm0, %xmm0		; CHECK-NEXT: vpaddq {{.*}}(%rip), %xmm0, %xmm0
; CHECK-NEXT: vporq %xmm1, %xmm0, %xmm0		; CHECK-NEXT: vpor %xmm1, %xmm0, %xmm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
entry:		entry:
; Force the execution domain with an add.		; Force the execution domain with an add.
%a2 = add <2 x i64> %a, <i64 1, i64 1>		%a2 = add <2 x i64> %a, <i64 1, i64 1>
%x = or <2 x i64> %a2, %b		%x = or <2 x i64> %a2, %b
ret <2 x i64> %x		ret <2 x i64> %x
}		}

define <2 x i64> @vpxorq128(<2 x i64> %a, <2 x i64> %b) nounwind uwtable readnone ssp {		define <2 x i64> @vpxorq128(<2 x i64> %a, <2 x i64> %b) nounwind uwtable readnone ssp {
; CHECK-LABEL: vpxorq128:		; CHECK-LABEL: vpxorq128:
; CHECK: ## BB#0: ## %entry		; CHECK: ## BB#0: ## %entry
; CHECK-NEXT: vpaddq {{.*}}(%rip), %xmm0, %xmm0		; CHECK-NEXT: vpaddq {{.*}}(%rip), %xmm0, %xmm0
; CHECK-NEXT: vpxorq %xmm1, %xmm0, %xmm0		; CHECK-NEXT: vpxor %xmm1, %xmm0, %xmm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
entry:		entry:
; Force the execution domain with an add.		; Force the execution domain with an add.
%a2 = add <2 x i64> %a, <i64 1, i64 1>		%a2 = add <2 x i64> %a, <i64 1, i64 1>
%x = xor <2 x i64> %a2, %b		%x = xor <2 x i64> %a2, %b
ret <2 x i64> %x		ret <2 x i64> %x
}		}

llvm/trunk/test/CodeGen/X86/avx512vl-mov.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -march=x86-64 -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512vl --show-mc-encoding | FileCheck %s			; RUN: llc < %s -march=x86-64 -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512vl --show-mc-encoding | FileCheck %s

	define <8 x i32> @test_256_1(i8 * %addr) {			define <8 x i32> @test_256_1(i8 * %addr) {
	; CHECK-LABEL: test_256_1:			; CHECK-LABEL: test_256_1:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x10,0x07]			; CHECK-NEXT: vmovups (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x10,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <8 x i32>*			%vaddr = bitcast i8* %addr to <8 x i32>*
	%res = load <8 x i32>, <8 x i32>* %vaddr, align 1			%res = load <8 x i32>, <8 x i32>* %vaddr, align 1
	ret <8 x i32>%res			ret <8 x i32>%res
	}			}

	define <8 x i32> @test_256_2(i8 * %addr) {			define <8 x i32> @test_256_2(i8 * %addr) {
	; CHECK-LABEL: test_256_2:			; CHECK-LABEL: test_256_2:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0x07]			; CHECK-NEXT: vmovaps (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <8 x i32>*			%vaddr = bitcast i8* %addr to <8 x i32>*
	%res = load <8 x i32>, <8 x i32>* %vaddr, align 32			%res = load <8 x i32>, <8 x i32>* %vaddr, align 32
	ret <8 x i32>%res			ret <8 x i32>%res
	}			}

	define void @test_256_3(i8 * %addr, <4 x i64> %data) {			define void @test_256_3(i8 * %addr, <4 x i64> %data) {
	; CHECK-LABEL: test_256_3:			; CHECK-LABEL: test_256_3:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps %ymm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x28,0x29,0x07]			; CHECK-NEXT: vmovaps %ymm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x29,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x i64>*			%vaddr = bitcast i8* %addr to <4 x i64>*
	store <4 x i64>%data, <4 x i64>* %vaddr, align 32			store <4 x i64>%data, <4 x i64>* %vaddr, align 32
	ret void			ret void
	}			}

	define void @test_256_4(i8 * %addr, <8 x i32> %data) {			define void @test_256_4(i8 * %addr, <8 x i32> %data) {
	; CHECK-LABEL: test_256_4:			; CHECK-LABEL: test_256_4:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups %ymm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x28,0x11,0x07]			; CHECK-NEXT: vmovups %ymm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x11,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <8 x i32>*			%vaddr = bitcast i8* %addr to <8 x i32>*
	store <8 x i32>%data, <8 x i32>* %vaddr, align 1			store <8 x i32>%data, <8 x i32>* %vaddr, align 1
	ret void			ret void
	}			}

	define void @test_256_5(i8 * %addr, <8 x i32> %data) {			define void @test_256_5(i8 * %addr, <8 x i32> %data) {
	; CHECK-LABEL: test_256_5:			; CHECK-LABEL: test_256_5:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps %ymm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x28,0x29,0x07]			; CHECK-NEXT: vmovaps %ymm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x29,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <8 x i32>*			%vaddr = bitcast i8* %addr to <8 x i32>*
	store <8 x i32>%data, <8 x i32>* %vaddr, align 32			store <8 x i32>%data, <8 x i32>* %vaddr, align 32
	ret void			ret void
	}			}

	define <4 x i64> @test_256_6(i8 * %addr) {			define <4 x i64> @test_256_6(i8 * %addr) {
	; CHECK-LABEL: test_256_6:			; CHECK-LABEL: test_256_6:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0x07]			; CHECK-NEXT: vmovaps (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x i64>*			%vaddr = bitcast i8* %addr to <4 x i64>*
	%res = load <4 x i64>, <4 x i64>* %vaddr, align 32			%res = load <4 x i64>, <4 x i64>* %vaddr, align 32
	ret <4 x i64>%res			ret <4 x i64>%res
	}			}

	define void @test_256_7(i8 * %addr, <4 x i64> %data) {			define void @test_256_7(i8 * %addr, <4 x i64> %data) {
	; CHECK-LABEL: test_256_7:			; CHECK-LABEL: test_256_7:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups %ymm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x28,0x11,0x07]			; CHECK-NEXT: vmovups %ymm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x11,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x i64>*			%vaddr = bitcast i8* %addr to <4 x i64>*
	store <4 x i64>%data, <4 x i64>* %vaddr, align 1			store <4 x i64>%data, <4 x i64>* %vaddr, align 1
	ret void			ret void
	}			}

	define <4 x i64> @test_256_8(i8 * %addr) {			define <4 x i64> @test_256_8(i8 * %addr) {
	; CHECK-LABEL: test_256_8:			; CHECK-LABEL: test_256_8:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x10,0x07]			; CHECK-NEXT: vmovups (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x10,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x i64>*			%vaddr = bitcast i8* %addr to <4 x i64>*
	%res = load <4 x i64>, <4 x i64>* %vaddr, align 1			%res = load <4 x i64>, <4 x i64>* %vaddr, align 1
	ret <4 x i64>%res			ret <4 x i64>%res
	}			}

	define void @test_256_9(i8 * %addr, <4 x double> %data) {			define void @test_256_9(i8 * %addr, <4 x double> %data) {
	; CHECK-LABEL: test_256_9:			; CHECK-LABEL: test_256_9:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps %ymm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x28,0x29,0x07]			; CHECK-NEXT: vmovaps %ymm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x29,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x double>*			%vaddr = bitcast i8* %addr to <4 x double>*
	store <4 x double>%data, <4 x double>* %vaddr, align 32			store <4 x double>%data, <4 x double>* %vaddr, align 32
	ret void			ret void
	}			}

	define <4 x double> @test_256_10(i8 * %addr) {			define <4 x double> @test_256_10(i8 * %addr) {
	; CHECK-LABEL: test_256_10:			; CHECK-LABEL: test_256_10:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0x07]			; CHECK-NEXT: vmovaps (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x double>*			%vaddr = bitcast i8* %addr to <4 x double>*
	%res = load <4 x double>, <4 x double>* %vaddr, align 32			%res = load <4 x double>, <4 x double>* %vaddr, align 32
	ret <4 x double>%res			ret <4 x double>%res
	}			}

	define void @test_256_11(i8 * %addr, <8 x float> %data) {			define void @test_256_11(i8 * %addr, <8 x float> %data) {
	; CHECK-LABEL: test_256_11:			; CHECK-LABEL: test_256_11:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps %ymm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x28,0x29,0x07]			; CHECK-NEXT: vmovaps %ymm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x29,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <8 x float>*			%vaddr = bitcast i8* %addr to <8 x float>*
	store <8 x float>%data, <8 x float>* %vaddr, align 32			store <8 x float>%data, <8 x float>* %vaddr, align 32
	ret void			ret void
	}			}

	define <8 x float> @test_256_12(i8 * %addr) {			define <8 x float> @test_256_12(i8 * %addr) {
	; CHECK-LABEL: test_256_12:			; CHECK-LABEL: test_256_12:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x28,0x07]			; CHECK-NEXT: vmovaps (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x28,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <8 x float>*			%vaddr = bitcast i8* %addr to <8 x float>*
	%res = load <8 x float>, <8 x float>* %vaddr, align 32			%res = load <8 x float>, <8 x float>* %vaddr, align 32
	ret <8 x float>%res			ret <8 x float>%res
	}			}

	define void @test_256_13(i8 * %addr, <4 x double> %data) {			define void @test_256_13(i8 * %addr, <4 x double> %data) {
	; CHECK-LABEL: test_256_13:			; CHECK-LABEL: test_256_13:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups %ymm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x28,0x11,0x07]			; CHECK-NEXT: vmovups %ymm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x11,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x double>*			%vaddr = bitcast i8* %addr to <4 x double>*
	store <4 x double>%data, <4 x double>* %vaddr, align 1			store <4 x double>%data, <4 x double>* %vaddr, align 1
	ret void			ret void
	}			}

	define <4 x double> @test_256_14(i8 * %addr) {			define <4 x double> @test_256_14(i8 * %addr) {
	; CHECK-LABEL: test_256_14:			; CHECK-LABEL: test_256_14:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x10,0x07]			; CHECK-NEXT: vmovups (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x10,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x double>*			%vaddr = bitcast i8* %addr to <4 x double>*
	%res = load <4 x double>, <4 x double>* %vaddr, align 1			%res = load <4 x double>, <4 x double>* %vaddr, align 1
	ret <4 x double>%res			ret <4 x double>%res
	}			}

	define void @test_256_15(i8 * %addr, <8 x float> %data) {			define void @test_256_15(i8 * %addr, <8 x float> %data) {
	; CHECK-LABEL: test_256_15:			; CHECK-LABEL: test_256_15:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups %ymm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x28,0x11,0x07]			; CHECK-NEXT: vmovups %ymm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x11,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <8 x float>*			%vaddr = bitcast i8* %addr to <8 x float>*
	store <8 x float>%data, <8 x float>* %vaddr, align 1			store <8 x float>%data, <8 x float>* %vaddr, align 1
	ret void			ret void
	}			}

	define <8 x float> @test_256_16(i8 * %addr) {			define <8 x float> @test_256_16(i8 * %addr) {
	; CHECK-LABEL: test_256_16:			; CHECK-LABEL: test_256_16:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups (%rdi), %ymm0 ## encoding: [0x62,0xf1,0x7c,0x28,0x10,0x07]			; CHECK-NEXT: vmovups (%rdi), %ymm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x10,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <8 x float>*			%vaddr = bitcast i8* %addr to <8 x float>*
	%res = load <8 x float>, <8 x float>* %vaddr, align 1			%res = load <8 x float>, <8 x float>* %vaddr, align 1
	ret <8 x float>%res			ret <8 x float>%res
	}			}

	define <8 x i32> @test_256_17(i8 * %addr, <8 x i32> %old, <8 x i32> %mask1) {			define <8 x i32> @test_256_17(i8 * %addr, <8 x i32> %old, <8 x i32> %mask1) {
	; CHECK-LABEL: test_256_17:			; CHECK-LABEL: test_256_17:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0xef,0xd2]			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqd %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0x75,0x28,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqd %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0x75,0x28,0x1f,0xca,0x04]
	; CHECK-NEXT: vpblendmd (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x64,0x07]			; CHECK-NEXT: vpblendmd (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x64,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <8 x i32> %mask1, zeroinitializer			%mask = icmp ne <8 x i32> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <8 x i32>*			%vaddr = bitcast i8* %addr to <8 x i32>*
	%r = load <8 x i32>, <8 x i32>* %vaddr, align 32			%r = load <8 x i32>, <8 x i32>* %vaddr, align 32
	%res = select <8 x i1> %mask, <8 x i32> %r, <8 x i32> %old			%res = select <8 x i1> %mask, <8 x i32> %r, <8 x i32> %old
	ret <8 x i32>%res			ret <8 x i32>%res
	}			}

	define <8 x i32> @test_256_18(i8 * %addr, <8 x i32> %old, <8 x i32> %mask1) {			define <8 x i32> @test_256_18(i8 * %addr, <8 x i32> %old, <8 x i32> %mask1) {
	; CHECK-LABEL: test_256_18:			; CHECK-LABEL: test_256_18:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0xef,0xd2]			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqd %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0x75,0x28,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqd %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0x75,0x28,0x1f,0xca,0x04]
	; CHECK-NEXT: vpblendmd (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x64,0x07]			; CHECK-NEXT: vpblendmd (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x64,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <8 x i32> %mask1, zeroinitializer			%mask = icmp ne <8 x i32> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <8 x i32>*			%vaddr = bitcast i8* %addr to <8 x i32>*
	%r = load <8 x i32>, <8 x i32>* %vaddr, align 1			%r = load <8 x i32>, <8 x i32>* %vaddr, align 1
	%res = select <8 x i1> %mask, <8 x i32> %r, <8 x i32> %old			%res = select <8 x i1> %mask, <8 x i32> %r, <8 x i32> %old
	ret <8 x i32>%res			ret <8 x i32>%res
	}			}

	define <8 x i32> @test_256_19(i8 * %addr, <8 x i32> %mask1) {			define <8 x i32> @test_256_19(i8 * %addr, <8 x i32> %mask1) {
	; CHECK-LABEL: test_256_19:			; CHECK-LABEL: test_256_19:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm1, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xef,0xc9]			; CHECK-NEXT: vpxor %ymm1, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqd %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x28,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqd %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x28,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovdqa32 (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x6f,0x07]			; CHECK-NEXT: vmovdqa32 (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0xa9,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <8 x i32> %mask1, zeroinitializer			%mask = icmp ne <8 x i32> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <8 x i32>*			%vaddr = bitcast i8* %addr to <8 x i32>*
	%r = load <8 x i32>, <8 x i32>* %vaddr, align 32			%r = load <8 x i32>, <8 x i32>* %vaddr, align 32
	%res = select <8 x i1> %mask, <8 x i32> %r, <8 x i32> zeroinitializer			%res = select <8 x i1> %mask, <8 x i32> %r, <8 x i32> zeroinitializer
	ret <8 x i32>%res			ret <8 x i32>%res
	}			}

	define <8 x i32> @test_256_20(i8 * %addr, <8 x i32> %mask1) {			define <8 x i32> @test_256_20(i8 * %addr, <8 x i32> %mask1) {
	; CHECK-LABEL: test_256_20:			; CHECK-LABEL: test_256_20:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm1, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xef,0xc9]			; CHECK-NEXT: vpxor %ymm1, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqd %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x28,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqd %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x28,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovdqu32 (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0xa9,0x6f,0x07]			; CHECK-NEXT: vmovdqu32 (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0xa9,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <8 x i32> %mask1, zeroinitializer			%mask = icmp ne <8 x i32> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <8 x i32>*			%vaddr = bitcast i8* %addr to <8 x i32>*
	%r = load <8 x i32>, <8 x i32>* %vaddr, align 1			%r = load <8 x i32>, <8 x i32>* %vaddr, align 1
	%res = select <8 x i1> %mask, <8 x i32> %r, <8 x i32> zeroinitializer			%res = select <8 x i1> %mask, <8 x i32> %r, <8 x i32> zeroinitializer
	ret <8 x i32>%res			ret <8 x i32>%res
	}			}

	define <4 x i64> @test_256_21(i8 * %addr, <4 x i64> %old, <4 x i64> %mask1) {			define <4 x i64> @test_256_21(i8 * %addr, <4 x i64> %old, <4 x i64> %mask1) {
	; CHECK-LABEL: test_256_21:			; CHECK-LABEL: test_256_21:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0xef,0xd2]			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqq %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x28,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqq %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x28,0x1f,0xca,0x04]
	; CHECK-NEXT: vpblendmq (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x64,0x07]			; CHECK-NEXT: vpblendmq (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x64,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i64> %mask1, zeroinitializer			%mask = icmp ne <4 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x i64>*			%vaddr = bitcast i8* %addr to <4 x i64>*
	%r = load <4 x i64>, <4 x i64>* %vaddr, align 32			%r = load <4 x i64>, <4 x i64>* %vaddr, align 32
	%res = select <4 x i1> %mask, <4 x i64> %r, <4 x i64> %old			%res = select <4 x i1> %mask, <4 x i64> %r, <4 x i64> %old
	ret <4 x i64>%res			ret <4 x i64>%res
	}			}

	define <4 x i64> @test_256_22(i8 * %addr, <4 x i64> %old, <4 x i64> %mask1) {			define <4 x i64> @test_256_22(i8 * %addr, <4 x i64> %old, <4 x i64> %mask1) {
	; CHECK-LABEL: test_256_22:			; CHECK-LABEL: test_256_22:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0xef,0xd2]			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqq %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x28,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqq %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x28,0x1f,0xca,0x04]
	; CHECK-NEXT: vpblendmq (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x64,0x07]			; CHECK-NEXT: vpblendmq (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x64,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i64> %mask1, zeroinitializer			%mask = icmp ne <4 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x i64>*			%vaddr = bitcast i8* %addr to <4 x i64>*
	%r = load <4 x i64>, <4 x i64>* %vaddr, align 1			%r = load <4 x i64>, <4 x i64>* %vaddr, align 1
	%res = select <4 x i1> %mask, <4 x i64> %r, <4 x i64> %old			%res = select <4 x i1> %mask, <4 x i64> %r, <4 x i64> %old
	ret <4 x i64>%res			ret <4 x i64>%res
	}			}

	define <4 x i64> @test_256_23(i8 * %addr, <4 x i64> %mask1) {			define <4 x i64> @test_256_23(i8 * %addr, <4 x i64> %mask1) {
	; CHECK-LABEL: test_256_23:			; CHECK-LABEL: test_256_23:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm1, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xef,0xc9]			; CHECK-NEXT: vpxor %ymm1, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqq %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqq %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovdqa64 (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x6f,0x07]			; CHECK-NEXT: vmovdqa64 (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i64> %mask1, zeroinitializer			%mask = icmp ne <4 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x i64>*			%vaddr = bitcast i8* %addr to <4 x i64>*
	%r = load <4 x i64>, <4 x i64>* %vaddr, align 32			%r = load <4 x i64>, <4 x i64>* %vaddr, align 32
	%res = select <4 x i1> %mask, <4 x i64> %r, <4 x i64> zeroinitializer			%res = select <4 x i1> %mask, <4 x i64> %r, <4 x i64> zeroinitializer
	ret <4 x i64>%res			ret <4 x i64>%res
	}			}

	define <4 x i64> @test_256_24(i8 * %addr, <4 x i64> %mask1) {			define <4 x i64> @test_256_24(i8 * %addr, <4 x i64> %mask1) {
	; CHECK-LABEL: test_256_24:			; CHECK-LABEL: test_256_24:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm1, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xef,0xc9]			; CHECK-NEXT: vpxor %ymm1, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqq %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqq %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovdqu64 (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfe,0xa9,0x6f,0x07]			; CHECK-NEXT: vmovdqu64 (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfe,0xa9,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i64> %mask1, zeroinitializer			%mask = icmp ne <4 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x i64>*			%vaddr = bitcast i8* %addr to <4 x i64>*
	%r = load <4 x i64>, <4 x i64>* %vaddr, align 1			%r = load <4 x i64>, <4 x i64>* %vaddr, align 1
	%res = select <4 x i1> %mask, <4 x i64> %r, <4 x i64> zeroinitializer			%res = select <4 x i1> %mask, <4 x i64> %r, <4 x i64> zeroinitializer
	ret <4 x i64>%res			ret <4 x i64>%res
	}			}

	define <8 x float> @test_256_25(i8 * %addr, <8 x float> %old, <8 x float> %mask1) {			define <8 x float> @test_256_25(i8 * %addr, <8 x float> %old, <8 x float> %mask1) {
	; CHECK-LABEL: test_256_25:			; CHECK-LABEL: test_256_25:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0xef,0xd2]			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xef,0xd2]
	; CHECK-NEXT: vcmpordps %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf1,0x74,0x28,0xc2,0xca,0x07]			; CHECK-NEXT: vcmpordps %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf1,0x74,0x28,0xc2,0xca,0x07]
	; CHECK-NEXT: vcmpneqps %ymm2, %ymm1, %k1 {%k1} ## encoding: [0x62,0xf1,0x74,0x29,0xc2,0xca,0x04]			; CHECK-NEXT: vcmpneqps %ymm2, %ymm1, %k1 {%k1} ## encoding: [0x62,0xf1,0x74,0x29,0xc2,0xca,0x04]
	; CHECK-NEXT: vblendmps (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x65,0x07]			; CHECK-NEXT: vblendmps (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x65,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = fcmp one <8 x float> %mask1, zeroinitializer			%mask = fcmp one <8 x float> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <8 x float>*			%vaddr = bitcast i8* %addr to <8 x float>*
	%r = load <8 x float>, <8 x float>* %vaddr, align 32			%r = load <8 x float>, <8 x float>* %vaddr, align 32
	%res = select <8 x i1> %mask, <8 x float> %r, <8 x float> %old			%res = select <8 x i1> %mask, <8 x float> %r, <8 x float> %old
	ret <8 x float>%res			ret <8 x float>%res
	}			}

	define <8 x float> @test_256_26(i8 * %addr, <8 x float> %old, <8 x float> %mask1) {			define <8 x float> @test_256_26(i8 * %addr, <8 x float> %old, <8 x float> %mask1) {
	; CHECK-LABEL: test_256_26:			; CHECK-LABEL: test_256_26:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0xef,0xd2]			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xef,0xd2]
	; CHECK-NEXT: vcmpordps %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf1,0x74,0x28,0xc2,0xca,0x07]			; CHECK-NEXT: vcmpordps %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf1,0x74,0x28,0xc2,0xca,0x07]
	; CHECK-NEXT: vcmpneqps %ymm2, %ymm1, %k1 {%k1} ## encoding: [0x62,0xf1,0x74,0x29,0xc2,0xca,0x04]			; CHECK-NEXT: vcmpneqps %ymm2, %ymm1, %k1 {%k1} ## encoding: [0x62,0xf1,0x74,0x29,0xc2,0xca,0x04]
	; CHECK-NEXT: vblendmps (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x65,0x07]			; CHECK-NEXT: vblendmps (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x29,0x65,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = fcmp one <8 x float> %mask1, zeroinitializer			%mask = fcmp one <8 x float> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <8 x float>*			%vaddr = bitcast i8* %addr to <8 x float>*
	%r = load <8 x float>, <8 x float>* %vaddr, align 1			%r = load <8 x float>, <8 x float>* %vaddr, align 1
	%res = select <8 x i1> %mask, <8 x float> %r, <8 x float> %old			%res = select <8 x i1> %mask, <8 x float> %r, <8 x float> %old
	ret <8 x float>%res			ret <8 x float>%res
	}			}

	define <8 x float> @test_256_27(i8 * %addr, <8 x float> %mask1) {			define <8 x float> @test_256_27(i8 * %addr, <8 x float> %mask1) {
	; CHECK-LABEL: test_256_27:			; CHECK-LABEL: test_256_27:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm1, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xef,0xc9]			; CHECK-NEXT: vpxor %ymm1, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xef,0xc9]
	; CHECK-NEXT: vcmpordps %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf1,0x7c,0x28,0xc2,0xc9,0x07]			; CHECK-NEXT: vcmpordps %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf1,0x7c,0x28,0xc2,0xc9,0x07]
	; CHECK-NEXT: vcmpneqps %ymm1, %ymm0, %k1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0xc2,0xc9,0x04]			; CHECK-NEXT: vcmpneqps %ymm1, %ymm0, %k1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0xc2,0xc9,0x04]
	; CHECK-NEXT: vmovaps (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x28,0x07]			; CHECK-NEXT: vmovaps (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x28,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = fcmp one <8 x float> %mask1, zeroinitializer			%mask = fcmp one <8 x float> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <8 x float>*			%vaddr = bitcast i8* %addr to <8 x float>*
	%r = load <8 x float>, <8 x float>* %vaddr, align 32			%r = load <8 x float>, <8 x float>* %vaddr, align 32
	%res = select <8 x i1> %mask, <8 x float> %r, <8 x float> zeroinitializer			%res = select <8 x i1> %mask, <8 x float> %r, <8 x float> zeroinitializer
	ret <8 x float>%res			ret <8 x float>%res
	}			}

	define <8 x float> @test_256_28(i8 * %addr, <8 x float> %mask1) {			define <8 x float> @test_256_28(i8 * %addr, <8 x float> %mask1) {
	; CHECK-LABEL: test_256_28:			; CHECK-LABEL: test_256_28:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm1, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xef,0xc9]			; CHECK-NEXT: vpxor %ymm1, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xef,0xc9]
	; CHECK-NEXT: vcmpordps %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf1,0x7c,0x28,0xc2,0xc9,0x07]			; CHECK-NEXT: vcmpordps %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf1,0x7c,0x28,0xc2,0xc9,0x07]
	; CHECK-NEXT: vcmpneqps %ymm1, %ymm0, %k1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0xc2,0xc9,0x04]			; CHECK-NEXT: vcmpneqps %ymm1, %ymm0, %k1 {%k1} ## encoding: [0x62,0xf1,0x7c,0x29,0xc2,0xc9,0x04]
	; CHECK-NEXT: vmovups (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x10,0x07]			; CHECK-NEXT: vmovups (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0xa9,0x10,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = fcmp one <8 x float> %mask1, zeroinitializer			%mask = fcmp one <8 x float> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <8 x float>*			%vaddr = bitcast i8* %addr to <8 x float>*
	%r = load <8 x float>, <8 x float>* %vaddr, align 1			%r = load <8 x float>, <8 x float>* %vaddr, align 1
	%res = select <8 x i1> %mask, <8 x float> %r, <8 x float> zeroinitializer			%res = select <8 x i1> %mask, <8 x float> %r, <8 x float> zeroinitializer
	ret <8 x float>%res			ret <8 x float>%res
	}			}

	define <4 x double> @test_256_29(i8 * %addr, <4 x double> %old, <4 x i64> %mask1) {			define <4 x double> @test_256_29(i8 * %addr, <4 x double> %old, <4 x i64> %mask1) {
	; CHECK-LABEL: test_256_29:			; CHECK-LABEL: test_256_29:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0xef,0xd2]			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqq %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x28,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqq %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x28,0x1f,0xca,0x04]
	; CHECK-NEXT: vblendmpd (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x65,0x07]			; CHECK-NEXT: vblendmpd (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x65,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i64> %mask1, zeroinitializer			%mask = icmp ne <4 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x double>*			%vaddr = bitcast i8* %addr to <4 x double>*
	%r = load <4 x double>, <4 x double>* %vaddr, align 32			%r = load <4 x double>, <4 x double>* %vaddr, align 32
	%res = select <4 x i1> %mask, <4 x double> %r, <4 x double> %old			%res = select <4 x i1> %mask, <4 x double> %r, <4 x double> %old
	ret <4 x double>%res			ret <4 x double>%res
	}			}

	define <4 x double> @test_256_30(i8 * %addr, <4 x double> %old, <4 x i64> %mask1) {			define <4 x double> @test_256_30(i8 * %addr, <4 x double> %old, <4 x i64> %mask1) {
	; CHECK-LABEL: test_256_30:			; CHECK-LABEL: test_256_30:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2 ## encoding: [0x62,0xf1,0x6d,0x28,0xef,0xd2]			; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2 ## EVEX TO VEX Compression encoding: [0xc5,0xed,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqq %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x28,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqq %ymm2, %ymm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x28,0x1f,0xca,0x04]
	; CHECK-NEXT: vblendmpd (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x65,0x07]			; CHECK-NEXT: vblendmpd (%rdi), %ymm0, %ymm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x29,0x65,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i64> %mask1, zeroinitializer			%mask = icmp ne <4 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x double>*			%vaddr = bitcast i8* %addr to <4 x double>*
	%r = load <4 x double>, <4 x double>* %vaddr, align 1			%r = load <4 x double>, <4 x double>* %vaddr, align 1
	%res = select <4 x i1> %mask, <4 x double> %r, <4 x double> %old			%res = select <4 x i1> %mask, <4 x double> %r, <4 x double> %old
	ret <4 x double>%res			ret <4 x double>%res
	}			}

	define <4 x double> @test_256_31(i8 * %addr, <4 x i64> %mask1) {			define <4 x double> @test_256_31(i8 * %addr, <4 x i64> %mask1) {
	; CHECK-LABEL: test_256_31:			; CHECK-LABEL: test_256_31:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm1, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xef,0xc9]			; CHECK-NEXT: vpxor %ymm1, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqq %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqq %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovapd (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x28,0x07]			; CHECK-NEXT: vmovapd (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x28,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i64> %mask1, zeroinitializer			%mask = icmp ne <4 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x double>*			%vaddr = bitcast i8* %addr to <4 x double>*
	%r = load <4 x double>, <4 x double>* %vaddr, align 32			%r = load <4 x double>, <4 x double>* %vaddr, align 32
	%res = select <4 x i1> %mask, <4 x double> %r, <4 x double> zeroinitializer			%res = select <4 x i1> %mask, <4 x double> %r, <4 x double> zeroinitializer
	ret <4 x double>%res			ret <4 x double>%res
	}			}

	define <4 x double> @test_256_32(i8 * %addr, <4 x i64> %mask1) {			define <4 x double> @test_256_32(i8 * %addr, <4 x i64> %mask1) {
	; CHECK-LABEL: test_256_32:			; CHECK-LABEL: test_256_32:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %ymm1, %ymm1, %ymm1 ## encoding: [0x62,0xf1,0x75,0x28,0xef,0xc9]			; CHECK-NEXT: vpxor %ymm1, %ymm1, %ymm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf5,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqq %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqq %ymm1, %ymm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x28,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovupd (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x10,0x07]			; CHECK-NEXT: vmovupd (%rdi), %ymm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0xa9,0x10,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i64> %mask1, zeroinitializer			%mask = icmp ne <4 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x double>*			%vaddr = bitcast i8* %addr to <4 x double>*
	%r = load <4 x double>, <4 x double>* %vaddr, align 1			%r = load <4 x double>, <4 x double>* %vaddr, align 1
	%res = select <4 x i1> %mask, <4 x double> %r, <4 x double> zeroinitializer			%res = select <4 x i1> %mask, <4 x double> %r, <4 x double> zeroinitializer
	ret <4 x double>%res			ret <4 x double>%res
	}			}

	define <4 x i32> @test_128_1(i8 * %addr) {			define <4 x i32> @test_128_1(i8 * %addr) {
	; CHECK-LABEL: test_128_1:			; CHECK-LABEL: test_128_1:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x10,0x07]			; CHECK-NEXT: vmovups (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x10,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x i32>*			%vaddr = bitcast i8* %addr to <4 x i32>*
	%res = load <4 x i32>, <4 x i32>* %vaddr, align 1			%res = load <4 x i32>, <4 x i32>* %vaddr, align 1
	ret <4 x i32>%res			ret <4 x i32>%res
	}			}

	define <4 x i32> @test_128_2(i8 * %addr) {			define <4 x i32> @test_128_2(i8 * %addr) {
	; CHECK-LABEL: test_128_2:			; CHECK-LABEL: test_128_2:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0x07]			; CHECK-NEXT: vmovaps (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x i32>*			%vaddr = bitcast i8* %addr to <4 x i32>*
	%res = load <4 x i32>, <4 x i32>* %vaddr, align 16			%res = load <4 x i32>, <4 x i32>* %vaddr, align 16
	ret <4 x i32>%res			ret <4 x i32>%res
	}			}

	define void @test_128_3(i8 * %addr, <2 x i64> %data) {			define void @test_128_3(i8 * %addr, <2 x i64> %data) {
	; CHECK-LABEL: test_128_3:			; CHECK-LABEL: test_128_3:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps %xmm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x08,0x29,0x07]			; CHECK-NEXT: vmovaps %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x29,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <2 x i64>*			%vaddr = bitcast i8* %addr to <2 x i64>*
	store <2 x i64>%data, <2 x i64>* %vaddr, align 16			store <2 x i64>%data, <2 x i64>* %vaddr, align 16
	ret void			ret void
	}			}

	define void @test_128_4(i8 * %addr, <4 x i32> %data) {			define void @test_128_4(i8 * %addr, <4 x i32> %data) {
	; CHECK-LABEL: test_128_4:			; CHECK-LABEL: test_128_4:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups %xmm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x08,0x11,0x07]			; CHECK-NEXT: vmovups %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x11,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x i32>*			%vaddr = bitcast i8* %addr to <4 x i32>*
	store <4 x i32>%data, <4 x i32>* %vaddr, align 1			store <4 x i32>%data, <4 x i32>* %vaddr, align 1
	ret void			ret void
	}			}

	define void @test_128_5(i8 * %addr, <4 x i32> %data) {			define void @test_128_5(i8 * %addr, <4 x i32> %data) {
	; CHECK-LABEL: test_128_5:			; CHECK-LABEL: test_128_5:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps %xmm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x08,0x29,0x07]			; CHECK-NEXT: vmovaps %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x29,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x i32>*			%vaddr = bitcast i8* %addr to <4 x i32>*
	store <4 x i32>%data, <4 x i32>* %vaddr, align 16			store <4 x i32>%data, <4 x i32>* %vaddr, align 16
	ret void			ret void
	}			}

	define <2 x i64> @test_128_6(i8 * %addr) {			define <2 x i64> @test_128_6(i8 * %addr) {
	; CHECK-LABEL: test_128_6:			; CHECK-LABEL: test_128_6:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0x07]			; CHECK-NEXT: vmovaps (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <2 x i64>*			%vaddr = bitcast i8* %addr to <2 x i64>*
	%res = load <2 x i64>, <2 x i64>* %vaddr, align 16			%res = load <2 x i64>, <2 x i64>* %vaddr, align 16
	ret <2 x i64>%res			ret <2 x i64>%res
	}			}

	define void @test_128_7(i8 * %addr, <2 x i64> %data) {			define void @test_128_7(i8 * %addr, <2 x i64> %data) {
	; CHECK-LABEL: test_128_7:			; CHECK-LABEL: test_128_7:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups %xmm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x08,0x11,0x07]			; CHECK-NEXT: vmovups %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x11,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <2 x i64>*			%vaddr = bitcast i8* %addr to <2 x i64>*
	store <2 x i64>%data, <2 x i64>* %vaddr, align 1			store <2 x i64>%data, <2 x i64>* %vaddr, align 1
	ret void			ret void
	}			}

	define <2 x i64> @test_128_8(i8 * %addr) {			define <2 x i64> @test_128_8(i8 * %addr) {
	; CHECK-LABEL: test_128_8:			; CHECK-LABEL: test_128_8:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x10,0x07]			; CHECK-NEXT: vmovups (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x10,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <2 x i64>*			%vaddr = bitcast i8* %addr to <2 x i64>*
	%res = load <2 x i64>, <2 x i64>* %vaddr, align 1			%res = load <2 x i64>, <2 x i64>* %vaddr, align 1
	ret <2 x i64>%res			ret <2 x i64>%res
	}			}

	define void @test_128_9(i8 * %addr, <2 x double> %data) {			define void @test_128_9(i8 * %addr, <2 x double> %data) {
	; CHECK-LABEL: test_128_9:			; CHECK-LABEL: test_128_9:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps %xmm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x08,0x29,0x07]			; CHECK-NEXT: vmovaps %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x29,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <2 x double>*			%vaddr = bitcast i8* %addr to <2 x double>*
	store <2 x double>%data, <2 x double>* %vaddr, align 16			store <2 x double>%data, <2 x double>* %vaddr, align 16
	ret void			ret void
	}			}

	define <2 x double> @test_128_10(i8 * %addr) {			define <2 x double> @test_128_10(i8 * %addr) {
	; CHECK-LABEL: test_128_10:			; CHECK-LABEL: test_128_10:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0x07]			; CHECK-NEXT: vmovaps (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <2 x double>*			%vaddr = bitcast i8* %addr to <2 x double>*
	%res = load <2 x double>, <2 x double>* %vaddr, align 16			%res = load <2 x double>, <2 x double>* %vaddr, align 16
	ret <2 x double>%res			ret <2 x double>%res
	}			}

	define void @test_128_11(i8 * %addr, <4 x float> %data) {			define void @test_128_11(i8 * %addr, <4 x float> %data) {
	; CHECK-LABEL: test_128_11:			; CHECK-LABEL: test_128_11:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps %xmm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x08,0x29,0x07]			; CHECK-NEXT: vmovaps %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x29,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x float>*			%vaddr = bitcast i8* %addr to <4 x float>*
	store <4 x float>%data, <4 x float>* %vaddr, align 16			store <4 x float>%data, <4 x float>* %vaddr, align 16
	ret void			ret void
	}			}

	define <4 x float> @test_128_12(i8 * %addr) {			define <4 x float> @test_128_12(i8 * %addr) {
	; CHECK-LABEL: test_128_12:			; CHECK-LABEL: test_128_12:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovaps (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0x07]			; CHECK-NEXT: vmovaps (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x float>*			%vaddr = bitcast i8* %addr to <4 x float>*
	%res = load <4 x float>, <4 x float>* %vaddr, align 16			%res = load <4 x float>, <4 x float>* %vaddr, align 16
	ret <4 x float>%res			ret <4 x float>%res
	}			}

	define void @test_128_13(i8 * %addr, <2 x double> %data) {			define void @test_128_13(i8 * %addr, <2 x double> %data) {
	; CHECK-LABEL: test_128_13:			; CHECK-LABEL: test_128_13:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups %xmm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x08,0x11,0x07]			; CHECK-NEXT: vmovups %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x11,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <2 x double>*			%vaddr = bitcast i8* %addr to <2 x double>*
	store <2 x double>%data, <2 x double>* %vaddr, align 1			store <2 x double>%data, <2 x double>* %vaddr, align 1
	ret void			ret void
	}			}

	define <2 x double> @test_128_14(i8 * %addr) {			define <2 x double> @test_128_14(i8 * %addr) {
	; CHECK-LABEL: test_128_14:			; CHECK-LABEL: test_128_14:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x10,0x07]			; CHECK-NEXT: vmovups (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x10,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <2 x double>*			%vaddr = bitcast i8* %addr to <2 x double>*
	%res = load <2 x double>, <2 x double>* %vaddr, align 1			%res = load <2 x double>, <2 x double>* %vaddr, align 1
	ret <2 x double>%res			ret <2 x double>%res
	}			}

	define void @test_128_15(i8 * %addr, <4 x float> %data) {			define void @test_128_15(i8 * %addr, <4 x float> %data) {
	; CHECK-LABEL: test_128_15:			; CHECK-LABEL: test_128_15:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups %xmm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x08,0x11,0x07]			; CHECK-NEXT: vmovups %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x11,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x float>*			%vaddr = bitcast i8* %addr to <4 x float>*
	store <4 x float>%data, <4 x float>* %vaddr, align 1			store <4 x float>%data, <4 x float>* %vaddr, align 1
	ret void			ret void
	}			}

	define <4 x float> @test_128_16(i8 * %addr) {			define <4 x float> @test_128_16(i8 * %addr) {
	; CHECK-LABEL: test_128_16:			; CHECK-LABEL: test_128_16:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vmovups (%rdi), %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x10,0x07]			; CHECK-NEXT: vmovups (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x10,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%vaddr = bitcast i8* %addr to <4 x float>*			%vaddr = bitcast i8* %addr to <4 x float>*
	%res = load <4 x float>, <4 x float>* %vaddr, align 1			%res = load <4 x float>, <4 x float>* %vaddr, align 1
	ret <4 x float>%res			ret <4 x float>%res
	}			}

	define <4 x i32> @test_128_17(i8 * %addr, <4 x i32> %old, <4 x i32> %mask1) {			define <4 x i32> @test_128_17(i8 * %addr, <4 x i32> %old, <4 x i32> %mask1) {
	; CHECK-LABEL: test_128_17:			; CHECK-LABEL: test_128_17:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqd %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0x75,0x08,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqd %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0x75,0x08,0x1f,0xca,0x04]
	; CHECK-NEXT: vpblendmd (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x64,0x07]			; CHECK-NEXT: vpblendmd (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x64,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i32> %mask1, zeroinitializer			%mask = icmp ne <4 x i32> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x i32>*			%vaddr = bitcast i8* %addr to <4 x i32>*
	%r = load <4 x i32>, <4 x i32>* %vaddr, align 16			%r = load <4 x i32>, <4 x i32>* %vaddr, align 16
	%res = select <4 x i1> %mask, <4 x i32> %r, <4 x i32> %old			%res = select <4 x i1> %mask, <4 x i32> %r, <4 x i32> %old
	ret <4 x i32>%res			ret <4 x i32>%res
	}			}

	define <4 x i32> @test_128_18(i8 * %addr, <4 x i32> %old, <4 x i32> %mask1) {			define <4 x i32> @test_128_18(i8 * %addr, <4 x i32> %old, <4 x i32> %mask1) {
	; CHECK-LABEL: test_128_18:			; CHECK-LABEL: test_128_18:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqd %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0x75,0x08,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqd %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0x75,0x08,0x1f,0xca,0x04]
	; CHECK-NEXT: vpblendmd (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x64,0x07]			; CHECK-NEXT: vpblendmd (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x64,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i32> %mask1, zeroinitializer			%mask = icmp ne <4 x i32> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x i32>*			%vaddr = bitcast i8* %addr to <4 x i32>*
	%r = load <4 x i32>, <4 x i32>* %vaddr, align 1			%r = load <4 x i32>, <4 x i32>* %vaddr, align 1
	%res = select <4 x i1> %mask, <4 x i32> %r, <4 x i32> %old			%res = select <4 x i1> %mask, <4 x i32> %r, <4 x i32> %old
	ret <4 x i32>%res			ret <4 x i32>%res
	}			}

	define <4 x i32> @test_128_19(i8 * %addr, <4 x i32> %mask1) {			define <4 x i32> @test_128_19(i8 * %addr, <4 x i32> %mask1) {
	; CHECK-LABEL: test_128_19:			; CHECK-LABEL: test_128_19:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm1, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xef,0xc9]			; CHECK-NEXT: vpxor %xmm1, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqd %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqd %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovdqa32 (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x6f,0x07]			; CHECK-NEXT: vmovdqa32 (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7d,0x89,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i32> %mask1, zeroinitializer			%mask = icmp ne <4 x i32> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x i32>*			%vaddr = bitcast i8* %addr to <4 x i32>*
	%r = load <4 x i32>, <4 x i32>* %vaddr, align 16			%r = load <4 x i32>, <4 x i32>* %vaddr, align 16
	%res = select <4 x i1> %mask, <4 x i32> %r, <4 x i32> zeroinitializer			%res = select <4 x i1> %mask, <4 x i32> %r, <4 x i32> zeroinitializer
	ret <4 x i32>%res			ret <4 x i32>%res
	}			}

	define <4 x i32> @test_128_20(i8 * %addr, <4 x i32> %mask1) {			define <4 x i32> @test_128_20(i8 * %addr, <4 x i32> %mask1) {
	; CHECK-LABEL: test_128_20:			; CHECK-LABEL: test_128_20:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm1, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xef,0xc9]			; CHECK-NEXT: vpxor %xmm1, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqd %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqd %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovdqu32 (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0x89,0x6f,0x07]			; CHECK-NEXT: vmovdqu32 (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7e,0x89,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i32> %mask1, zeroinitializer			%mask = icmp ne <4 x i32> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x i32>*			%vaddr = bitcast i8* %addr to <4 x i32>*
	%r = load <4 x i32>, <4 x i32>* %vaddr, align 1			%r = load <4 x i32>, <4 x i32>* %vaddr, align 1
	%res = select <4 x i1> %mask, <4 x i32> %r, <4 x i32> zeroinitializer			%res = select <4 x i1> %mask, <4 x i32> %r, <4 x i32> zeroinitializer
	ret <4 x i32>%res			ret <4 x i32>%res
	}			}

	define <2 x i64> @test_128_21(i8 * %addr, <2 x i64> %old, <2 x i64> %mask1) {			define <2 x i64> @test_128_21(i8 * %addr, <2 x i64> %old, <2 x i64> %mask1) {
	; CHECK-LABEL: test_128_21:			; CHECK-LABEL: test_128_21:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqq %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x08,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqq %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x08,0x1f,0xca,0x04]
	; CHECK-NEXT: vpblendmq (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x64,0x07]			; CHECK-NEXT: vpblendmq (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x64,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <2 x i64> %mask1, zeroinitializer			%mask = icmp ne <2 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <2 x i64>*			%vaddr = bitcast i8* %addr to <2 x i64>*
	%r = load <2 x i64>, <2 x i64>* %vaddr, align 16			%r = load <2 x i64>, <2 x i64>* %vaddr, align 16
	%res = select <2 x i1> %mask, <2 x i64> %r, <2 x i64> %old			%res = select <2 x i1> %mask, <2 x i64> %r, <2 x i64> %old
	ret <2 x i64>%res			ret <2 x i64>%res
	}			}

	define <2 x i64> @test_128_22(i8 * %addr, <2 x i64> %old, <2 x i64> %mask1) {			define <2 x i64> @test_128_22(i8 * %addr, <2 x i64> %old, <2 x i64> %mask1) {
	; CHECK-LABEL: test_128_22:			; CHECK-LABEL: test_128_22:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqq %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x08,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqq %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x08,0x1f,0xca,0x04]
	; CHECK-NEXT: vpblendmq (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x64,0x07]			; CHECK-NEXT: vpblendmq (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x64,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <2 x i64> %mask1, zeroinitializer			%mask = icmp ne <2 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <2 x i64>*			%vaddr = bitcast i8* %addr to <2 x i64>*
	%r = load <2 x i64>, <2 x i64>* %vaddr, align 1			%r = load <2 x i64>, <2 x i64>* %vaddr, align 1
	%res = select <2 x i1> %mask, <2 x i64> %r, <2 x i64> %old			%res = select <2 x i1> %mask, <2 x i64> %r, <2 x i64> %old
	ret <2 x i64>%res			ret <2 x i64>%res
	}			}

	define <2 x i64> @test_128_23(i8 * %addr, <2 x i64> %mask1) {			define <2 x i64> @test_128_23(i8 * %addr, <2 x i64> %mask1) {
	; CHECK-LABEL: test_128_23:			; CHECK-LABEL: test_128_23:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm1, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xef,0xc9]			; CHECK-NEXT: vpxor %xmm1, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqq %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqq %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovdqa64 (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x6f,0x07]			; CHECK-NEXT: vmovdqa64 (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <2 x i64> %mask1, zeroinitializer			%mask = icmp ne <2 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <2 x i64>*			%vaddr = bitcast i8* %addr to <2 x i64>*
	%r = load <2 x i64>, <2 x i64>* %vaddr, align 16			%r = load <2 x i64>, <2 x i64>* %vaddr, align 16
	%res = select <2 x i1> %mask, <2 x i64> %r, <2 x i64> zeroinitializer			%res = select <2 x i1> %mask, <2 x i64> %r, <2 x i64> zeroinitializer
	ret <2 x i64>%res			ret <2 x i64>%res
	}			}

	define <2 x i64> @test_128_24(i8 * %addr, <2 x i64> %mask1) {			define <2 x i64> @test_128_24(i8 * %addr, <2 x i64> %mask1) {
	; CHECK-LABEL: test_128_24:			; CHECK-LABEL: test_128_24:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm1, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xef,0xc9]			; CHECK-NEXT: vpxor %xmm1, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqq %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqq %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovdqu64 (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfe,0x89,0x6f,0x07]			; CHECK-NEXT: vmovdqu64 (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfe,0x89,0x6f,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <2 x i64> %mask1, zeroinitializer			%mask = icmp ne <2 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <2 x i64>*			%vaddr = bitcast i8* %addr to <2 x i64>*
	%r = load <2 x i64>, <2 x i64>* %vaddr, align 1			%r = load <2 x i64>, <2 x i64>* %vaddr, align 1
	%res = select <2 x i1> %mask, <2 x i64> %r, <2 x i64> zeroinitializer			%res = select <2 x i1> %mask, <2 x i64> %r, <2 x i64> zeroinitializer
	ret <2 x i64>%res			ret <2 x i64>%res
	}			}

	define <4 x float> @test_128_25(i8 * %addr, <4 x float> %old, <4 x i32> %mask1) {			define <4 x float> @test_128_25(i8 * %addr, <4 x float> %old, <4 x i32> %mask1) {
	; CHECK-LABEL: test_128_25:			; CHECK-LABEL: test_128_25:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqd %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0x75,0x08,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqd %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0x75,0x08,0x1f,0xca,0x04]
	; CHECK-NEXT: vblendmps (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x65,0x07]			; CHECK-NEXT: vblendmps (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x65,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i32> %mask1, zeroinitializer			%mask = icmp ne <4 x i32> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x float>*			%vaddr = bitcast i8* %addr to <4 x float>*
	%r = load <4 x float>, <4 x float>* %vaddr, align 16			%r = load <4 x float>, <4 x float>* %vaddr, align 16
	%res = select <4 x i1> %mask, <4 x float> %r, <4 x float> %old			%res = select <4 x i1> %mask, <4 x float> %r, <4 x float> %old
	ret <4 x float>%res			ret <4 x float>%res
	}			}

	define <4 x float> @test_128_26(i8 * %addr, <4 x float> %old, <4 x i32> %mask1) {			define <4 x float> @test_128_26(i8 * %addr, <4 x float> %old, <4 x i32> %mask1) {
	; CHECK-LABEL: test_128_26:			; CHECK-LABEL: test_128_26:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqd %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0x75,0x08,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqd %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0x75,0x08,0x1f,0xca,0x04]
	; CHECK-NEXT: vblendmps (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x65,0x07]			; CHECK-NEXT: vblendmps (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0x65,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i32> %mask1, zeroinitializer			%mask = icmp ne <4 x i32> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x float>*			%vaddr = bitcast i8* %addr to <4 x float>*
	%r = load <4 x float>, <4 x float>* %vaddr, align 1			%r = load <4 x float>, <4 x float>* %vaddr, align 1
	%res = select <4 x i1> %mask, <4 x float> %r, <4 x float> %old			%res = select <4 x i1> %mask, <4 x float> %r, <4 x float> %old
	ret <4 x float>%res			ret <4 x float>%res
	}			}

	define <4 x float> @test_128_27(i8 * %addr, <4 x i32> %mask1) {			define <4 x float> @test_128_27(i8 * %addr, <4 x i32> %mask1) {
	; CHECK-LABEL: test_128_27:			; CHECK-LABEL: test_128_27:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm1, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xef,0xc9]			; CHECK-NEXT: vpxor %xmm1, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqd %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqd %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovaps (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x28,0x07]			; CHECK-NEXT: vmovaps (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x28,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i32> %mask1, zeroinitializer			%mask = icmp ne <4 x i32> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x float>*			%vaddr = bitcast i8* %addr to <4 x float>*
	%r = load <4 x float>, <4 x float>* %vaddr, align 16			%r = load <4 x float>, <4 x float>* %vaddr, align 16
	%res = select <4 x i1> %mask, <4 x float> %r, <4 x float> zeroinitializer			%res = select <4 x i1> %mask, <4 x float> %r, <4 x float> zeroinitializer
	ret <4 x float>%res			ret <4 x float>%res
	}			}

	define <4 x float> @test_128_28(i8 * %addr, <4 x i32> %mask1) {			define <4 x float> @test_128_28(i8 * %addr, <4 x i32> %mask1) {
	; CHECK-LABEL: test_128_28:			; CHECK-LABEL: test_128_28:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm1, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xef,0xc9]			; CHECK-NEXT: vpxor %xmm1, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqd %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqd %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0x7d,0x08,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovups (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x10,0x07]			; CHECK-NEXT: vmovups (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0x7c,0x89,0x10,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <4 x i32> %mask1, zeroinitializer			%mask = icmp ne <4 x i32> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <4 x float>*			%vaddr = bitcast i8* %addr to <4 x float>*
	%r = load <4 x float>, <4 x float>* %vaddr, align 1			%r = load <4 x float>, <4 x float>* %vaddr, align 1
	%res = select <4 x i1> %mask, <4 x float> %r, <4 x float> zeroinitializer			%res = select <4 x i1> %mask, <4 x float> %r, <4 x float> zeroinitializer
	ret <4 x float>%res			ret <4 x float>%res
	}			}

	define <2 x double> @test_128_29(i8 * %addr, <2 x double> %old, <2 x i64> %mask1) {			define <2 x double> @test_128_29(i8 * %addr, <2 x double> %old, <2 x i64> %mask1) {
	; CHECK-LABEL: test_128_29:			; CHECK-LABEL: test_128_29:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqq %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x08,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqq %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x08,0x1f,0xca,0x04]
	; CHECK-NEXT: vblendmpd (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x65,0x07]			; CHECK-NEXT: vblendmpd (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x65,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <2 x i64> %mask1, zeroinitializer			%mask = icmp ne <2 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <2 x double>*			%vaddr = bitcast i8* %addr to <2 x double>*
	%r = load <2 x double>, <2 x double>* %vaddr, align 16			%r = load <2 x double>, <2 x double>* %vaddr, align 16
	%res = select <2 x i1> %mask, <2 x double> %r, <2 x double> %old			%res = select <2 x i1> %mask, <2 x double> %r, <2 x double> %old
	ret <2 x double>%res			ret <2 x double>%res
	}			}

	define <2 x double> @test_128_30(i8 * %addr, <2 x double> %old, <2 x i64> %mask1) {			define <2 x double> @test_128_30(i8 * %addr, <2 x double> %old, <2 x i64> %mask1) {
	; CHECK-LABEL: test_128_30:			; CHECK-LABEL: test_128_30:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2 ## encoding: [0x62,0xf1,0x6d,0x08,0xef,0xd2]			; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2 ## EVEX TO VEX Compression encoding: [0xc5,0xe9,0xef,0xd2]
	; CHECK-NEXT: vpcmpneqq %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x08,0x1f,0xca,0x04]			; CHECK-NEXT: vpcmpneqq %xmm2, %xmm1, %k1 ## encoding: [0x62,0xf3,0xf5,0x08,0x1f,0xca,0x04]
	; CHECK-NEXT: vblendmpd (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x65,0x07]			; CHECK-NEXT: vblendmpd (%rdi), %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0x65,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <2 x i64> %mask1, zeroinitializer			%mask = icmp ne <2 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <2 x double>*			%vaddr = bitcast i8* %addr to <2 x double>*
	%r = load <2 x double>, <2 x double>* %vaddr, align 1			%r = load <2 x double>, <2 x double>* %vaddr, align 1
	%res = select <2 x i1> %mask, <2 x double> %r, <2 x double> %old			%res = select <2 x i1> %mask, <2 x double> %r, <2 x double> %old
	ret <2 x double>%res			ret <2 x double>%res
	}			}

	define <2 x double> @test_128_31(i8 * %addr, <2 x i64> %mask1) {			define <2 x double> @test_128_31(i8 * %addr, <2 x i64> %mask1) {
	; CHECK-LABEL: test_128_31:			; CHECK-LABEL: test_128_31:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm1, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xef,0xc9]			; CHECK-NEXT: vpxor %xmm1, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqq %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqq %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovapd (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x28,0x07]			; CHECK-NEXT: vmovapd (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x28,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <2 x i64> %mask1, zeroinitializer			%mask = icmp ne <2 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <2 x double>*			%vaddr = bitcast i8* %addr to <2 x double>*
	%r = load <2 x double>, <2 x double>* %vaddr, align 16			%r = load <2 x double>, <2 x double>* %vaddr, align 16
	%res = select <2 x i1> %mask, <2 x double> %r, <2 x double> zeroinitializer			%res = select <2 x i1> %mask, <2 x double> %r, <2 x double> zeroinitializer
	ret <2 x double>%res			ret <2 x double>%res
	}			}

	define <2 x double> @test_128_32(i8 * %addr, <2 x i64> %mask1) {			define <2 x double> @test_128_32(i8 * %addr, <2 x i64> %mask1) {
	; CHECK-LABEL: test_128_32:			; CHECK-LABEL: test_128_32:
	; CHECK: ## BB#0:			; CHECK: ## BB#0:
	; CHECK-NEXT: vpxord %xmm1, %xmm1, %xmm1 ## encoding: [0x62,0xf1,0x75,0x08,0xef,0xc9]			; CHECK-NEXT: vpxor %xmm1, %xmm1, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf1,0xef,0xc9]
	; CHECK-NEXT: vpcmpneqq %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x1f,0xc9,0x04]			; CHECK-NEXT: vpcmpneqq %xmm1, %xmm0, %k1 ## encoding: [0x62,0xf3,0xfd,0x08,0x1f,0xc9,0x04]
	; CHECK-NEXT: vmovupd (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x10,0x07]			; CHECK-NEXT: vmovupd (%rdi), %xmm0 {%k1} {z} ## encoding: [0x62,0xf1,0xfd,0x89,0x10,0x07]
	; CHECK-NEXT: retq ## encoding: [0xc3]			; CHECK-NEXT: retq ## encoding: [0xc3]
	%mask = icmp ne <2 x i64> %mask1, zeroinitializer			%mask = icmp ne <2 x i64> %mask1, zeroinitializer
	%vaddr = bitcast i8* %addr to <2 x double>*			%vaddr = bitcast i8* %addr to <2 x double>*
	%r = load <2 x double>, <2 x double>* %vaddr, align 1			%r = load <2 x double>, <2 x double>* %vaddr, align 1
	%res = select <2 x i1> %mask, <2 x double> %r, <2 x double> zeroinitializer			%res = select <2 x i1> %mask, <2 x double> %r, <2 x double> zeroinitializer
	ret <2 x double>%res			ret <2 x double>%res
	}			}
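
The vpxord to vpxor rewrites above can be checked by hand at the byte level. The Python sketch below is an illustration only, not the pass from this patch: the pass maps machine opcodes through a lookup table, whereas this hypothetical helper (evex_to_vex2, a name invented here) rewrites the prefix bytes of a register-to-register instruction whose opcode is simply assumed to have a VEX twin. It covers only the 2-byte 0xC5 VEX form and skips memory operands, where the EVEX compressed displacement would also need rescaling.

```python
def evex_to_vex2(code):
    """Return the 2-byte-VEX bytes for a reg-reg EVEX instruction, or None.

    Assumption: the caller already knows the opcode has a VEX equivalent;
    this function only checks the prefix fields.
    """
    code = bytes(code)
    if len(code) < 6 or code[0] != 0x62:
        return None
    p0, p1, p2, modrm = code[1], code[2], code[3], code[5]

    # P0: ~R ~X ~B ~R' 0 0 m m
    r_inv, x_inv, b_inv, r2_inv = (p0 >> 7) & 1, (p0 >> 6) & 1, (p0 >> 5) & 1, (p0 >> 4) & 1
    mm = p0 & 0b11
    # P1: W ~vvvv 1 pp
    w, vvvv_inv, pp = (p1 >> 7) & 1, (p1 >> 3) & 0xF, p1 & 0b11
    # P2: z L'L b ~V' aaa
    z, ll, bcast = (p2 >> 7) & 1, (p2 >> 5) & 0b11, (p2 >> 4) & 1
    v2_inv, aaa = (p2 >> 3) & 1, p2 & 0b111

    if aaa != 0 or z != 0 or bcast != 0:   # masking, zeroing, broadcast/rounding
        return None
    if ll > 1:                             # 512-bit (zmm) vector length
        return None
    if r2_inv != 1 or v2_inv != 1:         # a register index in 16..31
        return None
    if x_inv != 1 or b_inv != 1 or w != 0 or mm != 1:
        return None                        # would need the 3-byte VEX form
    if modrm >> 6 != 0b11:                 # memory operand: out of scope here
        return None

    # 2-byte VEX payload: ~R ~vvvv L pp
    byte1 = (r_inv << 7) | (vvvv_inv << 3) | ((ll & 1) << 2) | pp
    return bytes([0xC5, byte1]) + code[4:]


# vpxord %xmm1, %xmm1, %xmm1 from test_128_27 (6 bytes -> 4 bytes)
print(evex_to_vex2([0x62, 0xF1, 0x75, 0x08, 0xEF, 0xC9]).hex())  # c5f1efc9
# vmovaps (%rdi), %xmm0 {%k1} {z} keeps its EVEX encoding
print(evex_to_vex2([0x62, 0xF1, 0x7C, 0x89, 0x28, 0x07]))        # None
```

Both results agree with the CHECK lines in this file: the unmasked vpxord shrinks by two bytes, while the masked, zeroing vmovaps is left alone.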

llvm/trunk/test/CodeGen/X86/avx512vl-nontemporal.ll

	; RUN: llc < %s -march=x86-64 -mtriple=x86_64-apple-darwin -mcpu=skx --show-mc-encoding | FileCheck %s			; RUN: llc < %s -march=x86-64 -mtriple=x86_64-apple-darwin -mcpu=skx --show-mc-encoding | FileCheck %s

	define void @f256(<8 x float> %A, <8 x float> %AA, i8* %B, <4 x double> %C, <4 x double> %CC, i32 %D, <4 x i64> %E, <4 x i64> %EE) {			define void @f256(<8 x float> %A, <8 x float> %AA, i8* %B, <4 x double> %C, <4 x double> %CC, i32 %D, <4 x i64> %E, <4 x i64> %EE) {
	; CHECK: vmovntps %ymm{{.*}} ## encoding: [0x62			; CHECK: vmovntps %ymm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	%cast = bitcast i8* %B to <8 x float>*			%cast = bitcast i8* %B to <8 x float>*
	%A2 = fadd <8 x float> %A, %AA			%A2 = fadd <8 x float> %A, %AA
	store <8 x float> %A2, <8 x float>* %cast, align 64, !nontemporal !0			store <8 x float> %A2, <8 x float>* %cast, align 64, !nontemporal !0
	; CHECK: vmovntdq %ymm{{.*}} ## encoding: [0x62			; CHECK: vmovntdq %ymm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	%cast1 = bitcast i8* %B to <4 x i64>*			%cast1 = bitcast i8* %B to <4 x i64>*
	%E2 = add <4 x i64> %E, %EE			%E2 = add <4 x i64> %E, %EE
	store <4 x i64> %E2, <4 x i64>* %cast1, align 64, !nontemporal !0			store <4 x i64> %E2, <4 x i64>* %cast1, align 64, !nontemporal !0
	; CHECK: vmovntpd %ymm{{.*}} ## encoding: [0x62			; CHECK: vmovntpd %ymm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	%cast2 = bitcast i8* %B to <4 x double>*			%cast2 = bitcast i8* %B to <4 x double>*
	%C2 = fadd <4 x double> %C, %CC			%C2 = fadd <4 x double> %C, %CC
	store <4 x double> %C2, <4 x double>* %cast2, align 64, !nontemporal !0			store <4 x double> %C2, <4 x double>* %cast2, align 64, !nontemporal !0
	ret void			ret void
	}			}

	define void @f128(<4 x float> %A, <4 x float> %AA, i8* %B, <2 x double> %C, <2 x double> %CC, i32 %D, <2 x i64> %E, <2 x i64> %EE) {			define void @f128(<4 x float> %A, <4 x float> %AA, i8* %B, <2 x double> %C, <2 x double> %CC, i32 %D, <2 x i64> %E, <2 x i64> %EE) {
	; CHECK: vmovntps %xmm{{.*}} ## encoding: [0x62			; CHECK: vmovntps %xmm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	%cast = bitcast i8* %B to <4 x float>*			%cast = bitcast i8* %B to <4 x float>*
	%A2 = fadd <4 x float> %A, %AA			%A2 = fadd <4 x float> %A, %AA
	store <4 x float> %A2, <4 x float>* %cast, align 64, !nontemporal !0			store <4 x float> %A2, <4 x float>* %cast, align 64, !nontemporal !0
	; CHECK: vmovntdq %xmm{{.*}} ## encoding: [0x62			; CHECK: vmovntdq %xmm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	%cast1 = bitcast i8* %B to <2 x i64>*			%cast1 = bitcast i8* %B to <2 x i64>*
	%E2 = add <2 x i64> %E, %EE			%E2 = add <2 x i64> %E, %EE
	store <2 x i64> %E2, <2 x i64>* %cast1, align 64, !nontemporal !0			store <2 x i64> %E2, <2 x i64>* %cast1, align 64, !nontemporal !0
	; CHECK: vmovntpd %xmm{{.*}} ## encoding: [0x62			; CHECK: vmovntpd %xmm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5
	%cast2 = bitcast i8* %B to <2 x double>*			%cast2 = bitcast i8* %B to <2 x double>*
	%C2 = fadd <2 x double> %C, %CC			%C2 = fadd <2 x double> %C, %CC
	store <2 x double> %C2, <2 x double>* %cast2, align 64, !nontemporal !0			store <2 x double> %C2, <2 x double>* %cast2, align 64, !nontemporal !0
	ret void			ret void
	}			}
	!0 = !{i32 1}			!0 = !{i32 1}
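
The CHECK lines in this file pin only the mnemonic, the register class, and the first encoding byte, leaving the rest to a {{.*}} regex block. As a rough sketch of how such a pattern behaves, the Python below approximates FileCheck matching: text outside {{...}} is literal, text inside is a regular expression, and runs of spaces and tabs are treated as equivalent. The helper name filecheck_to_regex is invented for this illustration, Python's regex dialect stands in for the one FileCheck uses, and the two sample assembly lines are hand-written stand-ins for llc output, not captured from a real run.

```python
import re


def filecheck_to_regex(pattern):
    """Approximate a single FileCheck pattern as a compiled Python regex."""
    # re.split with one capture group yields: literal, regex, literal, ...
    parts = re.split(r"\{\{(.*?)\}\}", pattern)
    pieces = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            pieces.append("(?:" + part + ")")        # {{...}} regex block
        else:
            words = re.split(r"[ \t]+", part)         # literal text
            pieces.append(r"[ \t]+".join(re.escape(word) for word in words))
    return re.compile("".join(pieces))


old_check = filecheck_to_regex("vmovntps %ymm{{.*}} ## encoding: [0x62")
new_check = filecheck_to_regex("vmovntps %ymm{{.*}} ## EVEX TO VEX Compression encoding: [0xc5")

# Hand-written sample lines (assumed shape of llc --show-mc-encoding output).
old_line = "\tvmovntps\t%ymm0, (%rdi) ## encoding: [0x62,0xf1,0x7c,0x28,0x2b,0x07]"
new_line = "\tvmovntps\t%ymm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfc,0x2b,0x07]"

print(bool(old_check.search(old_line)), bool(old_check.search(new_line)))  # True False
print(bool(new_check.search(old_line)), bool(new_check.search(new_line)))  # False True
```

Because the old pattern requires "## encoding:" directly after the operands, the added "EVEX TO VEX Compression" comment alone is enough to make it fail, which is why every compressed instruction in these tests needed its CHECK line updated.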

llvm/trunk/test/CodeGen/X86/avx512vl-vbroadcast.ll

; CHECK-NEXT: retq
%b = insertelement <8 x float> undef, float %a, i32 0		%b = insertelement <8 x float> undef, float %a, i32 0
%c = shufflevector <8 x float> %b, <8 x float> undef, <8 x i32> zeroinitializer		%c = shufflevector <8 x float> %b, <8 x float> undef, <8 x i32> zeroinitializer
ret <8 x float> %c		ret <8 x float> %c
}		}

define <8 x float> @_ss8xfloat_mask(<8 x float> %i, float %a, <8 x i32> %mask1) {		define <8 x float> @_ss8xfloat_mask(<8 x float> %i, float %a, <8 x i32> %mask1) {
; CHECK-LABEL: _ss8xfloat_mask:		; CHECK-LABEL: _ss8xfloat_mask:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: vpxord %ymm3, %ymm3, %ymm3		; CHECK-NEXT: vpxor %ymm3, %ymm3, %ymm3
; CHECK-NEXT: vpcmpneqd %ymm3, %ymm2, %k1		; CHECK-NEXT: vpcmpneqd %ymm3, %ymm2, %k1
; CHECK-NEXT: vbroadcastss %xmm1, %ymm0 {%k1}		; CHECK-NEXT: vbroadcastss %xmm1, %ymm0 {%k1}
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%mask = icmp ne <8 x i32> %mask1, zeroinitializer		%mask = icmp ne <8 x i32> %mask1, zeroinitializer
%b = insertelement <8 x float> undef, float %a, i32 0		%b = insertelement <8 x float> undef, float %a, i32 0
%c = shufflevector <8 x float> %b, <8 x float> undef, <8 x i32> zeroinitializer		%c = shufflevector <8 x float> %b, <8 x float> undef, <8 x i32> zeroinitializer
%r = select <8 x i1> %mask, <8 x float> %c, <8 x float> %i		%r = select <8 x i1> %mask, <8 x float> %c, <8 x float> %i
ret <8 x float> %r		ret <8 x float> %r
}		}

define <8 x float> @_ss8xfloat_maskz(float %a, <8 x i32> %mask1) {		define <8 x float> @_ss8xfloat_maskz(float %a, <8 x i32> %mask1) {
; CHECK-LABEL: _ss8xfloat_maskz:		; CHECK-LABEL: _ss8xfloat_maskz:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: vpxord %ymm2, %ymm2, %ymm2		; CHECK-NEXT: vpxor %ymm2, %ymm2, %ymm2
; CHECK-NEXT: vpcmpneqd %ymm2, %ymm1, %k1		; CHECK-NEXT: vpcmpneqd %ymm2, %ymm1, %k1
; CHECK-NEXT: vbroadcastss %xmm0, %ymm0 {%k1} {z}		; CHECK-NEXT: vbroadcastss %xmm0, %ymm0 {%k1} {z}
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%mask = icmp ne <8 x i32> %mask1, zeroinitializer		%mask = icmp ne <8 x i32> %mask1, zeroinitializer
%b = insertelement <8 x float> undef, float %a, i32 0		%b = insertelement <8 x float> undef, float %a, i32 0
%c = shufflevector <8 x float> %b, <8 x float> undef, <8 x i32> zeroinitializer		%c = shufflevector <8 x float> %b, <8 x float> undef, <8 x i32> zeroinitializer
%r = select <8 x i1> %mask, <8 x float> %c, <8 x float> zeroinitializer		%r = select <8 x i1> %mask, <8 x float> %c, <8 x float> zeroinitializer
ret <8 x float> %r		ret <8 x float> %r
}		}

define <4 x float> @_inreg4xfloat(float %a) {		define <4 x float> @_inreg4xfloat(float %a) {
; CHECK-LABEL: _inreg4xfloat:		; CHECK-LABEL: _inreg4xfloat:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: vbroadcastss %xmm0, %xmm0		; CHECK-NEXT: vbroadcastss %xmm0, %xmm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%b = insertelement <4 x float> undef, float %a, i32 0		%b = insertelement <4 x float> undef, float %a, i32 0
%c = shufflevector <4 x float> %b, <4 x float> undef, <4 x i32> zeroinitializer		%c = shufflevector <4 x float> %b, <4 x float> undef, <4 x i32> zeroinitializer
ret <4 x float> %c		ret <4 x float> %c
}		}

define <4 x float> @_ss4xfloat_mask(<4 x float> %i, float %a, <4 x i32> %mask1) {		define <4 x float> @_ss4xfloat_mask(<4 x float> %i, float %a, <4 x i32> %mask1) {
; CHECK-LABEL: _ss4xfloat_mask:		; CHECK-LABEL: _ss4xfloat_mask:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: vpxord %xmm3, %xmm3, %xmm3		; CHECK-NEXT: vpxor %xmm3, %xmm3, %xmm3
; CHECK-NEXT: vpcmpneqd %xmm3, %xmm2, %k1		; CHECK-NEXT: vpcmpneqd %xmm3, %xmm2, %k1
; CHECK-NEXT: vbroadcastss %xmm1, %xmm0 {%k1}		; CHECK-NEXT: vbroadcastss %xmm1, %xmm0 {%k1}
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%mask = icmp ne <4 x i32> %mask1, zeroinitializer		%mask = icmp ne <4 x i32> %mask1, zeroinitializer
%b = insertelement <4 x float> undef, float %a, i32 0		%b = insertelement <4 x float> undef, float %a, i32 0
%c = shufflevector <4 x float> %b, <4 x float> undef, <4 x i32> zeroinitializer		%c = shufflevector <4 x float> %b, <4 x float> undef, <4 x i32> zeroinitializer
%r = select <4 x i1> %mask, <4 x float> %c, <4 x float> %i		%r = select <4 x i1> %mask, <4 x float> %c, <4 x float> %i
ret <4 x float> %r		ret <4 x float> %r
}		}

define <4 x float> @_ss4xfloat_maskz(float %a, <4 x i32> %mask1) {		define <4 x float> @_ss4xfloat_maskz(float %a, <4 x i32> %mask1) {
; CHECK-LABEL: _ss4xfloat_maskz:		; CHECK-LABEL: _ss4xfloat_maskz:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2		; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2
; CHECK-NEXT: vpcmpneqd %xmm2, %xmm1, %k1		; CHECK-NEXT: vpcmpneqd %xmm2, %xmm1, %k1
; CHECK-NEXT: vbroadcastss %xmm0, %xmm0 {%k1} {z}		; CHECK-NEXT: vbroadcastss %xmm0, %xmm0 {%k1} {z}
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%mask = icmp ne <4 x i32> %mask1, zeroinitializer		%mask = icmp ne <4 x i32> %mask1, zeroinitializer
%b = insertelement <4 x float> undef, float %a, i32 0		%b = insertelement <4 x float> undef, float %a, i32 0
%c = shufflevector <4 x float> %b, <4 x float> undef, <4 x i32> zeroinitializer		%c = shufflevector <4 x float> %b, <4 x float> undef, <4 x i32> zeroinitializer
%r = select <4 x i1> %mask, <4 x float> %c, <4 x float> zeroinitializer		%r = select <4 x i1> %mask, <4 x float> %c, <4 x float> zeroinitializer
ret <4 x float> %r		ret <4 x float> %r
}		}

define <4 x double> @_inreg4xdouble(double %a) {		define <4 x double> @_inreg4xdouble(double %a) {
; CHECK-LABEL: _inreg4xdouble:		; CHECK-LABEL: _inreg4xdouble:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: vbroadcastsd %xmm0, %ymm0		; CHECK-NEXT: vbroadcastsd %xmm0, %ymm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%b = insertelement <4 x double> undef, double %a, i32 0		%b = insertelement <4 x double> undef, double %a, i32 0
%c = shufflevector <4 x double> %b, <4 x double> undef, <4 x i32> zeroinitializer		%c = shufflevector <4 x double> %b, <4 x double> undef, <4 x i32> zeroinitializer
ret <4 x double> %c		ret <4 x double> %c
}		}

define <4 x double> @_ss4xdouble_mask(<4 x double> %i, double %a, <4 x i32> %mask1) {		define <4 x double> @_ss4xdouble_mask(<4 x double> %i, double %a, <4 x i32> %mask1) {
; CHECK-LABEL: _ss4xdouble_mask:		; CHECK-LABEL: _ss4xdouble_mask:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: vpxord %xmm3, %xmm3, %xmm3		; CHECK-NEXT: vpxor %xmm3, %xmm3, %xmm3
; CHECK-NEXT: vpcmpneqd %xmm3, %xmm2, %k1		; CHECK-NEXT: vpcmpneqd %xmm3, %xmm2, %k1
; CHECK-NEXT: vbroadcastsd %xmm1, %ymm0 {%k1}		; CHECK-NEXT: vbroadcastsd %xmm1, %ymm0 {%k1}
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%mask = icmp ne <4 x i32> %mask1, zeroinitializer		%mask = icmp ne <4 x i32> %mask1, zeroinitializer
%b = insertelement <4 x double> undef, double %a, i32 0		%b = insertelement <4 x double> undef, double %a, i32 0
%c = shufflevector <4 x double> %b, <4 x double> undef, <4 x i32> zeroinitializer		%c = shufflevector <4 x double> %b, <4 x double> undef, <4 x i32> zeroinitializer
%r = select <4 x i1> %mask, <4 x double> %c, <4 x double> %i		%r = select <4 x i1> %mask, <4 x double> %c, <4 x double> %i
ret <4 x double> %r		ret <4 x double> %r
}		}

define <4 x double> @_ss4xdouble_maskz(double %a, <4 x i32> %mask1) {		define <4 x double> @_ss4xdouble_maskz(double %a, <4 x i32> %mask1) {
; CHECK-LABEL: _ss4xdouble_maskz:		; CHECK-LABEL: _ss4xdouble_maskz:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: vpxord %xmm2, %xmm2, %xmm2		; CHECK-NEXT: vpxor %xmm2, %xmm2, %xmm2
; CHECK-NEXT: vpcmpneqd %xmm2, %xmm1, %k1		; CHECK-NEXT: vpcmpneqd %xmm2, %xmm1, %k1
; CHECK-NEXT: vbroadcastsd %xmm0, %ymm0 {%k1} {z}		; CHECK-NEXT: vbroadcastsd %xmm0, %ymm0 {%k1} {z}
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%mask = icmp ne <4 x i32> %mask1, zeroinitializer		%mask = icmp ne <4 x i32> %mask1, zeroinitializer
%b = insertelement <4 x double> undef, double %a, i32 0		%b = insertelement <4 x double> undef, double %a, i32 0
%c = shufflevector <4 x double> %b, <4 x double> undef, <4 x i32> zeroinitializer		%c = shufflevector <4 x double> %b, <4 x double> undef, <4 x i32> zeroinitializer
%r = select <4 x i1> %mask, <4 x double> %c, <4 x double> zeroinitializer		%r = select <4 x i1> %mask, <4 x double> %c, <4 x double> zeroinitializer
ret <4 x double> %r		ret <4 x double> %r
}		}

llvm/trunk/test/CodeGen/X86/compress_expand.ll

; KNL-NEXT: retq		; KNL-NEXT: retq
call void @llvm.masked.compressstore.v4f32(<4 x float> %V, float* %base, <4 x i1> %mask)		call void @llvm.masked.compressstore.v4f32(<4 x float> %V, float* %base, <4 x i1> %mask)
ret void		ret void
}		}

define <2 x float> @test13(float* %base, <2 x float> %src0, <2 x i32> %trigger) {		define <2 x float> @test13(float* %base, <2 x float> %src0, <2 x i32> %trigger) {
; SKX-LABEL: test13:		; SKX-LABEL: test13:
; SKX: # BB#0:		; SKX: # BB#0:
; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2		; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
; SKX-NEXT: vpblendd {{.*#+}} xmm1 = xmm1[0],xmm2[1],xmm1[2],xmm2[3]		; SKX-NEXT: vpblendd {{.*#+}} xmm1 = xmm1[0],xmm2[1],xmm1[2],xmm2[3]
; SKX-NEXT: vpcmpeqq %xmm2, %xmm1, %k0		; SKX-NEXT: vpcmpeqq %xmm2, %xmm1, %k0
; SKX-NEXT: kshiftlb $6, %k0, %k0		; SKX-NEXT: kshiftlb $6, %k0, %k0
; SKX-NEXT: kshiftrb $6, %k0, %k1		; SKX-NEXT: kshiftrb $6, %k0, %k1
; SKX-NEXT: vexpandps (%rdi), %xmm0 {%k1}		; SKX-NEXT: vexpandps (%rdi), %xmm0 {%k1}
; SKX-NEXT: retq		; SKX-NEXT: retq
;		;
; KNL-LABEL: test13:		; KNL-LABEL: test13:
; KNL-NEXT: retq
%mask = icmp eq <2 x i32> %trigger, zeroinitializer		%mask = icmp eq <2 x i32> %trigger, zeroinitializer
%res = call <2 x float> @llvm.masked.expandload.v2f32(float* %base, <2 x i1> %mask, <2 x float> %src0)		%res = call <2 x float> @llvm.masked.expandload.v2f32(float* %base, <2 x i1> %mask, <2 x float> %src0)
ret <2 x float> %res		ret <2 x float> %res
}		}

define void @test14(float* %base, <2 x float> %V, <2 x i32> %trigger) {		define void @test14(float* %base, <2 x float> %V, <2 x i32> %trigger) {
; SKX-LABEL: test14:		; SKX-LABEL: test14:
; SKX: # BB#0:		; SKX: # BB#0:
; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2		; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
; SKX-NEXT: vpblendd {{.*#+}} xmm1 = xmm1[0],xmm2[1],xmm1[2],xmm2[3]		; SKX-NEXT: vpblendd {{.*#+}} xmm1 = xmm1[0],xmm2[1],xmm1[2],xmm2[3]
; SKX-NEXT: vpcmpeqq %xmm2, %xmm1, %k0		; SKX-NEXT: vpcmpeqq %xmm2, %xmm1, %k0
; SKX-NEXT: kshiftlb $6, %k0, %k0		; SKX-NEXT: kshiftlb $6, %k0, %k0
; SKX-NEXT: kshiftrb $6, %k0, %k1		; SKX-NEXT: kshiftrb $6, %k0, %k1
; SKX-NEXT: vcompressps %xmm0, (%rdi) {%k1}		; SKX-NEXT: vcompressps %xmm0, (%rdi) {%k1}
; SKX-NEXT: retq		; SKX-NEXT: retq
;		;
; KNL-LABEL: test14:		; KNL-LABEL: test14:
; ALL-NEXT: retq
%res = call <32 x float> @llvm.masked.expandload.v32f32(float* %base, <32 x i1> %mask, <32 x float> %src0)		%res = call <32 x float> @llvm.masked.expandload.v32f32(float* %base, <32 x i1> %mask, <32 x float> %src0)
ret <32 x float> %res		ret <32 x float> %res
}		}

define <16 x double> @test16(double* %base, <16 x double> %src0, <16 x i32> %trigger) {		define <16 x double> @test16(double* %base, <16 x double> %src0, <16 x i32> %trigger) {
; SKX-LABEL: test16:		; SKX-LABEL: test16:
; SKX: # BB#0:		; SKX: # BB#0:
; SKX-NEXT: vextracti32x8 $1, %zmm2, %ymm3		; SKX-NEXT: vextracti32x8 $1, %zmm2, %ymm3
; SKX-NEXT: vpxord %ymm4, %ymm4, %ymm4		; SKX-NEXT: vpxor %ymm4, %ymm4, %ymm4
; SKX-NEXT: vpcmpeqd %ymm4, %ymm3, %k1		; SKX-NEXT: vpcmpeqd %ymm4, %ymm3, %k1
; SKX-NEXT: vpcmpeqd %ymm4, %ymm2, %k2		; SKX-NEXT: vpcmpeqd %ymm4, %ymm2, %k2
; SKX-NEXT: kmovb %k2, %eax		; SKX-NEXT: kmovb %k2, %eax
; SKX-NEXT: popcntl %eax, %eax		; SKX-NEXT: popcntl %eax, %eax
; SKX-NEXT: vexpandpd (%rdi,%rax,8), %zmm1 {%k1}		; SKX-NEXT: vexpandpd (%rdi,%rax,8), %zmm1 {%k1}
; SKX-NEXT: vexpandpd (%rdi), %zmm0 {%k2}		; SKX-NEXT: vexpandpd (%rdi), %zmm0 {%k2}
; SKX-NEXT: retq		; SKX-NEXT: retq
;		;

llvm/trunk/test/CodeGen/X86/evex-to-vex-compress.mir

Property	Old Value	New Value
svn:executable	null	*

This file has a very large number of changes (4,485 lines).

llvm/trunk/test/CodeGen/X86/fast-isel-store.ll

	; KNL32: # BB#0:			; KNL32: # BB#0:
	; KNL32-NEXT: vpaddd %xmm1, %xmm0, %xmm0			; KNL32-NEXT: vpaddd %xmm1, %xmm0, %xmm0
	; KNL32-NEXT: vmovdqu %xmm0, (%rdi)			; KNL32-NEXT: vmovdqu %xmm0, (%rdi)
	; KNL32-NEXT: retq			; KNL32-NEXT: retq
	;			;
	; SKX32-LABEL: test_store_4xi32:			; SKX32-LABEL: test_store_4xi32:
	; SKX32: # BB#0:			; SKX32: # BB#0:
	; SKX32-NEXT: vpaddd %xmm1, %xmm0, %xmm0			; SKX32-NEXT: vpaddd %xmm1, %xmm0, %xmm0
	; SKX32-NEXT: vmovdqu64 %xmm0, (%rdi)			; SKX32-NEXT: vmovdqu %xmm0, (%rdi)
	; SKX32-NEXT: retq			; SKX32-NEXT: retq
	%foo = add <4 x i32> %value, %value2 ; to force integer type on store			%foo = add <4 x i32> %value, %value2 ; to force integer type on store
	store <4 x i32> %foo, <4 x i32>* %addr, align 1			store <4 x i32> %foo, <4 x i32>* %addr, align 1
	ret <4 x i32> %foo			ret <4 x i32> %foo
	}			}

	define <4 x i32> @test_store_4xi32_aligned(<4 x i32>* nocapture %addr, <4 x i32> %value, <4 x i32> %value2) {			define <4 x i32> @test_store_4xi32_aligned(<4 x i32>* nocapture %addr, <4 x i32> %value, <4 x i32> %value2) {
	; SSE32-LABEL: test_store_4xi32_aligned:			; SSE32-LABEL: test_store_4xi32_aligned:
	; KNL32: # BB#0:			; KNL32: # BB#0:
	; KNL32-NEXT: vpaddd %xmm1, %xmm0, %xmm0			; KNL32-NEXT: vpaddd %xmm1, %xmm0, %xmm0
	; KNL32-NEXT: vmovdqa %xmm0, (%rdi)			; KNL32-NEXT: vmovdqa %xmm0, (%rdi)
	; KNL32-NEXT: retq			; KNL32-NEXT: retq
	;			;
	; SKX32-LABEL: test_store_4xi32_aligned:			; SKX32-LABEL: test_store_4xi32_aligned:
	; SKX32: # BB#0:			; SKX32: # BB#0:
	; SKX32-NEXT: vpaddd %xmm1, %xmm0, %xmm0			; SKX32-NEXT: vpaddd %xmm1, %xmm0, %xmm0
	; SKX32-NEXT: vmovdqa64 %xmm0, (%rdi)			; SKX32-NEXT: vmovdqa %xmm0, (%rdi)
	; SKX32-NEXT: retq			; SKX32-NEXT: retq
	%foo = add <4 x i32> %value, %value2 ; to force integer type on store			%foo = add <4 x i32> %value, %value2 ; to force integer type on store
	store <4 x i32> %foo, <4 x i32>* %addr, align 16			store <4 x i32> %foo, <4 x i32>* %addr, align 16
	ret <4 x i32> %foo			ret <4 x i32> %foo
	}			}

	define <4 x float> @test_store_4xf32(<4 x float>* nocapture %addr, <4 x float> %value) {			define <4 x float> @test_store_4xf32(<4 x float>* nocapture %addr, <4 x float> %value) {
	; SSE32-LABEL: test_store_4xf32:			; SSE32-LABEL: test_store_4xf32:

llvm/trunk/test/CodeGen/X86/fp-logic-replace.ll

	;			;
	; AVX-LABEL: FsANDPSrr:			; AVX-LABEL: FsANDPSrr:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vandps %xmm1, %xmm0, %xmm0 # encoding: [0xc5,0xf8,0x54,0xc1]			; AVX-NEXT: vandps %xmm1, %xmm0, %xmm0 # encoding: [0xc5,0xf8,0x54,0xc1]
	; AVX-NEXT: retq # encoding: [0xc3]			; AVX-NEXT: retq # encoding: [0xc3]
	;			;
	; AVX512DQ-LABEL: FsANDPSrr:			; AVX512DQ-LABEL: FsANDPSrr:
	; AVX512DQ: # BB#0:			; AVX512DQ: # BB#0:
	; AVX512DQ-NEXT: vandps %xmm1, %xmm0, %xmm0 # encoding: [0x62,0xf1,0x7c,0x08,0x54,0xc1]			; AVX512DQ-NEXT: vandps %xmm1, %xmm0, %xmm0 # EVEX TO VEX Compression encoding: [0xc5,0xf8,0x54,0xc1]
	; AVX512DQ-NEXT: retq # encoding: [0xc3]			; AVX512DQ-NEXT: retq # encoding: [0xc3]
	;
	%bc1 = bitcast double %x to i64			%bc1 = bitcast double %x to i64
	%bc2 = bitcast double %y to i64			%bc2 = bitcast double %y to i64
	%and = and i64 %bc1, %bc2			%and = and i64 %bc1, %bc2
	%bc3 = bitcast i64 %and to double			%bc3 = bitcast i64 %and to double
	ret double %bc3			ret double %bc3
	}			}

	define double @FsANDNPSrr(double %x, double %y) {			define double @FsANDNPSrr(double %x, double %y) {
	; SSE-LABEL: FsANDNPSrr:			; SSE-LABEL: FsANDNPSrr:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: andnps %xmm0, %xmm1 # encoding: [0x0f,0x55,0xc8]			; SSE-NEXT: andnps %xmm0, %xmm1 # encoding: [0x0f,0x55,0xc8]
	; SSE-NEXT: movaps %xmm1, %xmm0 # encoding: [0x0f,0x28,0xc1]			; SSE-NEXT: movaps %xmm1, %xmm0 # encoding: [0x0f,0x28,0xc1]
	; SSE-NEXT: retq # encoding: [0xc3]			; SSE-NEXT: retq # encoding: [0xc3]
	;			;
	; AVX-LABEL: FsANDNPSrr:			; AVX-LABEL: FsANDNPSrr:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vandnps %xmm0, %xmm1, %xmm0 # encoding: [0xc5,0xf0,0x55,0xc0]			; AVX-NEXT: vandnps %xmm0, %xmm1, %xmm0 # encoding: [0xc5,0xf0,0x55,0xc0]
	; AVX-NEXT: retq # encoding: [0xc3]			; AVX-NEXT: retq # encoding: [0xc3]
	;			;
	; AVX512DQ-LABEL: FsANDNPSrr:			; AVX512DQ-LABEL: FsANDNPSrr:
	; AVX512DQ: # BB#0:			; AVX512DQ: # BB#0:
	; AVX512DQ-NEXT: vandnps %xmm0, %xmm1, %xmm0 # encoding: [0x62,0xf1,0x74,0x08,0x55,0xc0]			; AVX512DQ-NEXT: vandnps %xmm0, %xmm1, %xmm0 # EVEX TO VEX Compression encoding: [0xc5,0xf0,0x55,0xc0]
	; AVX512DQ-NEXT: retq # encoding: [0xc3]			; AVX512DQ-NEXT: retq # encoding: [0xc3]
	;
	%bc1 = bitcast double %x to i64			%bc1 = bitcast double %x to i64
	%bc2 = bitcast double %y to i64			%bc2 = bitcast double %y to i64
	%not = xor i64 %bc2, -1			%not = xor i64 %bc2, -1
	%and = and i64 %bc1, %not			%and = and i64 %bc1, %not
	%bc3 = bitcast i64 %and to double			%bc3 = bitcast i64 %and to double
	ret double %bc3			ret double %bc3
	}			}

	define double @FsORPSrr(double %x, double %y) {			define double @FsORPSrr(double %x, double %y) {
	; SSE-LABEL: FsORPSrr:			; SSE-LABEL: FsORPSrr:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: orps %xmm1, %xmm0 # encoding: [0x0f,0x56,0xc1]			; SSE-NEXT: orps %xmm1, %xmm0 # encoding: [0x0f,0x56,0xc1]
	; SSE-NEXT: retq # encoding: [0xc3]			; SSE-NEXT: retq # encoding: [0xc3]
	;			;
	; AVX-LABEL: FsORPSrr:			; AVX-LABEL: FsORPSrr:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vorps %xmm1, %xmm0, %xmm0 # encoding: [0xc5,0xf8,0x56,0xc1]			; AVX-NEXT: vorps %xmm1, %xmm0, %xmm0 # encoding: [0xc5,0xf8,0x56,0xc1]
	; AVX-NEXT: retq # encoding: [0xc3]			; AVX-NEXT: retq # encoding: [0xc3]
	;			;
	; AVX512DQ-LABEL: FsORPSrr:			; AVX512DQ-LABEL: FsORPSrr:
	; AVX512DQ: # BB#0:			; AVX512DQ: # BB#0:
	; AVX512DQ-NEXT: vorps %xmm1, %xmm0, %xmm0 # encoding: [0x62,0xf1,0x7c,0x08,0x56,0xc1]			; AVX512DQ-NEXT: vorps %xmm1, %xmm0, %xmm0 # EVEX TO VEX Compression encoding: [0xc5,0xf8,0x56,0xc1]
	; AVX512DQ-NEXT: retq # encoding: [0xc3]			; AVX512DQ-NEXT: retq # encoding: [0xc3]
	;
	%bc1 = bitcast double %x to i64			%bc1 = bitcast double %x to i64
	%bc2 = bitcast double %y to i64			%bc2 = bitcast double %y to i64
	%or = or i64 %bc1, %bc2			%or = or i64 %bc1, %bc2
	%bc3 = bitcast i64 %or to double			%bc3 = bitcast i64 %or to double
	ret double %bc3			ret double %bc3
	}			}

	define double @FsXORPSrr(double %x, double %y) {			define double @FsXORPSrr(double %x, double %y) {
	; SSE-LABEL: FsXORPSrr:			; SSE-LABEL: FsXORPSrr:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: xorps %xmm1, %xmm0 # encoding: [0x0f,0x57,0xc1]			; SSE-NEXT: xorps %xmm1, %xmm0 # encoding: [0x0f,0x57,0xc1]
	; SSE-NEXT: retq # encoding: [0xc3]			; SSE-NEXT: retq # encoding: [0xc3]
	;			;
	; AVX-LABEL: FsXORPSrr:			; AVX-LABEL: FsXORPSrr:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %xmm1, %xmm0, %xmm0 # encoding: [0xc5,0xf8,0x57,0xc1]			; AVX-NEXT: vxorps %xmm1, %xmm0, %xmm0 # encoding: [0xc5,0xf8,0x57,0xc1]
	; AVX-NEXT: retq # encoding: [0xc3]			; AVX-NEXT: retq # encoding: [0xc3]
	;			;
	; AVX512DQ-LABEL: FsXORPSrr:			; AVX512DQ-LABEL: FsXORPSrr:
	; AVX512DQ: # BB#0:			; AVX512DQ: # BB#0:
	; AVX512DQ-NEXT: vxorps %xmm1, %xmm0, %xmm0 # encoding: [0x62,0xf1,0x7c,0x08,0x57,0xc1]			; AVX512DQ-NEXT: vxorps %xmm1, %xmm0, %xmm0 # EVEX TO VEX Compression encoding: [0xc5,0xf8,0x57,0xc1]
	; AVX512DQ-NEXT: retq # encoding: [0xc3]			; AVX512DQ-NEXT: retq # encoding: [0xc3]
	;
	%bc1 = bitcast double %x to i64			%bc1 = bitcast double %x to i64
	%bc2 = bitcast double %y to i64			%bc2 = bitcast double %y to i64
	%xor = xor i64 %bc1, %bc2			%xor = xor i64 %bc1, %bc2
	%bc3 = bitcast i64 %xor to double			%bc3 = bitcast i64 %xor to double
	ret double %bc3			ret double %bc3
	}			}

llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll

	; KNL_32-NEXT: retl			; KNL_32-NEXT: retl
	;			;
	; SKX-LABEL: test6:			; SKX-LABEL: test6:
	; SKX: # BB#0:			; SKX: # BB#0:
	; SKX-NEXT: kxnorw %k0, %k0, %k1			; SKX-NEXT: kxnorw %k0, %k0, %k1
	; SKX-NEXT: kxnorw %k0, %k0, %k2			; SKX-NEXT: kxnorw %k0, %k0, %k2
	; SKX-NEXT: vpgatherqd (,%zmm1), %ymm2 {%k2}			; SKX-NEXT: vpgatherqd (,%zmm1), %ymm2 {%k2}
	; SKX-NEXT: vpscatterqd %ymm0, (,%zmm1) {%k1}			; SKX-NEXT: vpscatterqd %ymm0, (,%zmm1) {%k1}
	; SKX-NEXT: vmovdqa64 %ymm2, %ymm0			; SKX-NEXT: vmovdqa %ymm2, %ymm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	;			;
	; SKX_32-LABEL: test6:			; SKX_32-LABEL: test6:
	; SKX_32: # BB#0:			; SKX_32: # BB#0:
	; SKX_32-NEXT: kxnorw %k0, %k0, %k1			; SKX_32-NEXT: kxnorw %k0, %k0, %k1
	; SKX_32-NEXT: kxnorw %k0, %k0, %k2			; SKX_32-NEXT: kxnorw %k0, %k0, %k2
	; SKX_32-NEXT: vpgatherdd (,%ymm1), %ymm2 {%k2}			; SKX_32-NEXT: vpgatherdd (,%ymm1), %ymm2 {%k2}
	; SKX_32-NEXT: vpscatterdd %ymm0, (,%ymm1) {%k1}			; SKX_32-NEXT: vpscatterdd %ymm0, (,%ymm1) {%k1}
	; SKX_32-NEXT: vmovdqa64 %ymm2, %ymm0			; SKX_32-NEXT: vmovdqa %ymm2, %ymm0
	; SKX_32-NEXT: retl			; SKX_32-NEXT: retl

	%a = call <8 x i32> @llvm.masked.gather.v8i32(<8 x i32*> %ptr, i32 4, <8 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>, <8 x i32> undef)			%a = call <8 x i32> @llvm.masked.gather.v8i32(<8 x i32*> %ptr, i32 4, <8 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>, <8 x i32> undef)

	call void @llvm.masked.scatter.v8i32(<8 x i32> %a1, <8 x i32*> %ptr, i32 4, <8 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>)			call void @llvm.masked.scatter.v8i32(<8 x i32> %a1, <8 x i32*> %ptr, i32 4, <8 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>)
	ret <8 x i32>%a			ret <8 x i32>%a
	}			}

	; KNL_32-NEXT: vpaddd %ymm2, %ymm1, %ymm0			; KNL_32-NEXT: vpaddd %ymm2, %ymm1, %ymm0
	; KNL_32-NEXT: retl			; KNL_32-NEXT: retl
	;			;
	; SKX-LABEL: test7:			; SKX-LABEL: test7:
	; SKX: # BB#0:			; SKX: # BB#0:
	; SKX-NEXT: kmovb %esi, %k1			; SKX-NEXT: kmovb %esi, %k1
	; SKX-NEXT: kmovw %k1, %k2			; SKX-NEXT: kmovw %k1, %k2
	; SKX-NEXT: vpgatherdd (%rdi,%ymm0,4), %ymm1 {%k2}			; SKX-NEXT: vpgatherdd (%rdi,%ymm0,4), %ymm1 {%k2}
	; SKX-NEXT: vmovdqa64 %ymm1, %ymm2			; SKX-NEXT: vmovdqa %ymm1, %ymm2
	; SKX-NEXT: vpgatherdd (%rdi,%ymm0,4), %ymm2 {%k1}			; SKX-NEXT: vpgatherdd (%rdi,%ymm0,4), %ymm2 {%k1}
	; SKX-NEXT: vpaddd %ymm2, %ymm1, %ymm0			; SKX-NEXT: vpaddd %ymm2, %ymm1, %ymm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	;			;
	; SKX_32-LABEL: test7:			; SKX_32-LABEL: test7:
	; SKX_32: # BB#0:			; SKX_32: # BB#0:
	; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax			; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; SKX_32-NEXT: kmovb {{[0-9]+}}(%esp), %k1			; SKX_32-NEXT: kmovb {{[0-9]+}}(%esp), %k1
	; SKX_32-NEXT: kmovw %k1, %k2			; SKX_32-NEXT: kmovw %k1, %k2
	; SKX_32-NEXT: vpgatherdd (%eax,%ymm0,4), %ymm1 {%k2}			; SKX_32-NEXT: vpgatherdd (%eax,%ymm0,4), %ymm1 {%k2}
	; SKX_32-NEXT: vmovdqa64 %ymm1, %ymm2			; SKX_32-NEXT: vmovdqa %ymm1, %ymm2
	; SKX_32-NEXT: vpgatherdd (%eax,%ymm0,4), %ymm2 {%k1}			; SKX_32-NEXT: vpgatherdd (%eax,%ymm0,4), %ymm2 {%k1}
	; SKX_32-NEXT: vpaddd %ymm2, %ymm1, %ymm0			; SKX_32-NEXT: vpaddd %ymm2, %ymm1, %ymm0
	; SKX_32-NEXT: retl			; SKX_32-NEXT: retl

	%broadcast.splatinsert = insertelement <8 x i32> undef, i32 %base, i32 0			%broadcast.splatinsert = insertelement <8 x i32> undef, i32 %base, i32 0
	%broadcast.splat = shufflevector <8 x i32> %broadcast.splatinsert, <8 x i32> undef, <8 x i32> zeroinitializer			%broadcast.splat = shufflevector <8 x i32> %broadcast.splatinsert, <8 x i32> undef, <8 x i32> zeroinitializer

	%gep.random = getelementptr i32, <8 x i32*> %broadcast.splat, <8 x i32> %ind			%gep.random = getelementptr i32, <8 x i32*> %broadcast.splat, <8 x i32> %ind
	; KNL_32-NEXT: vmovdqa %xmm2, %xmm0			; KNL_32-NEXT: vmovdqa %xmm2, %xmm0
	; KNL_32-NEXT: retl			; KNL_32-NEXT: retl
	;			;
	; SKX-LABEL: test23:			; SKX-LABEL: test23:
	; SKX: # BB#0:			; SKX: # BB#0:
	; SKX-NEXT: vpsllq $63, %xmm1, %xmm1			; SKX-NEXT: vpsllq $63, %xmm1, %xmm1
	; SKX-NEXT: vptestmq %xmm1, %xmm1, %k1			; SKX-NEXT: vptestmq %xmm1, %xmm1, %k1
	; SKX-NEXT: vpgatherqq (%rdi,%xmm0,8), %xmm2 {%k1}			; SKX-NEXT: vpgatherqq (%rdi,%xmm0,8), %xmm2 {%k1}
	; SKX-NEXT: vmovdqa64 %xmm2, %xmm0			; SKX-NEXT: vmovdqa %xmm2, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	;			;
	; SKX_32-LABEL: test23:			; SKX_32-LABEL: test23:
	; SKX_32: # BB#0:			; SKX_32: # BB#0:
	; SKX_32-NEXT: vpsllq $63, %xmm1, %xmm1			; SKX_32-NEXT: vpsllq $63, %xmm1, %xmm1
	; SKX_32-NEXT: vptestmq %xmm1, %xmm1, %k1			; SKX_32-NEXT: vptestmq %xmm1, %xmm1, %k1
	; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax			; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; SKX_32-NEXT: vpgatherqq (%eax,%xmm0,8), %xmm2 {%k1}			; SKX_32-NEXT: vpgatherqq (%eax,%xmm0,8), %xmm2 {%k1}
	; SKX_32-NEXT: vmovdqa64 %xmm2, %xmm0			; SKX_32-NEXT: vmovdqa %xmm2, %xmm0
	; SKX_32-NEXT: retl			; SKX_32-NEXT: retl
	%sext_ind = sext <2 x i32> %ind to <2 x i64>			%sext_ind = sext <2 x i32> %ind to <2 x i64>
	%gep.random = getelementptr i32, i32* %base, <2 x i64> %sext_ind			%gep.random = getelementptr i32, i32* %base, <2 x i64> %sext_ind
	%res = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %gep.random, i32 4, <2 x i1> %mask, <2 x i32> %src0)			%res = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %gep.random, i32 4, <2 x i1> %mask, <2 x i32> %src0)
	ret <2 x i32>%res			ret <2 x i32>%res
	}			}

	define <2 x i32> @test24(i32* %base, <2 x i32> %ind) {			define <2 x i32> @test24(i32* %base, <2 x i32> %ind) {
	; KNL_32-NEXT: vpgatherqq (%eax,%zmm0,8), %zmm1 {%k1}			; KNL_32-NEXT: vpgatherqq (%eax,%zmm0,8), %zmm1 {%k1}
	; KNL_32-NEXT: vmovdqa %xmm1, %xmm0			; KNL_32-NEXT: vmovdqa %xmm1, %xmm0
	; KNL_32-NEXT: retl			; KNL_32-NEXT: retl
	;			;
	; SKX-LABEL: test24:			; SKX-LABEL: test24:
	; SKX: # BB#0:			; SKX: # BB#0:
	; SKX-NEXT: kxnorw %k0, %k0, %k1			; SKX-NEXT: kxnorw %k0, %k0, %k1
	; SKX-NEXT: vpgatherqq (%rdi,%xmm0,8), %xmm1 {%k1}			; SKX-NEXT: vpgatherqq (%rdi,%xmm0,8), %xmm1 {%k1}
	; SKX-NEXT: vmovdqa64 %xmm1, %xmm0			; SKX-NEXT: vmovdqa %xmm1, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	;			;
	; SKX_32-LABEL: test24:			; SKX_32-LABEL: test24:
	; SKX_32: # BB#0:			; SKX_32: # BB#0:
	; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax			; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; SKX_32-NEXT: kxnorw %k0, %k0, %k1			; SKX_32-NEXT: kxnorw %k0, %k0, %k1
	; SKX_32-NEXT: vpgatherqq (%eax,%xmm0,8), %xmm1 {%k1}			; SKX_32-NEXT: vpgatherqq (%eax,%xmm0,8), %xmm1 {%k1}
	; SKX_32-NEXT: vmovdqa64 %xmm1, %xmm0			; SKX_32-NEXT: vmovdqa %xmm1, %xmm0
	; SKX_32-NEXT: retl			; SKX_32-NEXT: retl
	%sext_ind = sext <2 x i32> %ind to <2 x i64>			%sext_ind = sext <2 x i32> %ind to <2 x i64>
	%gep.random = getelementptr i32, i32* %base, <2 x i64> %sext_ind			%gep.random = getelementptr i32, i32* %base, <2 x i64> %sext_ind
	%res = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %gep.random, i32 4, <2 x i1> <i1 true, i1 true>, <2 x i32> undef)			%res = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %gep.random, i32 4, <2 x i1> <i1 true, i1 true>, <2 x i32> undef)
	ret <2 x i32>%res			ret <2 x i32>%res
	}			}

	define <2 x i64> @test25(i64* %base, <2 x i32> %ind, <2 x i1> %mask, <2 x i64> %src0) {			define <2 x i64> @test25(i64* %base, <2 x i32> %ind, <2 x i1> %mask, <2 x i64> %src0) {
	; KNL_32-NEXT: vmovdqa %xmm2, %xmm0			; KNL_32-NEXT: vmovdqa %xmm2, %xmm0
	; KNL_32-NEXT: retl			; KNL_32-NEXT: retl
	;			;
	; SKX-LABEL: test25:			; SKX-LABEL: test25:
	; SKX: # BB#0:			; SKX: # BB#0:
	; SKX-NEXT: vpsllq $63, %xmm1, %xmm1			; SKX-NEXT: vpsllq $63, %xmm1, %xmm1
	; SKX-NEXT: vptestmq %xmm1, %xmm1, %k1			; SKX-NEXT: vptestmq %xmm1, %xmm1, %k1
	; SKX-NEXT: vpgatherqq (%rdi,%xmm0,8), %xmm2 {%k1}			; SKX-NEXT: vpgatherqq (%rdi,%xmm0,8), %xmm2 {%k1}
	; SKX-NEXT: vmovdqa64 %xmm2, %xmm0			; SKX-NEXT: vmovdqa %xmm2, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	;			;
	; SKX_32-LABEL: test25:			; SKX_32-LABEL: test25:
	; SKX_32: # BB#0:			; SKX_32: # BB#0:
	; SKX_32-NEXT: vpsllq $63, %xmm1, %xmm1			; SKX_32-NEXT: vpsllq $63, %xmm1, %xmm1
	; SKX_32-NEXT: vptestmq %xmm1, %xmm1, %k1			; SKX_32-NEXT: vptestmq %xmm1, %xmm1, %k1
	; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax			; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; SKX_32-NEXT: vpgatherqq (%eax,%xmm0,8), %xmm2 {%k1}			; SKX_32-NEXT: vpgatherqq (%eax,%xmm0,8), %xmm2 {%k1}
	; SKX_32-NEXT: vmovdqa64 %xmm2, %xmm0			; SKX_32-NEXT: vmovdqa %xmm2, %xmm0
	; SKX_32-NEXT: retl			; SKX_32-NEXT: retl
	%sext_ind = sext <2 x i32> %ind to <2 x i64>			%sext_ind = sext <2 x i32> %ind to <2 x i64>
	%gep.random = getelementptr i64, i64* %base, <2 x i64> %sext_ind			%gep.random = getelementptr i64, i64* %base, <2 x i64> %sext_ind
	%res = call <2 x i64> @llvm.masked.gather.v2i64(<2 x i64*> %gep.random, i32 8, <2 x i1> %mask, <2 x i64> %src0)			%res = call <2 x i64> @llvm.masked.gather.v2i64(<2 x i64*> %gep.random, i32 8, <2 x i1> %mask, <2 x i64> %src0)
	ret <2 x i64>%res			ret <2 x i64>%res
	}			}

	define <2 x i64> @test26(i64* %base, <2 x i32> %ind, <2 x i64> %src0) {			define <2 x i64> @test26(i64* %base, <2 x i32> %ind, <2 x i64> %src0) {
	; KNL_32-NEXT: vpgatherqq (%eax,%zmm0,8), %zmm1 {%k1}			; KNL_32-NEXT: vpgatherqq (%eax,%zmm0,8), %zmm1 {%k1}
	; KNL_32-NEXT: vmovdqa %xmm1, %xmm0			; KNL_32-NEXT: vmovdqa %xmm1, %xmm0
	; KNL_32-NEXT: retl			; KNL_32-NEXT: retl
	;			;
	; SKX-LABEL: test26:			; SKX-LABEL: test26:
	; SKX: # BB#0:			; SKX: # BB#0:
	; SKX-NEXT: kxnorw %k0, %k0, %k1			; SKX-NEXT: kxnorw %k0, %k0, %k1
	; SKX-NEXT: vpgatherqq (%rdi,%xmm0,8), %xmm1 {%k1}			; SKX-NEXT: vpgatherqq (%rdi,%xmm0,8), %xmm1 {%k1}
	; SKX-NEXT: vmovdqa64 %xmm1, %xmm0			; SKX-NEXT: vmovdqa %xmm1, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	;			;
	; SKX_32-LABEL: test26:			; SKX_32-LABEL: test26:
	; SKX_32: # BB#0:			; SKX_32: # BB#0:
	; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax			; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; SKX_32-NEXT: kxnorw %k0, %k0, %k1			; SKX_32-NEXT: kxnorw %k0, %k0, %k1
	; SKX_32-NEXT: vpgatherqq (%eax,%xmm0,8), %xmm1 {%k1}			; SKX_32-NEXT: vpgatherqq (%eax,%xmm0,8), %xmm1 {%k1}
	; SKX_32-NEXT: vmovdqa64 %xmm1, %xmm0			; SKX_32-NEXT: vmovdqa %xmm1, %xmm0
	; SKX_32-NEXT: retl			; SKX_32-NEXT: retl
	%sext_ind = sext <2 x i32> %ind to <2 x i64>			%sext_ind = sext <2 x i32> %ind to <2 x i64>
	%gep.random = getelementptr i64, i64* %base, <2 x i64> %sext_ind			%gep.random = getelementptr i64, i64* %base, <2 x i64> %sext_ind
	%res = call <2 x i64> @llvm.masked.gather.v2i64(<2 x i64*> %gep.random, i32 8, <2 x i1> <i1 true, i1 true>, <2 x i64> %src0)			%res = call <2 x i64> @llvm.masked.gather.v2i64(<2 x i64*> %gep.random, i32 8, <2 x i1> <i1 true, i1 true>, <2 x i64> %src0)
	ret <2 x i64>%res			ret <2 x i64>%res
	}			}

	; Result type requires widening; all-ones mask			; Result type requires widening; all-ones mask

llvm/trunk/test/CodeGen/X86/masked_memop.ll

	; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2			; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0			; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0
	; AVX512F-NEXT: vmaskmovpd (%rdi), %xmm0, %xmm2			; AVX512F-NEXT: vmaskmovpd (%rdi), %xmm0, %xmm2
	; AVX512F-NEXT: vblendvpd %xmm0, %xmm2, %xmm1, %xmm0			; AVX512F-NEXT: vblendvpd %xmm0, %xmm2, %xmm1, %xmm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test6:			; SKX-LABEL: test6:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2			; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; SKX-NEXT: vpcmpeqq %xmm2, %xmm0, %k1			; SKX-NEXT: vpcmpeqq %xmm2, %xmm0, %k1
	; SKX-NEXT: vmovupd (%rdi), %xmm1 {%k1}			; SKX-NEXT: vmovupd (%rdi), %xmm1 {%k1}
	; SKX-NEXT: vmovapd %xmm1, %xmm0			; SKX-NEXT: vmovapd %xmm1, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <2 x i64> %trigger, zeroinitializer			%mask = icmp eq <2 x i64> %trigger, zeroinitializer
	%res = call <2 x double> @llvm.masked.load.v2f64.p0v2f64(<2 x double>* %addr, i32 4, <2 x i1>%mask, <2 x double>%dst)			%res = call <2 x double> @llvm.masked.load.v2f64.p0v2f64(<2 x double>* %addr, i32 4, <2 x i1>%mask, <2 x double>%dst)
	ret <2 x double> %res			ret <2 x double> %res
	}			}
	; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2			; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; AVX512F-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm0			; AVX512F-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm0
	; AVX512F-NEXT: vmaskmovps (%rdi), %xmm0, %xmm2			; AVX512F-NEXT: vmaskmovps (%rdi), %xmm0, %xmm2
	; AVX512F-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0			; AVX512F-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test7:			; SKX-LABEL: test7:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2			; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; SKX-NEXT: vpcmpeqd %xmm2, %xmm0, %k1			; SKX-NEXT: vpcmpeqd %xmm2, %xmm0, %k1
	; SKX-NEXT: vmovups (%rdi), %xmm1 {%k1}			; SKX-NEXT: vmovups (%rdi), %xmm1 {%k1}
	; SKX-NEXT: vmovaps %xmm1, %xmm0			; SKX-NEXT: vmovaps %xmm1, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <4 x i32> %trigger, zeroinitializer			%mask = icmp eq <4 x i32> %trigger, zeroinitializer
	%res = call <4 x float> @llvm.masked.load.v4f32.p0v4f32(<4 x float>* %addr, i32 4, <4 x i1>%mask, <4 x float>%dst)			%res = call <4 x float> @llvm.masked.load.v4f32.p0v4f32(<4 x float>* %addr, i32 4, <4 x i1>%mask, <4 x float>%dst)
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2			; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; AVX512F-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm0			; AVX512F-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm0
	; AVX512F-NEXT: vpmaskmovd (%rdi), %xmm0, %xmm2			; AVX512F-NEXT: vpmaskmovd (%rdi), %xmm0, %xmm2
	; AVX512F-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0			; AVX512F-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test8:			; SKX-LABEL: test8:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2			; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; SKX-NEXT: vpcmpeqd %xmm2, %xmm0, %k1			; SKX-NEXT: vpcmpeqd %xmm2, %xmm0, %k1
	; SKX-NEXT: vmovdqu32 (%rdi), %xmm1 {%k1}			; SKX-NEXT: vmovdqu32 (%rdi), %xmm1 {%k1}
	; SKX-NEXT: vmovdqa64 %xmm1, %xmm0			; SKX-NEXT: vmovdqa %xmm1, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <4 x i32> %trigger, zeroinitializer			%mask = icmp eq <4 x i32> %trigger, zeroinitializer
	%res = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %addr, i32 4, <4 x i1>%mask, <4 x i32>%dst)			%res = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %addr, i32 4, <4 x i1>%mask, <4 x i32>%dst)
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}

	define void @test9(<4 x i32> %trigger, <4 x i32>* %addr, <4 x i32> %val) {			define void @test9(<4 x i32> %trigger, <4 x i32>* %addr, <4 x i32> %val) {
	; AVX1-LABEL: test9:			; AVX1-LABEL: test9:
	; AVX512F: ## BB#0:			; AVX512F: ## BB#0:
	; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2			; AVX512F-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; AVX512F-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm0			; AVX512F-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm0
	; AVX512F-NEXT: vpmaskmovd %xmm1, %xmm0, (%rdi)			; AVX512F-NEXT: vpmaskmovd %xmm1, %xmm0, (%rdi)
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test9:			; SKX-LABEL: test9:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2			; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; SKX-NEXT: vpcmpeqd %xmm2, %xmm0, %k1			; SKX-NEXT: vpcmpeqd %xmm2, %xmm0, %k1
	; SKX-NEXT: vmovdqu32 %xmm1, (%rdi) {%k1}			; SKX-NEXT: vmovdqu32 %xmm1, (%rdi) {%k1}
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <4 x i32> %trigger, zeroinitializer			%mask = icmp eq <4 x i32> %trigger, zeroinitializer
	call void @llvm.masked.store.v4i32.p0v4i32(<4 x i32>%val, <4 x i32>* %addr, i32 4, <4 x i1>%mask)			call void @llvm.masked.store.v4i32.p0v4i32(<4 x i32>%val, <4 x i32>* %addr, i32 4, <4 x i1>%mask)
	ret void			ret void
	}			}

	; AVX512F-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm0			; AVX512F-NEXT: vpcmpeqd %xmm2, %xmm0, %xmm0
	; AVX512F-NEXT: vpmovsxdq %xmm0, %ymm0			; AVX512F-NEXT: vpmovsxdq %xmm0, %ymm0
	; AVX512F-NEXT: vmaskmovpd (%rdi), %ymm0, %ymm2			; AVX512F-NEXT: vmaskmovpd (%rdi), %ymm0, %ymm2
	; AVX512F-NEXT: vblendvpd %ymm0, %ymm2, %ymm1, %ymm0			; AVX512F-NEXT: vblendvpd %ymm0, %ymm2, %ymm1, %ymm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test10:			; SKX-LABEL: test10:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2			; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; SKX-NEXT: vpcmpeqd %xmm2, %xmm0, %k1			; SKX-NEXT: vpcmpeqd %xmm2, %xmm0, %k1
	; SKX-NEXT: vmovapd (%rdi), %ymm1 {%k1}			; SKX-NEXT: vmovapd (%rdi), %ymm1 {%k1}
	; SKX-NEXT: vmovapd %ymm1, %ymm0			; SKX-NEXT: vmovapd %ymm1, %ymm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <4 x i32> %trigger, zeroinitializer			%mask = icmp eq <4 x i32> %trigger, zeroinitializer
	%res = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* %addr, i32 32, <4 x i1>%mask, <4 x double>%dst)			%res = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* %addr, i32 32, <4 x i1>%mask, <4 x double>%dst)
	ret <4 x double> %res			ret <4 x double> %res
	}			}
	; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512F-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0			; AVX512F-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
	; AVX512F-NEXT: vpmovsxdq %xmm0, %ymm0			; AVX512F-NEXT: vpmovsxdq %xmm0, %ymm0
	; AVX512F-NEXT: vmaskmovpd (%rdi), %ymm0, %ymm0			; AVX512F-NEXT: vmaskmovpd (%rdi), %ymm0, %ymm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test10b:			; SKX-LABEL: test10b:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm1, %xmm1, %xmm1			; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; SKX-NEXT: vpcmpeqd %xmm1, %xmm0, %k1			; SKX-NEXT: vpcmpeqd %xmm1, %xmm0, %k1
	; SKX-NEXT: vmovapd (%rdi), %ymm0 {%k1} {z}			; SKX-NEXT: vmovapd (%rdi), %ymm0 {%k1} {z}
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <4 x i32> %trigger, zeroinitializer			%mask = icmp eq <4 x i32> %trigger, zeroinitializer
	%res = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* %addr, i32 32, <4 x i1>%mask, <4 x double>zeroinitializer)			%res = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* %addr, i32 32, <4 x i1>%mask, <4 x double>zeroinitializer)
	ret <4 x double> %res			ret <4 x double> %res
	}			}

	; AVX512F-NEXT: kshiftlw $8, %k0, %k0			; AVX512F-NEXT: kshiftlw $8, %k0, %k0
	; AVX512F-NEXT: kshiftrw $8, %k0, %k1			; AVX512F-NEXT: kshiftrw $8, %k0, %k1
	; AVX512F-NEXT: vmovups (%rdi), %zmm1 {%k1}			; AVX512F-NEXT: vmovups (%rdi), %zmm1 {%k1}
	; AVX512F-NEXT: vmovaps %ymm1, %ymm0			; AVX512F-NEXT: vmovaps %ymm1, %ymm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test11a:			; SKX-LABEL: test11a:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %ymm2, %ymm2, %ymm2			; SKX-NEXT: vpxor %ymm2, %ymm2, %ymm2
	; SKX-NEXT: vpcmpeqd %ymm2, %ymm0, %k1			; SKX-NEXT: vpcmpeqd %ymm2, %ymm0, %k1
	; SKX-NEXT: vmovaps (%rdi), %ymm1 {%k1}			; SKX-NEXT: vmovaps (%rdi), %ymm1 {%k1}
	; SKX-NEXT: vmovaps %ymm1, %ymm0			; SKX-NEXT: vmovaps %ymm1, %ymm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <8 x i32> %trigger, zeroinitializer			%mask = icmp eq <8 x i32> %trigger, zeroinitializer
	%res = call <8 x float> @llvm.masked.load.v8f32.p0v8f32(<8 x float>* %addr, i32 32, <8 x i1>%mask, <8 x float>%dst)			%res = call <8 x float> @llvm.masked.load.v8f32.p0v8f32(<8 x float>* %addr, i32 32, <8 x i1>%mask, <8 x float>%dst)
	ret <8 x float> %res			ret <8 x float> %res
	}			}
	; AVX512F-NEXT: vmovdqa %ymm1, %ymm0			; AVX512F-NEXT: vmovdqa %ymm1, %ymm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test11b:			; SKX-LABEL: test11b:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsllw $15, %xmm0, %xmm0			; SKX-NEXT: vpsllw $15, %xmm0, %xmm0
	; SKX-NEXT: vpmovw2m %xmm0, %k1			; SKX-NEXT: vpmovw2m %xmm0, %k1
	; SKX-NEXT: vmovdqu32 (%rdi), %ymm1 {%k1}			; SKX-NEXT: vmovdqu32 (%rdi), %ymm1 {%k1}
	; SKX-NEXT: vmovdqa64 %ymm1, %ymm0			; SKX-NEXT: vmovdqa %ymm1, %ymm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%res = call <8 x i32> @llvm.masked.load.v8i32.p0v8i32(<8 x i32>* %addr, i32 4, <8 x i1>%mask, <8 x i32>%dst)			%res = call <8 x i32> @llvm.masked.load.v8i32.p0v8i32(<8 x i32>* %addr, i32 4, <8 x i1>%mask, <8 x i32>%dst)
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}

	define <8 x float> @test11c(<8 x i1> %mask, <8 x float>* %addr) {			define <8 x float> @test11c(<8 x i1> %mask, <8 x float>* %addr) {
	; AVX1-LABEL: test11c:			; AVX1-LABEL: test11c:
	; AVX1: ## BB#0:			; AVX1: ## BB#0:
	; AVX512F-NEXT: vpcmpeqd %zmm2, %zmm0, %k0			; AVX512F-NEXT: vpcmpeqd %zmm2, %zmm0, %k0
	; AVX512F-NEXT: kshiftlw $8, %k0, %k0			; AVX512F-NEXT: kshiftlw $8, %k0, %k0
	; AVX512F-NEXT: kshiftrw $8, %k0, %k1			; AVX512F-NEXT: kshiftrw $8, %k0, %k1
	; AVX512F-NEXT: vmovdqu32 %zmm1, (%rdi) {%k1}			; AVX512F-NEXT: vmovdqu32 %zmm1, (%rdi) {%k1}
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test12:			; SKX-LABEL: test12:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %ymm2, %ymm2, %ymm2			; SKX-NEXT: vpxor %ymm2, %ymm2, %ymm2
	; SKX-NEXT: vpcmpeqd %ymm2, %ymm0, %k1			; SKX-NEXT: vpcmpeqd %ymm2, %ymm0, %k1
	; SKX-NEXT: vmovdqu32 %ymm1, (%rdi) {%k1}			; SKX-NEXT: vmovdqu32 %ymm1, (%rdi) {%k1}
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <8 x i32> %trigger, zeroinitializer			%mask = icmp eq <8 x i32> %trigger, zeroinitializer
	call void @llvm.masked.store.v8i32.p0v8i32(<8 x i32>%val, <8 x i32>* %addr, i32 4, <8 x i1>%mask)			call void @llvm.masked.store.v8i32.p0v8i32(<8 x i32>%val, <8 x i32>* %addr, i32 4, <8 x i1>%mask)
	ret void			ret void
	}			}

	; AVX512F-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]			; AVX512F-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]
	; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0			; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0
	; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero			; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
	; AVX512F-NEXT: vmaskmovps %xmm1, %xmm0, (%rdi)			; AVX512F-NEXT: vmaskmovps %xmm1, %xmm0, (%rdi)
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test14:			; SKX-LABEL: test14:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2			; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]			; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]
	; SKX-NEXT: vpcmpeqq %xmm2, %xmm0, %k0			; SKX-NEXT: vpcmpeqq %xmm2, %xmm0, %k0
	; SKX-NEXT: kshiftlw $14, %k0, %k0			; SKX-NEXT: kshiftlw $14, %k0, %k0
	; SKX-NEXT: kshiftrw $14, %k0, %k1			; SKX-NEXT: kshiftrw $14, %k0, %k1
	; SKX-NEXT: vmovups %xmm1, (%rdi) {%k1}			; SKX-NEXT: vmovups %xmm1, (%rdi) {%k1}
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <2 x i32> %trigger, zeroinitializer			%mask = icmp eq <2 x i32> %trigger, zeroinitializer
	call void @llvm.masked.store.v2f32.p0v2f32(<2 x float>%val, <2 x float>* %addr, i32 4, <2 x i1>%mask)			call void @llvm.masked.store.v2f32.p0v2f32(<2 x float>%val, <2 x float>* %addr, i32 4, <2 x i1>%mask)
	; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0			; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0
	; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero			; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
	; AVX512F-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[0,2,2,3]			; AVX512F-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[0,2,2,3]
	; AVX512F-NEXT: vpmaskmovd %xmm1, %xmm0, (%rdi)			; AVX512F-NEXT: vpmaskmovd %xmm1, %xmm0, (%rdi)
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test15:			; SKX-LABEL: test15:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2			; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]			; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]
	; SKX-NEXT: vpcmpeqq %xmm2, %xmm0, %k1			; SKX-NEXT: vpcmpeqq %xmm2, %xmm0, %k1
	; SKX-NEXT: vpmovqd %xmm1, (%rdi) {%k1}			; SKX-NEXT: vpmovqd %xmm1, (%rdi) {%k1}
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <2 x i32> %trigger, zeroinitializer			%mask = icmp eq <2 x i32> %trigger, zeroinitializer
	call void @llvm.masked.store.v2i32.p0v2i32(<2 x i32>%val, <2 x i32>* %addr, i32 4, <2 x i1>%mask)			call void @llvm.masked.store.v2i32.p0v2i32(<2 x i32>%val, <2 x i32>* %addr, i32 4, <2 x i1>%mask)
	ret void			ret void
	}			}
	; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0			; AVX512F-NEXT: vpcmpeqq %xmm2, %xmm0, %xmm0
	; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero			; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
	; AVX512F-NEXT: vmaskmovps (%rdi), %xmm0, %xmm2			; AVX512F-NEXT: vmaskmovps (%rdi), %xmm0, %xmm2
	; AVX512F-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0			; AVX512F-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test16:			; SKX-LABEL: test16:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2			; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]			; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]
	; SKX-NEXT: vpcmpeqq %xmm2, %xmm0, %k0			; SKX-NEXT: vpcmpeqq %xmm2, %xmm0, %k0
	; SKX-NEXT: kshiftlw $14, %k0, %k0			; SKX-NEXT: kshiftlw $14, %k0, %k0
	; SKX-NEXT: kshiftrw $14, %k0, %k1			; SKX-NEXT: kshiftrw $14, %k0, %k1
	; SKX-NEXT: vmovups (%rdi), %xmm1 {%k1}			; SKX-NEXT: vmovups (%rdi), %xmm1 {%k1}
	; SKX-NEXT: vmovaps %xmm1, %xmm0			; SKX-NEXT: vmovaps %xmm1, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <2 x i32> %trigger, zeroinitializer			%mask = icmp eq <2 x i32> %trigger, zeroinitializer
	; AVX512F-NEXT: vpmaskmovd (%rdi), %xmm0, %xmm2			; AVX512F-NEXT: vpmaskmovd (%rdi), %xmm0, %xmm2
	; AVX512F-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[0,2,2,3]			; AVX512F-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[0,2,2,3]
	; AVX512F-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0			; AVX512F-NEXT: vblendvps %xmm0, %xmm2, %xmm1, %xmm0
	; AVX512F-NEXT: vpmovsxdq %xmm0, %xmm0			; AVX512F-NEXT: vpmovsxdq %xmm0, %xmm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test17:			; SKX-LABEL: test17:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm2, %xmm2, %xmm2			; SKX-NEXT: vpxor %xmm2, %xmm2, %xmm2
	; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]			; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm2[1],xmm0[2],xmm2[3]
	; SKX-NEXT: vpcmpeqq %xmm2, %xmm0, %k0			; SKX-NEXT: vpcmpeqq %xmm2, %xmm0, %k0
	; SKX-NEXT: kshiftlw $14, %k0, %k0			; SKX-NEXT: kshiftlw $14, %k0, %k0
	; SKX-NEXT: kshiftrw $14, %k0, %k1			; SKX-NEXT: kshiftrw $14, %k0, %k1
	; SKX-NEXT: vpshufd {{.*#+}} xmm0 = xmm1[0,2,2,3]			; SKX-NEXT: vpshufd {{.*#+}} xmm0 = xmm1[0,2,2,3]
	; SKX-NEXT: vmovdqu32 (%rdi), %xmm0 {%k1}			; SKX-NEXT: vmovdqu32 (%rdi), %xmm0 {%k1}
	; SKX-NEXT: vpmovsxdq %xmm0, %xmm0			; SKX-NEXT: vpmovsxdq %xmm0, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	; AVX512F-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]			; AVX512F-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]
	; AVX512F-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0			; AVX512F-NEXT: vpcmpeqq %xmm1, %xmm0, %xmm0
	; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero			; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
	; AVX512F-NEXT: vmaskmovps (%rdi), %xmm0, %xmm0			; AVX512F-NEXT: vmaskmovps (%rdi), %xmm0, %xmm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; SKX-LABEL: test18:			; SKX-LABEL: test18:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpxord %xmm1, %xmm1, %xmm1			; SKX-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]			; SKX-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3]
	; SKX-NEXT: vpcmpeqq %xmm1, %xmm0, %k0			; SKX-NEXT: vpcmpeqq %xmm1, %xmm0, %k0
	; SKX-NEXT: kshiftlw $14, %k0, %k0			; SKX-NEXT: kshiftlw $14, %k0, %k0
	; SKX-NEXT: kshiftrw $14, %k0, %k1			; SKX-NEXT: kshiftrw $14, %k0, %k1
	; SKX-NEXT: vmovups (%rdi), %xmm0 {%k1} {z}			; SKX-NEXT: vmovups (%rdi), %xmm0 {%k1} {z}
	; SKX-NEXT: retq			; SKX-NEXT: retq
	%mask = icmp eq <2 x i32> %trigger, zeroinitializer			%mask = icmp eq <2 x i32> %trigger, zeroinitializer
	%res = call <2 x float> @llvm.masked.load.v2f32.p0v2f32(<2 x float>* %addr, i32 4, <2 x i1>%mask, <2 x float>undef)			%res = call <2 x float> @llvm.masked.load.v2f32.p0v2f32(<2 x float>* %addr, i32 4, <2 x i1>%mask, <2 x float>undef)
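
Note on the pattern in the updated checks: unmasked moves and xors on low-numbered xmm/ymm registers lose their EVEX-only suffix (vmovdqa64 becomes vmovdqa, vpxord becomes vpxor), while instructions carrying a {%k} mask or touching zmm registers keep their spelling. The following Python sketch mimics that decision on AT&T-syntax text, purely as an illustration. It is not the pass from this patch: the function name `compress` and the two-entry table are hypothetical, and the real pass works on machine opcodes rather than on strings.

```python
import re

# Hypothetical subset of an EVEX->VEX table, limited to the two renames
# that appear in the test updates on this page.
EVEX_TO_VEX = {"vmovdqa64": "vmovdqa", "vpxord": "vpxor"}

# Matches %xmmN, %ymmN, %zmmN and captures the width letter and the index.
_VEC_REG = re.compile(r"%([xyz])mm(\d+)")


def compress(line):
    """Return the VEX spelling of an AT&T-syntax instruction when the
    conditions from the patch summary hold; otherwise return it unchanged."""
    parts = line.split(None, 1)
    if len(parts) != 2 or parts[0] not in EVEX_TO_VEX:
        return line  # mnemonic not in the (illustrative) table
    mnemonic, operands = parts
    if "%k" in operands:
        return line  # a mask register is used, EVEX is required
    for width, index in _VEC_REG.findall(operands):
        if width == "z" or int(index) > 15:
            return line  # zmm or high-index register, EVEX is required
    return EVEX_TO_VEX[mnemonic] + " " + operands
```

For instance, `compress("vmovdqa64 %xmm1, %xmm0")` yields the shorter spelling, whereas a masked `vmovdqu32 (%rdi), %xmm1 {%k1}` is returned as-is, which matches the SKX lines that are identical on both sides of the diff.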

llvm/trunk/test/CodeGen/X86/nontemporal-2.ll

	; AVX-LABEL: test_zero_v4f32:			; AVX-LABEL: test_zero_v4f32:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0			; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; AVX-NEXT: vmovntps %xmm0, (%rdi)			; AVX-NEXT: vmovntps %xmm0, (%rdi)
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; VLX-LABEL: test_zero_v4f32:			; VLX-LABEL: test_zero_v4f32:
	; VLX: # BB#0:			; VLX: # BB#0:
	; VLX-NEXT: vpxord %xmm0, %xmm0, %xmm0			; VLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; VLX-NEXT: vmovntdq %xmm0, (%rdi)			; VLX-NEXT: vmovntdq %xmm0, (%rdi)
	; VLX-NEXT: retq			; VLX-NEXT: retq
	store <4 x float> zeroinitializer, <4 x float>* %dst, align 16, !nontemporal !1			store <4 x float> zeroinitializer, <4 x float>* %dst, align 16, !nontemporal !1
	ret void			ret void
	}			}

	define void @test_zero_v4i32(<4 x i32>* %dst) {			define void @test_zero_v4i32(<4 x i32>* %dst) {
	; SSE-LABEL: test_zero_v4i32:			; SSE-LABEL: test_zero_v4i32:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: xorps %xmm0, %xmm0			; SSE-NEXT: xorps %xmm0, %xmm0
	; SSE-NEXT: movntps %xmm0, (%rdi)			; SSE-NEXT: movntps %xmm0, (%rdi)
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_zero_v4i32:			; AVX-LABEL: test_zero_v4i32:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0			; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; AVX-NEXT: vmovntps %xmm0, (%rdi)			; AVX-NEXT: vmovntps %xmm0, (%rdi)
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; VLX-LABEL: test_zero_v4i32:			; VLX-LABEL: test_zero_v4i32:
	; VLX: # BB#0:			; VLX: # BB#0:
	; VLX-NEXT: vpxord %xmm0, %xmm0, %xmm0			; VLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; VLX-NEXT: vmovntdq %xmm0, (%rdi)			; VLX-NEXT: vmovntdq %xmm0, (%rdi)
	; VLX-NEXT: retq			; VLX-NEXT: retq
	store <4 x i32> zeroinitializer, <4 x i32>* %dst, align 16, !nontemporal !1			store <4 x i32> zeroinitializer, <4 x i32>* %dst, align 16, !nontemporal !1
	store <4 x i32> zeroinitializer, <4 x i32>* %dst, align 16, !nontemporal !1			store <4 x i32> zeroinitializer, <4 x i32>* %dst, align 16, !nontemporal !1
	ret void			ret void
	}			}

	define void @test_zero_v2f64(<2 x double>* %dst) {			define void @test_zero_v2f64(<2 x double>* %dst) {
	; SSE-LABEL: test_zero_v2f64:			; SSE-LABEL: test_zero_v2f64:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: xorps %xmm0, %xmm0			; SSE-NEXT: xorps %xmm0, %xmm0
	; SSE-NEXT: movntps %xmm0, (%rdi)			; SSE-NEXT: movntps %xmm0, (%rdi)
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_zero_v2f64:			; AVX-LABEL: test_zero_v2f64:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0			; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; AVX-NEXT: vmovntps %xmm0, (%rdi)			; AVX-NEXT: vmovntps %xmm0, (%rdi)
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; VLX-LABEL: test_zero_v2f64:			; VLX-LABEL: test_zero_v2f64:
	; VLX: # BB#0:			; VLX: # BB#0:
	; VLX-NEXT: vpxord %xmm0, %xmm0, %xmm0			; VLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; VLX-NEXT: vmovntdq %xmm0, (%rdi)			; VLX-NEXT: vmovntdq %xmm0, (%rdi)
	; VLX-NEXT: retq			; VLX-NEXT: retq
	store <2 x double> zeroinitializer, <2 x double>* %dst, align 16, !nontemporal !1			store <2 x double> zeroinitializer, <2 x double>* %dst, align 16, !nontemporal !1
	ret void			ret void
	}			}

	define void @test_zero_v2i64(<2 x i64>* %dst) {			define void @test_zero_v2i64(<2 x i64>* %dst) {
	; SSE-LABEL: test_zero_v2i64:			; SSE-LABEL: test_zero_v2i64:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: xorps %xmm0, %xmm0			; SSE-NEXT: xorps %xmm0, %xmm0
	; SSE-NEXT: movntps %xmm0, (%rdi)			; SSE-NEXT: movntps %xmm0, (%rdi)
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_zero_v2i64:			; AVX-LABEL: test_zero_v2i64:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0			; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; AVX-NEXT: vmovntps %xmm0, (%rdi)			; AVX-NEXT: vmovntps %xmm0, (%rdi)
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; VLX-LABEL: test_zero_v2i64:			; VLX-LABEL: test_zero_v2i64:
	; VLX: # BB#0:			; VLX: # BB#0:
	; VLX-NEXT: vpxord %xmm0, %xmm0, %xmm0			; VLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; VLX-NEXT: vmovntdq %xmm0, (%rdi)			; VLX-NEXT: vmovntdq %xmm0, (%rdi)
	; VLX-NEXT: retq			; VLX-NEXT: retq
	store <2 x i64> zeroinitializer, <2 x i64>* %dst, align 16, !nontemporal !1			store <2 x i64> zeroinitializer, <2 x i64>* %dst, align 16, !nontemporal !1
	ret void			ret void
	}			}

	define void @test_zero_v8i16(<8 x i16>* %dst) {			define void @test_zero_v8i16(<8 x i16>* %dst) {
	; SSE-LABEL: test_zero_v8i16:			; SSE-LABEL: test_zero_v8i16:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: xorps %xmm0, %xmm0			; SSE-NEXT: xorps %xmm0, %xmm0
	; SSE-NEXT: movntps %xmm0, (%rdi)			; SSE-NEXT: movntps %xmm0, (%rdi)
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_zero_v8i16:			; AVX-LABEL: test_zero_v8i16:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0			; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; AVX-NEXT: vmovntps %xmm0, (%rdi)			; AVX-NEXT: vmovntps %xmm0, (%rdi)
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; VLX-LABEL: test_zero_v8i16:			; VLX-LABEL: test_zero_v8i16:
	; VLX: # BB#0:			; VLX: # BB#0:
	; VLX-NEXT: vpxord %xmm0, %xmm0, %xmm0			; VLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; VLX-NEXT: vmovntdq %xmm0, (%rdi)			; VLX-NEXT: vmovntdq %xmm0, (%rdi)
	; VLX-NEXT: retq			; VLX-NEXT: retq
	store <8 x i16> zeroinitializer, <8 x i16>* %dst, align 16, !nontemporal !1			store <8 x i16> zeroinitializer, <8 x i16>* %dst, align 16, !nontemporal !1
	ret void			ret void
	}			}

	define void @test_zero_v16i8(<16 x i8>* %dst) {			define void @test_zero_v16i8(<16 x i8>* %dst) {
	; SSE-LABEL: test_zero_v16i8:			; SSE-LABEL: test_zero_v16i8:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: xorps %xmm0, %xmm0			; SSE-NEXT: xorps %xmm0, %xmm0
	; SSE-NEXT: movntps %xmm0, (%rdi)			; SSE-NEXT: movntps %xmm0, (%rdi)
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_zero_v16i8:			; AVX-LABEL: test_zero_v16i8:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0			; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; AVX-NEXT: vmovntps %xmm0, (%rdi)			; AVX-NEXT: vmovntps %xmm0, (%rdi)
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; VLX-LABEL: test_zero_v16i8:			; VLX-LABEL: test_zero_v16i8:
	; VLX: # BB#0:			; VLX: # BB#0:
	; VLX-NEXT: vpxord %xmm0, %xmm0, %xmm0			; VLX-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; VLX-NEXT: vmovntdq %xmm0, (%rdi)			; VLX-NEXT: vmovntdq %xmm0, (%rdi)
	; VLX-NEXT: retq			; VLX-NEXT: retq
	store <16 x i8> zeroinitializer, <16 x i8>* %dst, align 16, !nontemporal !1			store <16 x i8> zeroinitializer, <16 x i8>* %dst, align 16, !nontemporal !1
	ret void			ret void
	}			}

	; And now YMM versions.			; And now YMM versions.

	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %ymm0, %ymm0, %ymm0			; AVX-NEXT: vxorps %ymm0, %ymm0, %ymm0
	; AVX-NEXT: vmovntps %ymm0, (%rdi)			; AVX-NEXT: vmovntps %ymm0, (%rdi)
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; VLX-LABEL: test_zero_v8f32:			; VLX-LABEL: test_zero_v8f32:
	; VLX: # BB#0:			; VLX: # BB#0:
	; VLX-NEXT: vpxord %ymm0, %ymm0, %ymm0			; VLX-NEXT: vpxor %ymm0, %ymm0, %ymm0
	; VLX-NEXT: vmovntdq %ymm0, (%rdi)			; VLX-NEXT: vmovntdq %ymm0, (%rdi)
	; VLX-NEXT: retq			; VLX-NEXT: retq
	store <8 x float> zeroinitializer, <8 x float>* %dst, align 32, !nontemporal !1			store <8 x float> zeroinitializer, <8 x float>* %dst, align 32, !nontemporal !1
	ret void			ret void
	}			}

	define void @test_zero_v8i32(<8 x i32>* %dst) {			define void @test_zero_v8i32(<8 x i32>* %dst) {
	; SSE-LABEL: test_zero_v8i32:			; SSE-LABEL: test_zero_v8i32:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: xorps %xmm0, %xmm0			; SSE-NEXT: xorps %xmm0, %xmm0
	; SSE-NEXT: movntps %xmm0, 16(%rdi)			; SSE-NEXT: movntps %xmm0, 16(%rdi)
	; SSE-NEXT: movntps %xmm0, (%rdi)			; SSE-NEXT: movntps %xmm0, (%rdi)
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_zero_v8i32:			; AVX-LABEL: test_zero_v8i32:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %ymm0, %ymm0, %ymm0			; AVX-NEXT: vxorps %ymm0, %ymm0, %ymm0
	; AVX-NEXT: vmovntps %ymm0, (%rdi)			; AVX-NEXT: vmovntps %ymm0, (%rdi)
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; VLX-LABEL: test_zero_v8i32:			; VLX-LABEL: test_zero_v8i32:
	; VLX: # BB#0:			; VLX: # BB#0:
	; VLX-NEXT: vpxord %ymm0, %ymm0, %ymm0			; VLX-NEXT: vpxor %ymm0, %ymm0, %ymm0
	; VLX-NEXT: vmovntdq %ymm0, (%rdi)			; VLX-NEXT: vmovntdq %ymm0, (%rdi)
	; VLX-NEXT: retq			; VLX-NEXT: retq
	store <8 x i32> zeroinitializer, <8 x i32>* %dst, align 32, !nontemporal !1			store <8 x i32> zeroinitializer, <8 x i32>* %dst, align 32, !nontemporal !1
	ret void			ret void
	}			}

	define void @test_zero_v4f64(<4 x double>* %dst) {			define void @test_zero_v4f64(<4 x double>* %dst) {
	; SSE-LABEL: test_zero_v4f64:			; SSE-LABEL: test_zero_v4f64:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: xorps %xmm0, %xmm0			; SSE-NEXT: xorps %xmm0, %xmm0
	; SSE-NEXT: movntps %xmm0, 16(%rdi)			; SSE-NEXT: movntps %xmm0, 16(%rdi)
	; SSE-NEXT: movntps %xmm0, (%rdi)			; SSE-NEXT: movntps %xmm0, (%rdi)
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_zero_v4f64:			; AVX-LABEL: test_zero_v4f64:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %ymm0, %ymm0, %ymm0			; AVX-NEXT: vxorps %ymm0, %ymm0, %ymm0
	; AVX-NEXT: vmovntps %ymm0, (%rdi)			; AVX-NEXT: vmovntps %ymm0, (%rdi)
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; VLX-LABEL: test_zero_v4f64:			; VLX-LABEL: test_zero_v4f64:
	; VLX: # BB#0:			; VLX: # BB#0:
	; VLX-NEXT: vpxord %ymm0, %ymm0, %ymm0			; VLX-NEXT: vpxor %ymm0, %ymm0, %ymm0
	; VLX-NEXT: vmovntdq %ymm0, (%rdi)			; VLX-NEXT: vmovntdq %ymm0, (%rdi)
	; VLX-NEXT: retq			; VLX-NEXT: retq
	store <4 x double> zeroinitializer, <4 x double>* %dst, align 32, !nontemporal !1			store <4 x double> zeroinitializer, <4 x double>* %dst, align 32, !nontemporal !1
	ret void			ret void
	}			}

	define void @test_zero_v4i64(<4 x i64>* %dst) {			define void @test_zero_v4i64(<4 x i64>* %dst) {
	; SSE-LABEL: test_zero_v4i64:			; SSE-LABEL: test_zero_v4i64:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: xorps %xmm0, %xmm0			; SSE-NEXT: xorps %xmm0, %xmm0
	; SSE-NEXT: movntps %xmm0, 16(%rdi)			; SSE-NEXT: movntps %xmm0, 16(%rdi)
	; SSE-NEXT: movntps %xmm0, (%rdi)			; SSE-NEXT: movntps %xmm0, (%rdi)
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_zero_v4i64:			; AVX-LABEL: test_zero_v4i64:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %ymm0, %ymm0, %ymm0			; AVX-NEXT: vxorps %ymm0, %ymm0, %ymm0
	; AVX-NEXT: vmovntps %ymm0, (%rdi)			; AVX-NEXT: vmovntps %ymm0, (%rdi)
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; VLX-LABEL: test_zero_v4i64:			; VLX-LABEL: test_zero_v4i64:
	; VLX: # BB#0:			; VLX: # BB#0:
	; VLX-NEXT: vpxord %ymm0, %ymm0, %ymm0			; VLX-NEXT: vpxor %ymm0, %ymm0, %ymm0
	; VLX-NEXT: vmovntdq %ymm0, (%rdi)			; VLX-NEXT: vmovntdq %ymm0, (%rdi)
	; VLX-NEXT: retq			; VLX-NEXT: retq
	store <4 x i64> zeroinitializer, <4 x i64>* %dst, align 32, !nontemporal !1			store <4 x i64> zeroinitializer, <4 x i64>* %dst, align 32, !nontemporal !1
	ret void			ret void
	}			}

	define void @test_zero_v16i16(<16 x i16>* %dst) {			define void @test_zero_v16i16(<16 x i16>* %dst) {
	; SSE-LABEL: test_zero_v16i16:			; SSE-LABEL: test_zero_v16i16:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: xorps %xmm0, %xmm0			; SSE-NEXT: xorps %xmm0, %xmm0
	; SSE-NEXT: movntps %xmm0, 16(%rdi)			; SSE-NEXT: movntps %xmm0, 16(%rdi)
	; SSE-NEXT: movntps %xmm0, (%rdi)			; SSE-NEXT: movntps %xmm0, (%rdi)
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_zero_v16i16:			; AVX-LABEL: test_zero_v16i16:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %ymm0, %ymm0, %ymm0			; AVX-NEXT: vxorps %ymm0, %ymm0, %ymm0
	; AVX-NEXT: vmovntps %ymm0, (%rdi)			; AVX-NEXT: vmovntps %ymm0, (%rdi)
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; VLX-LABEL: test_zero_v16i16:			; VLX-LABEL: test_zero_v16i16:
	; VLX: # BB#0:			; VLX: # BB#0:
	; VLX-NEXT: vpxord %ymm0, %ymm0, %ymm0			; VLX-NEXT: vpxor %ymm0, %ymm0, %ymm0
	; VLX-NEXT: vmovntdq %ymm0, (%rdi)			; VLX-NEXT: vmovntdq %ymm0, (%rdi)
	; VLX-NEXT: retq			; VLX-NEXT: retq
	store <16 x i16> zeroinitializer, <16 x i16>* %dst, align 32, !nontemporal !1			store <16 x i16> zeroinitializer, <16 x i16>* %dst, align 32, !nontemporal !1
	ret void			ret void
	}			}

	define void @test_zero_v32i8(<32 x i8>* %dst) {			define void @test_zero_v32i8(<32 x i8>* %dst) {
	; SSE-LABEL: test_zero_v32i8:			; SSE-LABEL: test_zero_v32i8:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: xorps %xmm0, %xmm0			; SSE-NEXT: xorps %xmm0, %xmm0
	; SSE-NEXT: movntps %xmm0, 16(%rdi)			; SSE-NEXT: movntps %xmm0, 16(%rdi)
	; SSE-NEXT: movntps %xmm0, (%rdi)			; SSE-NEXT: movntps %xmm0, (%rdi)
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_zero_v32i8:			; AVX-LABEL: test_zero_v32i8:
	; AVX: # BB#0:			; AVX: # BB#0:
	; AVX-NEXT: vxorps %ymm0, %ymm0, %ymm0			; AVX-NEXT: vxorps %ymm0, %ymm0, %ymm0
	; AVX-NEXT: vmovntps %ymm0, (%rdi)			; AVX-NEXT: vmovntps %ymm0, (%rdi)
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; VLX-LABEL: test_zero_v32i8:			; VLX-LABEL: test_zero_v32i8:
	; VLX: # BB#0:			; VLX: # BB#0:
	; VLX-NEXT: vpxord %ymm0, %ymm0, %ymm0			; VLX-NEXT: vpxor %ymm0, %ymm0, %ymm0
	; VLX-NEXT: vmovntdq %ymm0, (%rdi)			; VLX-NEXT: vmovntdq %ymm0, (%rdi)
	; VLX-NEXT: retq			; VLX-NEXT: retq
	store <32 x i8> zeroinitializer, <32 x i8>* %dst, align 32, !nontemporal !1			store <32 x i8> zeroinitializer, <32 x i8>* %dst, align 32, !nontemporal !1
	ret void			ret void
	}			}


	; Check that we also handle arguments. Here the type survives longer.			; Check that we also handle arguments. Here the type survives longer.

llvm/trunk/test/CodeGen/X86/sse-intrinsics-x86.ll
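A rough way to read the SKX check-line changes in this file is to compare the length of the byte list in each `encoding: [...]` annotation before and after the patch. The sketch below is only an illustration: the helper names `encoding_bytes` and `bytes_saved` are hypothetical and not part of LLVM or the patch, and the two sample strings are copied from the `vcomiss` check lines in this diff.

```python
import re


def encoding_bytes(check_line: str) -> list[int]:
    """Extract the byte values from an 'encoding: [0x..,0x..]' annotation.

    Hypothetical helper for illustration; it is not an LLVM utility.
    """
    match = re.search(r"encoding:\s*\[([^\]]*)\]", check_line)
    if match is None:
        raise ValueError("no encoding annotation found")
    # int(..., 16) accepts the '0x' prefix and surrounding whitespace.
    return [int(token, 16) for token in match.group(1).split(",")]


def bytes_saved(old_line: str, new_line: str) -> int:
    """Number of bytes by which the new encoding is shorter than the old one."""
    return len(encoding_bytes(old_line)) - len(encoding_bytes(new_line))


# Left (before) and right (after) sides of one SKX check line from this diff.
before = "; SKX-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc1]"
after = ("; SKX-NEXT: vcomiss %xmm1, %xmm0 "
         "## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc1]")

print(bytes_saved(before, after))  # 6-byte EVEX form vs 4-byte VEX form
```

For this instruction the EVEX form starts with the 0x62 prefix byte and the compressed form starts with the two-byte VEX prefix 0xc5, which matches the roughly 2-byte saving described in the summary.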

	; AVX2-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; AVX2-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; AVX2-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; AVX2-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; AVX2-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; AVX2-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_comieq_ss:			; SKX-LABEL: test_x86_sse_comieq_ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc1]			; SKX-NEXT: vcomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc1]
	; SKX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; SKX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; SKX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; SKX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; SKX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; SKX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.comieq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.comieq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2f,0xc1]			; AVX2-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2f,0xc1]
	; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_comige_ss:			; SKX-LABEL: test_x86_sse_comige_ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc1]			; SKX-NEXT: vcomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc1]
	; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.comige.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.comige.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.comige.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.comige.ss(<4 x float>, <4 x float>) nounwind readnone


	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2f,0xc1]			; AVX2-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2f,0xc1]
	; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_comigt_ss:			; SKX-LABEL: test_x86_sse_comigt_ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc1]			; SKX-NEXT: vcomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc1]
	; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.comigt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.comigt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.comigt.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.comigt.ss(<4 x float>, <4 x float>) nounwind readnone


	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vcomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2f,0xc8]			; AVX2-NEXT: vcomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2f,0xc8]
	; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_comile_ss:			; SKX-LABEL: test_x86_sse_comile_ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vcomiss %xmm0, %xmm1 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc8]			; SKX-NEXT: vcomiss %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc8]
	; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.comile.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.comile.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.comile.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.comile.ss(<4 x float>, <4 x float>) nounwind readnone


	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vcomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2f,0xc8]			; AVX2-NEXT: vcomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2f,0xc8]
	; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_comilt_ss:			; SKX-LABEL: test_x86_sse_comilt_ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vcomiss %xmm0, %xmm1 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc8]			; SKX-NEXT: vcomiss %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc8]
	; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.comilt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.comilt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.comilt.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.comilt.ss(<4 x float>, <4 x float>) nounwind readnone


	; AVX2-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; AVX2-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; AVX2-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; AVX2-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; AVX2-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; AVX2-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_comineq_ss:			; SKX-LABEL: test_x86_sse_comineq_ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2f,0xc1]			; SKX-NEXT: vcomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2f,0xc1]
	; SKX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; SKX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; SKX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; SKX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; SKX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; SKX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.comineq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.comineq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: movl $7, %eax ## encoding: [0xb8,0x07,0x00,0x00,0x00]			; AVX2-NEXT: movl $7, %eax ## encoding: [0xb8,0x07,0x00,0x00,0x00]
	; AVX2-NEXT: vcvtsi2ssl %eax, %xmm0, %xmm0 ## encoding: [0xc5,0xfa,0x2a,0xc0]			; AVX2-NEXT: vcvtsi2ssl %eax, %xmm0, %xmm0 ## encoding: [0xc5,0xfa,0x2a,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_cvtsi2ss:			; SKX-LABEL: test_x86_sse_cvtsi2ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: movl $7, %eax ## encoding: [0xb8,0x07,0x00,0x00,0x00]			; SKX-NEXT: movl $7, %eax ## encoding: [0xb8,0x07,0x00,0x00,0x00]
	; SKX-NEXT: vcvtsi2ssl %eax, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7e,0x08,0x2a,0xc0]			; SKX-NEXT: vcvtsi2ssl %eax, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x2a,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.sse.cvtsi2ss(<4 x float> %a0, i32 7) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.sse.cvtsi2ss(<4 x float> %a0, i32 7) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	declare <4 x float> @llvm.x86.sse.cvtsi2ss(<4 x float>, i32) nounwind readnone			declare <4 x float> @llvm.x86.sse.cvtsi2ss(<4 x float>, i32) nounwind readnone


	define i32 @test_x86_sse_cvtss2si(<4 x float> %a0) {			define i32 @test_x86_sse_cvtss2si(<4 x float> %a0) {
	; SSE-LABEL: test_x86_sse_cvtss2si:			; SSE-LABEL: test_x86_sse_cvtss2si:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: cvtss2si %xmm0, %eax ## encoding: [0xf3,0x0f,0x2d,0xc0]			; SSE-NEXT: cvtss2si %xmm0, %eax ## encoding: [0xf3,0x0f,0x2d,0xc0]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse_cvtss2si:			; AVX2-LABEL: test_x86_sse_cvtss2si:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvtss2si %xmm0, %eax ## encoding: [0xc5,0xfa,0x2d,0xc0]			; AVX2-NEXT: vcvtss2si %xmm0, %eax ## encoding: [0xc5,0xfa,0x2d,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_cvtss2si:			; SKX-LABEL: test_x86_sse_cvtss2si:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvtss2si %xmm0, %eax ## encoding: [0x62,0xf1,0x7e,0x08,0x2d,0xc0]			; SKX-NEXT: vcvtss2si %xmm0, %eax ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x2d,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.cvtss2si(<4 x float> %a0) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.cvtss2si(<4 x float> %a0) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.cvtss2si(<4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.cvtss2si(<4 x float>) nounwind readnone


	define i32 @test_x86_sse_cvttss2si(<4 x float> %a0) {			define i32 @test_x86_sse_cvttss2si(<4 x float> %a0) {
	; SSE-LABEL: test_x86_sse_cvttss2si:			; SSE-LABEL: test_x86_sse_cvttss2si:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: cvttss2si %xmm0, %eax ## encoding: [0xf3,0x0f,0x2c,0xc0]			; SSE-NEXT: cvttss2si %xmm0, %eax ## encoding: [0xf3,0x0f,0x2c,0xc0]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse_cvttss2si:			; AVX2-LABEL: test_x86_sse_cvttss2si:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvttss2si %xmm0, %eax ## encoding: [0xc5,0xfa,0x2c,0xc0]			; AVX2-NEXT: vcvttss2si %xmm0, %eax ## encoding: [0xc5,0xfa,0x2c,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_cvttss2si:			; SKX-LABEL: test_x86_sse_cvttss2si:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvttss2si %xmm0, %eax ## encoding: [0x62,0xf1,0x7e,0x08,0x2c,0xc0]			; SKX-NEXT: vcvttss2si %xmm0, %eax ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x2c,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.cvttss2si(<4 x float> %a0) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.cvttss2si(<4 x float> %a0) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.cvttss2si(<4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.cvttss2si(<4 x float>) nounwind readnone


	define void @test_x86_sse_ldmxcsr(i8* %a0) {			define void @test_x86_sse_ldmxcsr(i8* %a0) {
	;			;
	; AVX2-LABEL: test_x86_sse_max_ps:			; AVX2-LABEL: test_x86_sse_max_ps:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vmaxps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x5f,0xc1]			; AVX2-NEXT: vmaxps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x5f,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_max_ps:			; SKX-LABEL: test_x86_sse_max_ps:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vmaxps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x5f,0xc1]			; SKX-NEXT: vmaxps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5f,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.sse.max.ps(<4 x float> %a0, <4 x float> %a1) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.sse.max.ps(<4 x float> %a0, <4 x float> %a1) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	declare <4 x float> @llvm.x86.sse.max.ps(<4 x float>, <4 x float>) nounwind readnone			declare <4 x float> @llvm.x86.sse.max.ps(<4 x float>, <4 x float>) nounwind readnone


	define <4 x float> @test_x86_sse_max_ss(<4 x float> %a0, <4 x float> %a1) {			define <4 x float> @test_x86_sse_max_ss(<4 x float> %a0, <4 x float> %a1) {
	;			;
	; AVX2-LABEL: test_x86_sse_min_ps:			; AVX2-LABEL: test_x86_sse_min_ps:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vminps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x5d,0xc1]			; AVX2-NEXT: vminps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x5d,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_min_ps:			; SKX-LABEL: test_x86_sse_min_ps:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vminps %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x5d,0xc1]			; SKX-NEXT: vminps %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5d,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.sse.min.ps(<4 x float> %a0, <4 x float> %a1) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.sse.min.ps(<4 x float> %a0, <4 x float> %a1) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	declare <4 x float> @llvm.x86.sse.min.ps(<4 x float>, <4 x float>) nounwind readnone			declare <4 x float> @llvm.x86.sse.min.ps(<4 x float>, <4 x float>) nounwind readnone


	define <4 x float> @test_x86_sse_min_ss(<4 x float> %a0, <4 x float> %a1) {			define <4 x float> @test_x86_sse_min_ss(<4 x float> %a0, <4 x float> %a1) {
	; AVX2-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; AVX2-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; AVX2-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; AVX2-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; AVX2-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; AVX2-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_ucomieq_ss:			; SKX-LABEL: test_x86_sse_ucomieq_ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2e,0xc1]			; SKX-NEXT: vucomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2e,0xc1]
	; SKX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; SKX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; SKX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; SKX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; SKX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; SKX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.ucomieq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.ucomieq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2e,0xc1]			; AVX2-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2e,0xc1]
	; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_ucomige_ss:			; SKX-LABEL: test_x86_sse_ucomige_ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2e,0xc1]			; SKX-NEXT: vucomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2e,0xc1]
	; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.ucomige.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.ucomige.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.ucomige.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.ucomige.ss(<4 x float>, <4 x float>) nounwind readnone


	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2e,0xc1]			; AVX2-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x2e,0xc1]
	; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_ucomigt_ss:			; SKX-LABEL: test_x86_sse_ucomigt_ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2e,0xc1]			; SKX-NEXT: vucomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2e,0xc1]
	; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.ucomigt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.ucomigt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.ucomigt.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.ucomigt.ss(<4 x float>, <4 x float>) nounwind readnone


	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vucomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2e,0xc8]			; AVX2-NEXT: vucomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2e,0xc8]
	; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_ucomile_ss:			; SKX-LABEL: test_x86_sse_ucomile_ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vucomiss %xmm0, %xmm1 ## encoding: [0x62,0xf1,0x7c,0x08,0x2e,0xc8]			; SKX-NEXT: vucomiss %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2e,0xc8]
	; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.ucomile.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.ucomile.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.ucomile.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.ucomile.ss(<4 x float>, <4 x float>) nounwind readnone


	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vucomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2e,0xc8]			; AVX2-NEXT: vucomiss %xmm0, %xmm1 ## encoding: [0xc5,0xf8,0x2e,0xc8]
	; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_ucomilt_ss:			; SKX-LABEL: test_x86_sse_ucomilt_ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vucomiss %xmm0, %xmm1 ## encoding: [0x62,0xf1,0x7c,0x08,0x2e,0xc8]			; SKX-NEXT: vucomiss %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2e,0xc8]
	; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.ucomilt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.ucomilt.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.ucomilt.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.ucomilt.ss(<4 x float>, <4 x float>) nounwind readnone


	Show All 13 Lines
	; AVX2-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; AVX2-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; AVX2-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; AVX2-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; AVX2-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; AVX2-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse_ucomineq_ss:			; SKX-LABEL: test_x86_sse_ucomineq_ss:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vucomiss %xmm1, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x2e,0xc1]			; SKX-NEXT: vucomiss %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x2e,0xc1]
	; SKX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; SKX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; SKX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; SKX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; SKX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; SKX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse.ucomineq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse.ucomineq.ss(<4 x float> %a0, <4 x float> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse.ucomineq.ss(<4 x float>, <4 x float>) nounwind readnone			declare i32 @llvm.x86.sse.ucomineq.ss(<4 x float>, <4 x float>) nounwind readnone
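
The SKX check lines in this file all change the same way: a 4-byte EVEX prefix starting with 0x62 becomes a 2-byte VEX prefix starting with 0xc5, and the opcode and ModRM bytes stay the same. Below is a minimal Python sketch for checking that saving against the byte lists printed by llc --show-mc-encoding. The helper names prefix_kind and bytes_saved are hypothetical and not part of LLVM. The sketch assumes the input is already known to be a vector instruction, because 0x62, 0xc4 and 0xc5 are also legacy opcodes in 32-bit mode.

```python
# Sketch only: classify the leading prefix byte of an encoding and compare
# the EVEX and VEX lengths. prefix_kind and bytes_saved are hypothetical
# helper names, not LLVM APIs.


def prefix_kind(encoding):
    """Return 'EVEX', 'VEX3', 'VEX2' or 'other' based on the first byte.

    Assumes the bytes belong to a vector instruction; 0x62, 0xC4 and 0xC5
    can also be legacy opcodes (BOUND, LES, LDS) in 32-bit mode.
    """
    first = encoding[0]
    if first == 0x62:
        return "EVEX"   # 4-byte prefix
    if first == 0xC4:
        return "VEX3"   # 3-byte prefix
    if first == 0xC5:
        return "VEX2"   # 2-byte prefix
    return "other"


def bytes_saved(evex, vex):
    """Difference in total instruction length between the two encodings."""
    return len(evex) - len(vex)


# Byte lists copied from the SKX check lines for vucomiss %xmm0, %xmm1.
evex = [0x62, 0xF1, 0x7C, 0x08, 0x2E, 0xC8]
vex = [0xC5, 0xF8, 0x2E, 0xC8]

print(prefix_kind(evex), prefix_kind(vex), bytes_saved(evex, vex))
# prints: EVEX VEX2 2
```

The 2-byte difference matches the "~2 bytes" saving quoted in the summary. Instructions that need the 3-byte VEX form would save only 1 byte.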

llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86.ll

	Show All 50 Lines
	; AVX2-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; AVX2-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; AVX2-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; AVX2-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; AVX2-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; AVX2-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_comieq_sd:			; SKX-LABEL: test_x86_sse2_comieq_sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2f,0xc1]			; SKX-NEXT: vcomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2f,0xc1]
	; SKX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; SKX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; SKX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; SKX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; SKX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; SKX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.comieq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.comieq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	Show All 13 Lines
	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2f,0xc1]			; AVX2-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2f,0xc1]
	; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_comige_sd:			; SKX-LABEL: test_x86_sse2_comige_sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2f,0xc1]			; SKX-NEXT: vcomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2f,0xc1]
	; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.comige.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.comige.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.comige.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.comige.sd(<2 x double>, <2 x double>) nounwind readnone


	Show All 10 Lines
	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2f,0xc1]			; AVX2-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2f,0xc1]
	; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_comigt_sd:			; SKX-LABEL: test_x86_sse2_comigt_sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2f,0xc1]			; SKX-NEXT: vcomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2f,0xc1]
	; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.comigt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.comigt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.comigt.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.comigt.sd(<2 x double>, <2 x double>) nounwind readnone


	Show All 10 Lines
	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vcomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2f,0xc8]			; AVX2-NEXT: vcomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2f,0xc8]
	; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_comile_sd:			; SKX-LABEL: test_x86_sse2_comile_sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vcomisd %xmm0, %xmm1 ## encoding: [0x62,0xf1,0xfd,0x08,0x2f,0xc8]			; SKX-NEXT: vcomisd %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2f,0xc8]
	; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.comile.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.comile.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.comile.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.comile.sd(<2 x double>, <2 x double>) nounwind readnone


	Show All 10 Lines
	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vcomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2f,0xc8]			; AVX2-NEXT: vcomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2f,0xc8]
	; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_comilt_sd:			; SKX-LABEL: test_x86_sse2_comilt_sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vcomisd %xmm0, %xmm1 ## encoding: [0x62,0xf1,0xfd,0x08,0x2f,0xc8]			; SKX-NEXT: vcomisd %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2f,0xc8]
	; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.comilt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.comilt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.comilt.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.comilt.sd(<2 x double>, <2 x double>) nounwind readnone


	Show All 13 Lines
	; AVX2-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; AVX2-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; AVX2-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; AVX2-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; AVX2-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; AVX2-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_comineq_sd:			; SKX-LABEL: test_x86_sse2_comineq_sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2f,0xc1]			; SKX-NEXT: vcomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2f,0xc1]
	; SKX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; SKX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; SKX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; SKX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; SKX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; SKX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.comineq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.comineq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.comineq.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.comineq.sd(<2 x double>, <2 x double>) nounwind readnone


	define <4 x float> @test_x86_sse2_cvtdq2ps(<4 x i32> %a0) {			define <4 x float> @test_x86_sse2_cvtdq2ps(<4 x i32> %a0) {
	; SSE-LABEL: test_x86_sse2_cvtdq2ps:			; SSE-LABEL: test_x86_sse2_cvtdq2ps:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: cvtdq2ps %xmm0, %xmm0 ## encoding: [0x0f,0x5b,0xc0]			; SSE-NEXT: cvtdq2ps %xmm0, %xmm0 ## encoding: [0x0f,0x5b,0xc0]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_cvtdq2ps:			; AVX2-LABEL: test_x86_sse2_cvtdq2ps:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvtdq2ps %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x5b,0xc0]			; AVX2-NEXT: vcvtdq2ps %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x5b,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_cvtdq2ps:			; SKX-LABEL: test_x86_sse2_cvtdq2ps:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvtdq2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x5b,0xc0]			; SKX-NEXT: vcvtdq2ps %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5b,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.sse2.cvtdq2ps(<4 x i32> %a0) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.sse2.cvtdq2ps(<4 x i32> %a0) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	declare <4 x float> @llvm.x86.sse2.cvtdq2ps(<4 x i32>) nounwind readnone			declare <4 x float> @llvm.x86.sse2.cvtdq2ps(<4 x i32>) nounwind readnone


	define <4 x i32> @test_x86_sse2_cvtpd2dq(<2 x double> %a0) {			define <4 x i32> @test_x86_sse2_cvtpd2dq(<2 x double> %a0) {
	; SSE-LABEL: test_x86_sse2_cvtpd2dq:			; SSE-LABEL: test_x86_sse2_cvtpd2dq:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: cvtpd2dq %xmm0, %xmm0 ## encoding: [0xf2,0x0f,0xe6,0xc0]			; SSE-NEXT: cvtpd2dq %xmm0, %xmm0 ## encoding: [0xf2,0x0f,0xe6,0xc0]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_cvtpd2dq:			; AVX2-LABEL: test_x86_sse2_cvtpd2dq:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvtpd2dq %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0xe6,0xc0]			; AVX2-NEXT: vcvtpd2dq %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0xe6,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_cvtpd2dq:			; SKX-LABEL: test_x86_sse2_cvtpd2dq:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvtpd2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x08,0xe6,0xc0]			; SKX-NEXT: vcvtpd2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0xe6,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double> %a0) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double> %a0) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double>) nounwind readnone


	define <2 x i64> @test_mm_cvtpd_epi32_zext(<2 x double> %a0) nounwind {			define <2 x i64> @test_mm_cvtpd_epi32_zext(<2 x double> %a0) nounwind {
	; SSE-LABEL: test_mm_cvtpd_epi32_zext:			; SSE-LABEL: test_mm_cvtpd_epi32_zext:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: cvtpd2dq %xmm0, %xmm0 ## encoding: [0xf2,0x0f,0xe6,0xc0]			; SSE-NEXT: cvtpd2dq %xmm0, %xmm0 ## encoding: [0xf2,0x0f,0xe6,0xc0]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_mm_cvtpd_epi32_zext:			; AVX2-LABEL: test_mm_cvtpd_epi32_zext:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvtpd2dq %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0xe6,0xc0]			; AVX2-NEXT: vcvtpd2dq %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0xe6,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_mm_cvtpd_epi32_zext:			; SKX-LABEL: test_mm_cvtpd_epi32_zext:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvtpd2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xff,0x08,0xe6,0xc0]			; SKX-NEXT: vcvtpd2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0xe6,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%cvt = call <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double> %a0)			%cvt = call <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double> %a0)
	%res = shufflevector <4 x i32> %cvt, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>			%res = shufflevector <4 x i32> %cvt, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
	%bc = bitcast <4 x i32> %res to <2 x i64>			%bc = bitcast <4 x i32> %res to <2 x i64>
	ret <2 x i64> %bc			ret <2 x i64> %bc
	}			}


	define <4 x float> @test_x86_sse2_cvtpd2ps(<2 x double> %a0) {			define <4 x float> @test_x86_sse2_cvtpd2ps(<2 x double> %a0) {
	; SSE-LABEL: test_x86_sse2_cvtpd2ps:			; SSE-LABEL: test_x86_sse2_cvtpd2ps:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: cvtpd2ps %xmm0, %xmm0 ## encoding: [0x66,0x0f,0x5a,0xc0]			; SSE-NEXT: cvtpd2ps %xmm0, %xmm0 ## encoding: [0x66,0x0f,0x5a,0xc0]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_cvtpd2ps:			; AVX2-LABEL: test_x86_sse2_cvtpd2ps:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvtpd2ps %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5a,0xc0]			; AVX2-NEXT: vcvtpd2ps %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5a,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_cvtpd2ps:			; SKX-LABEL: test_x86_sse2_cvtpd2ps:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvtpd2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x5a,0xc0]			; SKX-NEXT: vcvtpd2ps %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x5a,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x float> @llvm.x86.sse2.cvtpd2ps(<2 x double> %a0) ; <<4 x float>> [#uses=1]			%res = call <4 x float> @llvm.x86.sse2.cvtpd2ps(<2 x double> %a0) ; <<4 x float>> [#uses=1]
	ret <4 x float> %res			ret <4 x float> %res
	}			}
	declare <4 x float> @llvm.x86.sse2.cvtpd2ps(<2 x double>) nounwind readnone			declare <4 x float> @llvm.x86.sse2.cvtpd2ps(<2 x double>) nounwind readnone

	define <4 x float> @test_x86_sse2_cvtpd2ps_zext(<2 x double> %a0) nounwind {			define <4 x float> @test_x86_sse2_cvtpd2ps_zext(<2 x double> %a0) nounwind {
	; SSE-LABEL: test_x86_sse2_cvtpd2ps_zext:			; SSE-LABEL: test_x86_sse2_cvtpd2ps_zext:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: cvtpd2ps %xmm0, %xmm0 ## encoding: [0x66,0x0f,0x5a,0xc0]			; SSE-NEXT: cvtpd2ps %xmm0, %xmm0 ## encoding: [0x66,0x0f,0x5a,0xc0]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_cvtpd2ps_zext:			; AVX2-LABEL: test_x86_sse2_cvtpd2ps_zext:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvtpd2ps %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5a,0xc0]			; AVX2-NEXT: vcvtpd2ps %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5a,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_cvtpd2ps_zext:			; SKX-LABEL: test_x86_sse2_cvtpd2ps_zext:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvtpd2ps %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x5a,0xc0]			; SKX-NEXT: vcvtpd2ps %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x5a,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%cvt = call <4 x float> @llvm.x86.sse2.cvtpd2ps(<2 x double> %a0)			%cvt = call <4 x float> @llvm.x86.sse2.cvtpd2ps(<2 x double> %a0)
	%res = shufflevector <4 x float> %cvt, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>			%res = shufflevector <4 x float> %cvt, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
	ret <4 x float> %res			ret <4 x float> %res
	}			}

	define <4 x i32> @test_x86_sse2_cvtps2dq(<4 x float> %a0) {			define <4 x i32> @test_x86_sse2_cvtps2dq(<4 x float> %a0) {
	; SSE-LABEL: test_x86_sse2_cvtps2dq:			; SSE-LABEL: test_x86_sse2_cvtps2dq:
	Show All 19 Lines
	;			;
	; AVX2-LABEL: test_x86_sse2_cvtsd2si:			; AVX2-LABEL: test_x86_sse2_cvtsd2si:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvtsd2si %xmm0, %eax ## encoding: [0xc5,0xfb,0x2d,0xc0]			; AVX2-NEXT: vcvtsd2si %xmm0, %eax ## encoding: [0xc5,0xfb,0x2d,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_cvtsd2si:			; SKX-LABEL: test_x86_sse2_cvtsd2si:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvtsd2si %xmm0, %eax ## encoding: [0x62,0xf1,0x7f,0x08,0x2d,0xc0]			; SKX-NEXT: vcvtsd2si %xmm0, %eax ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x2d,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.cvtsd2si(<2 x double> %a0) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.cvtsd2si(<2 x double> %a0) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.cvtsd2si(<2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.cvtsd2si(<2 x double>) nounwind readnone


	define <4 x float> @test_x86_sse2_cvtsd2ss(<4 x float> %a0, <2 x double> %a1) {			define <4 x float> @test_x86_sse2_cvtsd2ss(<4 x float> %a0, <2 x double> %a1) {
	Show All 56 Lines
	;			;
	; AVX2-LABEL: test_x86_sse2_cvtsi2sd:			; AVX2-LABEL: test_x86_sse2_cvtsi2sd:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvtsi2sdl {{[0-9]+}}(%esp), %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0x2a,0x44,0x24,0x04]			; AVX2-NEXT: vcvtsi2sdl {{[0-9]+}}(%esp), %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0x2a,0x44,0x24,0x04]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_cvtsi2sd:			; SKX-LABEL: test_x86_sse2_cvtsi2sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvtsi2sdl {{[0-9]+}}(%esp), %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7f,0x08,0x2a,0x44,0x24,0x01]			; SKX-NEXT: vcvtsi2sdl {{[0-9]+}}(%esp), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x2a,0x44,0x24,0x04]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x double> @llvm.x86.sse2.cvtsi2sd(<2 x double> %a0, i32 %a1) ; <<2 x double>> [#uses=1]			%res = call <2 x double> @llvm.x86.sse2.cvtsi2sd(<2 x double> %a0, i32 %a1) ; <<2 x double>> [#uses=1]
	ret <2 x double> %res			ret <2 x double> %res
	}			}
	declare <2 x double> @llvm.x86.sse2.cvtsi2sd(<2 x double>, i32) nounwind readnone			declare <2 x double> @llvm.x86.sse2.cvtsi2sd(<2 x double>, i32) nounwind readnone


	define <2 x double> @test_x86_sse2_cvtss2sd(<2 x double> %a0, <4 x float> %a1) {			define <2 x double> @test_x86_sse2_cvtss2sd(<2 x double> %a0, <4 x float> %a1) {
	Show All 56 Lines
	;			;
	; AVX2-LABEL: test_x86_sse2_cvttpd2dq:			; AVX2-LABEL: test_x86_sse2_cvttpd2dq:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvttpd2dq %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe6,0xc0]			; AVX2-NEXT: vcvttpd2dq %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe6,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_cvttpd2dq:			; SKX-LABEL: test_x86_sse2_cvttpd2dq:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvttpd2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xe6,0xc0]			; SKX-NEXT: vcvttpd2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe6,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.cvttpd2dq(<2 x double> %a0) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.cvttpd2dq(<2 x double> %a0) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.cvttpd2dq(<2 x double>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.cvttpd2dq(<2 x double>) nounwind readnone


	define <2 x i64> @test_mm_cvttpd_epi32_zext(<2 x double> %a0) nounwind {			define <2 x i64> @test_mm_cvttpd_epi32_zext(<2 x double> %a0) nounwind {
	; SSE-LABEL: test_mm_cvttpd_epi32_zext:			; SSE-LABEL: test_mm_cvttpd_epi32_zext:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: cvttpd2dq %xmm0, %xmm0 ## encoding: [0x66,0x0f,0xe6,0xc0]			; SSE-NEXT: cvttpd2dq %xmm0, %xmm0 ## encoding: [0x66,0x0f,0xe6,0xc0]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_mm_cvttpd_epi32_zext:			; AVX2-LABEL: test_mm_cvttpd_epi32_zext:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvttpd2dq %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe6,0xc0]			; AVX2-NEXT: vcvttpd2dq %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe6,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_mm_cvttpd_epi32_zext:			; SKX-LABEL: test_mm_cvttpd_epi32_zext:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvttpd2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xe6,0xc0]			; SKX-NEXT: vcvttpd2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe6,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%cvt = call <4 x i32> @llvm.x86.sse2.cvttpd2dq(<2 x double> %a0)			%cvt = call <4 x i32> @llvm.x86.sse2.cvttpd2dq(<2 x double> %a0)
	%res = shufflevector <4 x i32> %cvt, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>			%res = shufflevector <4 x i32> %cvt, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
	%bc = bitcast <4 x i32> %res to <2 x i64>			%bc = bitcast <4 x i32> %res to <2 x i64>
	ret <2 x i64> %bc			ret <2 x i64> %bc
	}			}


	define <4 x i32> @test_x86_sse2_cvttps2dq(<4 x float> %a0) {			define <4 x i32> @test_x86_sse2_cvttps2dq(<4 x float> %a0) {
	; SSE-LABEL: test_x86_sse2_cvttps2dq:			; SSE-LABEL: test_x86_sse2_cvttps2dq:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: cvttps2dq %xmm0, %xmm0 ## encoding: [0xf3,0x0f,0x5b,0xc0]			; SSE-NEXT: cvttps2dq %xmm0, %xmm0 ## encoding: [0xf3,0x0f,0x5b,0xc0]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_cvttps2dq:			; AVX2-LABEL: test_x86_sse2_cvttps2dq:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvttps2dq %xmm0, %xmm0 ## encoding: [0xc5,0xfa,0x5b,0xc0]			; AVX2-NEXT: vcvttps2dq %xmm0, %xmm0 ## encoding: [0xc5,0xfa,0x5b,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_cvttps2dq:			; SKX-LABEL: test_x86_sse2_cvttps2dq:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvttps2dq %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7e,0x08,0x5b,0xc0]			; SKX-NEXT: vcvttps2dq %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x5b,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %a0) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %a0) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float>) nounwind readnone


	define i32 @test_x86_sse2_cvttsd2si(<2 x double> %a0) {			define i32 @test_x86_sse2_cvttsd2si(<2 x double> %a0) {
	; SSE-LABEL: test_x86_sse2_cvttsd2si:			; SSE-LABEL: test_x86_sse2_cvttsd2si:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: cvttsd2si %xmm0, %eax ## encoding: [0xf2,0x0f,0x2c,0xc0]			; SSE-NEXT: cvttsd2si %xmm0, %eax ## encoding: [0xf2,0x0f,0x2c,0xc0]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_cvttsd2si:			; AVX2-LABEL: test_x86_sse2_cvttsd2si:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vcvttsd2si %xmm0, %eax ## encoding: [0xc5,0xfb,0x2c,0xc0]			; AVX2-NEXT: vcvttsd2si %xmm0, %eax ## encoding: [0xc5,0xfb,0x2c,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_cvttsd2si:			; SKX-LABEL: test_x86_sse2_cvttsd2si:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vcvttsd2si %xmm0, %eax ## encoding: [0x62,0xf1,0x7f,0x08,0x2c,0xc0]			; SKX-NEXT: vcvttsd2si %xmm0, %eax ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x2c,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.cvttsd2si(<2 x double> %a0) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.cvttsd2si(<2 x double> %a0) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.cvttsd2si(<2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.cvttsd2si(<2 x double>) nounwind readnone


	define <2 x double> @test_x86_sse2_max_pd(<2 x double> %a0, <2 x double> %a1) {			define <2 x double> @test_x86_sse2_max_pd(<2 x double> %a0, <2 x double> %a1) {
	; SSE-LABEL: test_x86_sse2_max_pd:			; SSE-LABEL: test_x86_sse2_max_pd:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: maxpd %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x5f,0xc1]			; SSE-NEXT: maxpd %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x5f,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_max_pd:			; AVX2-LABEL: test_x86_sse2_max_pd:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vmaxpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5f,0xc1]			; AVX2-NEXT: vmaxpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5f,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_max_pd:			; SKX-LABEL: test_x86_sse2_max_pd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vmaxpd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x5f,0xc1]			; SKX-NEXT: vmaxpd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x5f,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x double> @llvm.x86.sse2.max.pd(<2 x double> %a0, <2 x double> %a1) ; <<2 x double>> [#uses=1]			%res = call <2 x double> @llvm.x86.sse2.max.pd(<2 x double> %a0, <2 x double> %a1) ; <<2 x double>> [#uses=1]
	ret <2 x double> %res			ret <2 x double> %res
	}			}
	declare <2 x double> @llvm.x86.sse2.max.pd(<2 x double>, <2 x double>) nounwind readnone			declare <2 x double> @llvm.x86.sse2.max.pd(<2 x double>, <2 x double>) nounwind readnone


	define <2 x double> @test_x86_sse2_max_sd(<2 x double> %a0, <2 x double> %a1) {			define <2 x double> @test_x86_sse2_max_sd(<2 x double> %a0, <2 x double> %a1) {
	Show All 20 Lines
	;			;
	; AVX2-LABEL: test_x86_sse2_min_pd:			; AVX2-LABEL: test_x86_sse2_min_pd:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vminpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5d,0xc1]			; AVX2-NEXT: vminpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x5d,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_min_pd:			; SKX-LABEL: test_x86_sse2_min_pd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vminpd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x5d,0xc1]			; SKX-NEXT: vminpd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x5d,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x double> @llvm.x86.sse2.min.pd(<2 x double> %a0, <2 x double> %a1) ; <<2 x double>> [#uses=1]			%res = call <2 x double> @llvm.x86.sse2.min.pd(<2 x double> %a0, <2 x double> %a1) ; <<2 x double>> [#uses=1]
	ret <2 x double> %res			ret <2 x double> %res
	}			}
	declare <2 x double> @llvm.x86.sse2.min.pd(<2 x double>, <2 x double>) nounwind readnone			declare <2 x double> @llvm.x86.sse2.min.pd(<2 x double>, <2 x double>) nounwind readnone


	define <2 x double> @test_x86_sse2_min_sd(<2 x double> %a0, <2 x double> %a1) {			define <2 x double> @test_x86_sse2_min_sd(<2 x double> %a0, <2 x double> %a1) {
	Show All 38 Lines
	;			;
	; AVX2-LABEL: test_x86_sse2_packssdw_128:			; AVX2-LABEL: test_x86_sse2_packssdw_128:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x6b,0xc1]			; AVX2-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x6b,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_packssdw_128:			; SKX-LABEL: test_x86_sse2_packssdw_128:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x6b,0xc1]			; SKX-NEXT: vpackssdw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x6b,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.packssdw.128(<4 x i32> %a0, <4 x i32> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.packssdw.128(<4 x i32> %a0, <4 x i32> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.packssdw.128(<4 x i32>, <4 x i32>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.packssdw.128(<4 x i32>, <4 x i32>) nounwind readnone


	define <16 x i8> @test_x86_sse2_packsswb_128(<8 x i16> %a0, <8 x i16> %a1) {			define <16 x i8> @test_x86_sse2_packsswb_128(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_packsswb_128:			; SSE-LABEL: test_x86_sse2_packsswb_128:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: packsswb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x63,0xc1]			; SSE-NEXT: packsswb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x63,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_packsswb_128:			; AVX2-LABEL: test_x86_sse2_packsswb_128:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x63,0xc1]			; AVX2-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x63,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_packsswb_128:			; SKX-LABEL: test_x86_sse2_packsswb_128:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x63,0xc1]			; SKX-NEXT: vpacksswb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x63,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.packsswb.128(<8 x i16> %a0, <8 x i16> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.packsswb.128(<8 x i16> %a0, <8 x i16> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.packsswb.128(<8 x i16>, <8 x i16>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.packsswb.128(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_packuswb_128(<8 x i16> %a0, <8 x i16> %a1) {			define <16 x i8> @test_x86_sse2_packuswb_128(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_packuswb_128:			; SSE-LABEL: test_x86_sse2_packuswb_128:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: packuswb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x67,0xc1]			; SSE-NEXT: packuswb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x67,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_packuswb_128:			; AVX2-LABEL: test_x86_sse2_packuswb_128:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x67,0xc1]			; AVX2-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x67,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_packuswb_128:			; SKX-LABEL: test_x86_sse2_packuswb_128:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x67,0xc1]			; SKX-NEXT: vpackuswb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x67,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.packuswb.128(<8 x i16> %a0, <8 x i16> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.packuswb.128(<8 x i16> %a0, <8 x i16> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.packuswb.128(<8 x i16>, <8 x i16>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.packuswb.128(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_padds_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_padds_b(<16 x i8> %a0, <16 x i8> %a1) {
	; SSE-LABEL: test_x86_sse2_padds_b:			; SSE-LABEL: test_x86_sse2_padds_b:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: paddsb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xec,0xc1]			; SSE-NEXT: paddsb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xec,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_padds_b:			; AVX2-LABEL: test_x86_sse2_padds_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpaddsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xec,0xc1]			; AVX2-NEXT: vpaddsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xec,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_padds_b:			; SKX-LABEL: test_x86_sse2_padds_b:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpaddsb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xec,0xc1]			; SKX-NEXT: vpaddsb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xec,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.padds.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.padds.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.padds.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.padds.b(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_sse2_padds_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_padds_w(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_padds_w:			; SSE-LABEL: test_x86_sse2_padds_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: paddsw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xed,0xc1]			; SSE-NEXT: paddsw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xed,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_padds_w:			; AVX2-LABEL: test_x86_sse2_padds_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpaddsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xed,0xc1]			; AVX2-NEXT: vpaddsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xed,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_padds_w:			; SKX-LABEL: test_x86_sse2_padds_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpaddsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xed,0xc1]			; SKX-NEXT: vpaddsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xed,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.padds.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.padds.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.padds.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.padds.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_paddus_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_paddus_b(<16 x i8> %a0, <16 x i8> %a1) {
	; SSE-LABEL: test_x86_sse2_paddus_b:			; SSE-LABEL: test_x86_sse2_paddus_b:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: paddusb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xdc,0xc1]			; SSE-NEXT: paddusb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xdc,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_paddus_b:			; AVX2-LABEL: test_x86_sse2_paddus_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpaddusb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xdc,0xc1]			; AVX2-NEXT: vpaddusb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xdc,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_paddus_b:			; SKX-LABEL: test_x86_sse2_paddus_b:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpaddusb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xdc,0xc1]			; SKX-NEXT: vpaddusb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xdc,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.paddus.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.paddus.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.paddus.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.paddus.b(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_sse2_paddus_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_paddus_w(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_paddus_w:			; SSE-LABEL: test_x86_sse2_paddus_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: paddusw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xdd,0xc1]			; SSE-NEXT: paddusw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xdd,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_paddus_w:			; AVX2-LABEL: test_x86_sse2_paddus_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpaddusw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xdd,0xc1]			; AVX2-NEXT: vpaddusw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xdd,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_paddus_w:			; SKX-LABEL: test_x86_sse2_paddus_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpaddusw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xdd,0xc1]			; SKX-NEXT: vpaddusw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xdd,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.paddus.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.paddus.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.paddus.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.paddus.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_pavg_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_pavg_b(<16 x i8> %a0, <16 x i8> %a1) {
	; SSE-LABEL: test_x86_sse2_pavg_b:			; SSE-LABEL: test_x86_sse2_pavg_b:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pavgb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe0,0xc1]			; SSE-NEXT: pavgb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe0,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_pavg_b:			; AVX2-LABEL: test_x86_sse2_pavg_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpavgb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe0,0xc1]			; AVX2-NEXT: vpavgb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe0,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pavg_b:			; SKX-LABEL: test_x86_sse2_pavg_b:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpavgb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe0,0xc1]			; SKX-NEXT: vpavgb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe0,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.pavg.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.pavg.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.pavg.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.pavg.b(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_sse2_pavg_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_pavg_w(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_pavg_w:			; SSE-LABEL: test_x86_sse2_pavg_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pavgw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe3,0xc1]			; SSE-NEXT: pavgw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe3,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_pavg_w:			; AVX2-LABEL: test_x86_sse2_pavg_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpavgw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe3,0xc1]			; AVX2-NEXT: vpavgw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe3,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pavg_w:			; SKX-LABEL: test_x86_sse2_pavg_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpavgw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe3,0xc1]			; SKX-NEXT: vpavgw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe3,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.pavg.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.pavg.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.pavg.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.pavg.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <4 x i32> @test_x86_sse2_pmadd_wd(<8 x i16> %a0, <8 x i16> %a1) {			define <4 x i32> @test_x86_sse2_pmadd_wd(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_pmadd_wd:			; SSE-LABEL: test_x86_sse2_pmadd_wd:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pmaddwd %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xf5,0xc1]			; SSE-NEXT: pmaddwd %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xf5,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_pmadd_wd:			; AVX2-LABEL: test_x86_sse2_pmadd_wd:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaddwd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf5,0xc1]			; AVX2-NEXT: vpmaddwd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf5,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pmadd_wd:			; SKX-LABEL: test_x86_sse2_pmadd_wd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmaddwd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xf5,0xc1]			; SKX-NEXT: vpmaddwd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf5,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a0, <8 x i16> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a0, <8 x i16> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16>, <8 x i16>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16>, <8 x i16>) nounwind readnone


	define <8 x i16> @test_x86_sse2_pmaxs_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_pmaxs_w(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_pmaxs_w:			; SSE-LABEL: test_x86_sse2_pmaxs_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pmaxsw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xee,0xc1]			; SSE-NEXT: pmaxsw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xee,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_pmaxs_w:			; AVX2-LABEL: test_x86_sse2_pmaxs_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xee,0xc1]			; AVX2-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xee,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pmaxs_w:			; SKX-LABEL: test_x86_sse2_pmaxs_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xee,0xc1]			; SKX-NEXT: vpmaxsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xee,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.pmaxs.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.pmaxs.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.pmaxs.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.pmaxs.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_pmaxu_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_pmaxu_b(<16 x i8> %a0, <16 x i8> %a1) {
	; SSE-LABEL: test_x86_sse2_pmaxu_b:			; SSE-LABEL: test_x86_sse2_pmaxu_b:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pmaxub %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xde,0xc1]			; SSE-NEXT: pmaxub %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xde,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_pmaxu_b:			; AVX2-LABEL: test_x86_sse2_pmaxu_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaxub %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xde,0xc1]			; AVX2-NEXT: vpmaxub %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xde,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pmaxu_b:			; SKX-LABEL: test_x86_sse2_pmaxu_b:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmaxub %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xde,0xc1]			; SKX-NEXT: vpmaxub %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xde,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.pmaxu.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.pmaxu.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.pmaxu.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.pmaxu.b(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_sse2_pmins_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_pmins_w(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_pmins_w:			; SSE-LABEL: test_x86_sse2_pmins_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pminsw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xea,0xc1]			; SSE-NEXT: pminsw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xea,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_pmins_w:			; AVX2-LABEL: test_x86_sse2_pmins_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpminsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xea,0xc1]			; AVX2-NEXT: vpminsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xea,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pmins_w:			; SKX-LABEL: test_x86_sse2_pmins_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpminsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xea,0xc1]			; SKX-NEXT: vpminsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xea,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.pmins.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.pmins.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.pmins.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.pmins.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_pminu_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_pminu_b(<16 x i8> %a0, <16 x i8> %a1) {
	; SSE-LABEL: test_x86_sse2_pminu_b:			; SSE-LABEL: test_x86_sse2_pminu_b:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pminub %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xda,0xc1]			; SSE-NEXT: pminub %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xda,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_pminu_b:			; AVX2-LABEL: test_x86_sse2_pminu_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpminub %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xda,0xc1]			; AVX2-NEXT: vpminub %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xda,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pminu_b:			; SKX-LABEL: test_x86_sse2_pminu_b:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpminub %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xda,0xc1]			; SKX-NEXT: vpminub %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xda,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.pminu.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.pminu.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.pminu.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.pminu.b(<16 x i8>, <16 x i8>) nounwind readnone


	define i32 @test_x86_sse2_pmovmskb_128(<16 x i8> %a0) {			define i32 @test_x86_sse2_pmovmskb_128(<16 x i8> %a0) {
	[20 lines not shown]
	;			;
	; AVX2-LABEL: test_x86_sse2_pmulh_w:			; AVX2-LABEL: test_x86_sse2_pmulh_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmulhw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe5,0xc1]			; AVX2-NEXT: vpmulhw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe5,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pmulh_w:			; SKX-LABEL: test_x86_sse2_pmulh_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmulhw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe5,0xc1]			; SKX-NEXT: vpmulhw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe5,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.pmulh.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.pmulh.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.pmulh.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.pmulh.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <8 x i16> @test_x86_sse2_pmulhu_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_pmulhu_w(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_pmulhu_w:			; SSE-LABEL: test_x86_sse2_pmulhu_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pmulhuw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe4,0xc1]			; SSE-NEXT: pmulhuw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe4,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_pmulhu_w:			; AVX2-LABEL: test_x86_sse2_pmulhu_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmulhuw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe4,0xc1]			; AVX2-NEXT: vpmulhuw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe4,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pmulhu_w:			; SKX-LABEL: test_x86_sse2_pmulhu_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmulhuw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe4,0xc1]			; SKX-NEXT: vpmulhuw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe4,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.pmulhu.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.pmulhu.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.pmulhu.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.pmulhu.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <2 x i64> @test_x86_sse2_pmulu_dq(<4 x i32> %a0, <4 x i32> %a1) {			define <2 x i64> @test_x86_sse2_pmulu_dq(<4 x i32> %a0, <4 x i32> %a1) {
	; SSE-LABEL: test_x86_sse2_pmulu_dq:			; SSE-LABEL: test_x86_sse2_pmulu_dq:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pmuludq %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xf4,0xc1]			; SSE-NEXT: pmuludq %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xf4,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_pmulu_dq:			; AVX2-LABEL: test_x86_sse2_pmulu_dq:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmuludq %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf4,0xc1]			; AVX2-NEXT: vpmuludq %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf4,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pmulu_dq:			; SKX-LABEL: test_x86_sse2_pmulu_dq:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmuludq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xf4,0xc1]			; SKX-NEXT: vpmuludq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf4,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse2.pmulu.dq(<4 x i32> %a0, <4 x i32> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse2.pmulu.dq(<4 x i32> %a0, <4 x i32> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse2.pmulu.dq(<4 x i32>, <4 x i32>) nounwind readnone			declare <2 x i64> @llvm.x86.sse2.pmulu.dq(<4 x i32>, <4 x i32>) nounwind readnone


	define <2 x i64> @test_x86_sse2_psad_bw(<16 x i8> %a0, <16 x i8> %a1) {			define <2 x i64> @test_x86_sse2_psad_bw(<16 x i8> %a0, <16 x i8> %a1) {
	; SSE-LABEL: test_x86_sse2_psad_bw:			; SSE-LABEL: test_x86_sse2_psad_bw:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psadbw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xf6,0xc1]			; SSE-NEXT: psadbw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xf6,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psad_bw:			; AVX2-LABEL: test_x86_sse2_psad_bw:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf6,0xc1]			; AVX2-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf6,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psad_bw:			; SKX-LABEL: test_x86_sse2_psad_bw:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xf6,0xc1]			; SKX-NEXT: vpsadbw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf6,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse2.psad.bw(<16 x i8> %a0, <16 x i8> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse2.psad.bw(<16 x i8> %a0, <16 x i8> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse2.psad.bw(<16 x i8>, <16 x i8>) nounwind readnone			declare <2 x i64> @llvm.x86.sse2.psad.bw(<16 x i8>, <16 x i8>) nounwind readnone


	define <4 x i32> @test_x86_sse2_psll_d(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse2_psll_d(<4 x i32> %a0, <4 x i32> %a1) {
	; SSE-LABEL: test_x86_sse2_psll_d:			; SSE-LABEL: test_x86_sse2_psll_d:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pslld %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xf2,0xc1]			; SSE-NEXT: pslld %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xf2,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psll_d:			; AVX2-LABEL: test_x86_sse2_psll_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpslld %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf2,0xc1]			; AVX2-NEXT: vpslld %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf2,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psll_d:			; SKX-LABEL: test_x86_sse2_psll_d:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpslld %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xf2,0xc1]			; SKX-NEXT: vpslld %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf2,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.psll.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.psll.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.psll.d(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.psll.d(<4 x i32>, <4 x i32>) nounwind readnone


	define <2 x i64> @test_x86_sse2_psll_q(<2 x i64> %a0, <2 x i64> %a1) {			define <2 x i64> @test_x86_sse2_psll_q(<2 x i64> %a0, <2 x i64> %a1) {
	; SSE-LABEL: test_x86_sse2_psll_q:			; SSE-LABEL: test_x86_sse2_psll_q:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psllq %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xf3,0xc1]			; SSE-NEXT: psllq %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xf3,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psll_q:			; AVX2-LABEL: test_x86_sse2_psll_q:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsllq %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf3,0xc1]			; AVX2-NEXT: vpsllq %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf3,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psll_q:			; SKX-LABEL: test_x86_sse2_psll_q:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsllq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xf3,0xc1]			; SKX-NEXT: vpsllq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf3,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64>, <2 x i64>) nounwind readnone			declare <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64>, <2 x i64>) nounwind readnone


	define <8 x i16> @test_x86_sse2_psll_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_psll_w(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_psll_w:			; SSE-LABEL: test_x86_sse2_psll_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psllw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xf1,0xc1]			; SSE-NEXT: psllw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xf1,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psll_w:			; AVX2-LABEL: test_x86_sse2_psll_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsllw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf1,0xc1]			; AVX2-NEXT: vpsllw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xf1,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psll_w:			; SKX-LABEL: test_x86_sse2_psll_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsllw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xf1,0xc1]			; SKX-NEXT: vpsllw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xf1,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psll.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psll.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psll.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psll.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <4 x i32> @test_x86_sse2_pslli_d(<4 x i32> %a0) {			define <4 x i32> @test_x86_sse2_pslli_d(<4 x i32> %a0) {
	; SSE-LABEL: test_x86_sse2_pslli_d:			; SSE-LABEL: test_x86_sse2_pslli_d:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pslld $7, %xmm0 ## encoding: [0x66,0x0f,0x72,0xf0,0x07]			; SSE-NEXT: pslld $7, %xmm0 ## encoding: [0x66,0x0f,0x72,0xf0,0x07]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_pslli_d:			; AVX2-LABEL: test_x86_sse2_pslli_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpslld $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x72,0xf0,0x07]			; AVX2-NEXT: vpslld $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x72,0xf0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pslli_d:			; SKX-LABEL: test_x86_sse2_pslli_d:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpslld $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x72,0xf0,0x07]			; SKX-NEXT: vpslld $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x72,0xf0,0x07]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.pslli.d(<4 x i32> %a0, i32 7) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.pslli.d(<4 x i32> %a0, i32 7) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.pslli.d(<4 x i32>, i32) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.pslli.d(<4 x i32>, i32) nounwind readnone


	define <2 x i64> @test_x86_sse2_pslli_q(<2 x i64> %a0) {			define <2 x i64> @test_x86_sse2_pslli_q(<2 x i64> %a0) {
	; SSE-LABEL: test_x86_sse2_pslli_q:			; SSE-LABEL: test_x86_sse2_pslli_q:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psllq $7, %xmm0 ## encoding: [0x66,0x0f,0x73,0xf0,0x07]			; SSE-NEXT: psllq $7, %xmm0 ## encoding: [0x66,0x0f,0x73,0xf0,0x07]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_pslli_q:			; AVX2-LABEL: test_x86_sse2_pslli_q:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsllq $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x73,0xf0,0x07]			; AVX2-NEXT: vpsllq $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x73,0xf0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pslli_q:			; SKX-LABEL: test_x86_sse2_pslli_q:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsllq $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x73,0xf0,0x07]			; SKX-NEXT: vpsllq $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x73,0xf0,0x07]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse2.pslli.q(<2 x i64> %a0, i32 7) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse2.pslli.q(<2 x i64> %a0, i32 7) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse2.pslli.q(<2 x i64>, i32) nounwind readnone			declare <2 x i64> @llvm.x86.sse2.pslli.q(<2 x i64>, i32) nounwind readnone


	define <8 x i16> @test_x86_sse2_pslli_w(<8 x i16> %a0) {			define <8 x i16> @test_x86_sse2_pslli_w(<8 x i16> %a0) {
	; SSE-LABEL: test_x86_sse2_pslli_w:			; SSE-LABEL: test_x86_sse2_pslli_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psllw $7, %xmm0 ## encoding: [0x66,0x0f,0x71,0xf0,0x07]			; SSE-NEXT: psllw $7, %xmm0 ## encoding: [0x66,0x0f,0x71,0xf0,0x07]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_pslli_w:			; AVX2-LABEL: test_x86_sse2_pslli_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsllw $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x71,0xf0,0x07]			; AVX2-NEXT: vpsllw $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x71,0xf0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_pslli_w:			; SKX-LABEL: test_x86_sse2_pslli_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsllw $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x71,0xf0,0x07]			; SKX-NEXT: vpsllw $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x71,0xf0,0x07]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.pslli.w(<8 x i16> %a0, i32 7) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.pslli.w(<8 x i16> %a0, i32 7) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.pslli.w(<8 x i16>, i32) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.pslli.w(<8 x i16>, i32) nounwind readnone


	define <4 x i32> @test_x86_sse2_psra_d(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse2_psra_d(<4 x i32> %a0, <4 x i32> %a1) {
	; SSE-LABEL: test_x86_sse2_psra_d:			; SSE-LABEL: test_x86_sse2_psra_d:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psrad %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe2,0xc1]			; SSE-NEXT: psrad %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe2,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psra_d:			; AVX2-LABEL: test_x86_sse2_psra_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrad %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe2,0xc1]			; AVX2-NEXT: vpsrad %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe2,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psra_d:			; SKX-LABEL: test_x86_sse2_psra_d:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsrad %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe2,0xc1]			; SKX-NEXT: vpsrad %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe2,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.psra.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.psra.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.psra.d(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.psra.d(<4 x i32>, <4 x i32>) nounwind readnone


	define <8 x i16> @test_x86_sse2_psra_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_psra_w(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_psra_w:			; SSE-LABEL: test_x86_sse2_psra_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psraw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe1,0xc1]			; SSE-NEXT: psraw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe1,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psra_w:			; AVX2-LABEL: test_x86_sse2_psra_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsraw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe1,0xc1]			; AVX2-NEXT: vpsraw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe1,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psra_w:			; SKX-LABEL: test_x86_sse2_psra_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsraw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe1,0xc1]			; SKX-NEXT: vpsraw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe1,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psra.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psra.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psra.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psra.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <4 x i32> @test_x86_sse2_psrai_d(<4 x i32> %a0) {			define <4 x i32> @test_x86_sse2_psrai_d(<4 x i32> %a0) {
	; SSE-LABEL: test_x86_sse2_psrai_d:			; SSE-LABEL: test_x86_sse2_psrai_d:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psrad $7, %xmm0 ## encoding: [0x66,0x0f,0x72,0xe0,0x07]			; SSE-NEXT: psrad $7, %xmm0 ## encoding: [0x66,0x0f,0x72,0xe0,0x07]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psrai_d:			; AVX2-LABEL: test_x86_sse2_psrai_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrad $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x72,0xe0,0x07]			; AVX2-NEXT: vpsrad $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x72,0xe0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psrai_d:			; SKX-LABEL: test_x86_sse2_psrai_d:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsrad $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x72,0xe0,0x07]			; SKX-NEXT: vpsrad $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x72,0xe0,0x07]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.psrai.d(<4 x i32> %a0, i32 7) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.psrai.d(<4 x i32> %a0, i32 7) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.psrai.d(<4 x i32>, i32) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.psrai.d(<4 x i32>, i32) nounwind readnone


	define <8 x i16> @test_x86_sse2_psrai_w(<8 x i16> %a0) {			define <8 x i16> @test_x86_sse2_psrai_w(<8 x i16> %a0) {
	; SSE-LABEL: test_x86_sse2_psrai_w:			; SSE-LABEL: test_x86_sse2_psrai_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psraw $7, %xmm0 ## encoding: [0x66,0x0f,0x71,0xe0,0x07]			; SSE-NEXT: psraw $7, %xmm0 ## encoding: [0x66,0x0f,0x71,0xe0,0x07]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psrai_w:			; AVX2-LABEL: test_x86_sse2_psrai_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsraw $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x71,0xe0,0x07]			; AVX2-NEXT: vpsraw $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x71,0xe0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psrai_w:			; SKX-LABEL: test_x86_sse2_psrai_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsraw $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x71,0xe0,0x07]			; SKX-NEXT: vpsraw $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x71,0xe0,0x07]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psrai.w(<8 x i16> %a0, i32 7) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psrai.w(<8 x i16> %a0, i32 7) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psrai.w(<8 x i16>, i32) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psrai.w(<8 x i16>, i32) nounwind readnone


	define <4 x i32> @test_x86_sse2_psrl_d(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse2_psrl_d(<4 x i32> %a0, <4 x i32> %a1) {
	; SSE-LABEL: test_x86_sse2_psrl_d:			; SSE-LABEL: test_x86_sse2_psrl_d:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psrld %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xd2,0xc1]			; SSE-NEXT: psrld %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xd2,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psrl_d:			; AVX2-LABEL: test_x86_sse2_psrl_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrld %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd2,0xc1]			; AVX2-NEXT: vpsrld %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd2,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psrl_d:			; SKX-LABEL: test_x86_sse2_psrl_d:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsrld %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd2,0xc1]			; SKX-NEXT: vpsrld %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd2,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32>, <4 x i32>) nounwind readnone


	define <2 x i64> @test_x86_sse2_psrl_q(<2 x i64> %a0, <2 x i64> %a1) {			define <2 x i64> @test_x86_sse2_psrl_q(<2 x i64> %a0, <2 x i64> %a1) {
	; SSE-LABEL: test_x86_sse2_psrl_q:			; SSE-LABEL: test_x86_sse2_psrl_q:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psrlq %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xd3,0xc1]			; SSE-NEXT: psrlq %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xd3,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psrl_q:			; AVX2-LABEL: test_x86_sse2_psrl_q:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrlq %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd3,0xc1]			; AVX2-NEXT: vpsrlq %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd3,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psrl_q:			; SKX-LABEL: test_x86_sse2_psrl_q:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsrlq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0xd3,0xc1]			; SKX-NEXT: vpsrlq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd3,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse2.psrl.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse2.psrl.q(<2 x i64> %a0, <2 x i64> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse2.psrl.q(<2 x i64>, <2 x i64>) nounwind readnone			declare <2 x i64> @llvm.x86.sse2.psrl.q(<2 x i64>, <2 x i64>) nounwind readnone


	define <8 x i16> @test_x86_sse2_psrl_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_psrl_w(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_psrl_w:			; SSE-LABEL: test_x86_sse2_psrl_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psrlw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xd1,0xc1]			; SSE-NEXT: psrlw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xd1,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psrl_w:			; AVX2-LABEL: test_x86_sse2_psrl_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrlw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd1,0xc1]			; AVX2-NEXT: vpsrlw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd1,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psrl_w:			; SKX-LABEL: test_x86_sse2_psrl_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsrlw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd1,0xc1]			; SKX-NEXT: vpsrlw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd1,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psrl.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psrl.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psrl.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psrl.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <4 x i32> @test_x86_sse2_psrli_d(<4 x i32> %a0) {			define <4 x i32> @test_x86_sse2_psrli_d(<4 x i32> %a0) {
	; SSE-LABEL: test_x86_sse2_psrli_d:			; SSE-LABEL: test_x86_sse2_psrli_d:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psrld $7, %xmm0 ## encoding: [0x66,0x0f,0x72,0xd0,0x07]			; SSE-NEXT: psrld $7, %xmm0 ## encoding: [0x66,0x0f,0x72,0xd0,0x07]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psrli_d:			; AVX2-LABEL: test_x86_sse2_psrli_d:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrld $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x72,0xd0,0x07]			; AVX2-NEXT: vpsrld $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x72,0xd0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psrli_d:			; SKX-LABEL: test_x86_sse2_psrli_d:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsrld $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x72,0xd0,0x07]			; SKX-NEXT: vpsrld $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x72,0xd0,0x07]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse2.psrli.d(<4 x i32> %a0, i32 7) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse2.psrli.d(<4 x i32> %a0, i32 7) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse2.psrli.d(<4 x i32>, i32) nounwind readnone			declare <4 x i32> @llvm.x86.sse2.psrli.d(<4 x i32>, i32) nounwind readnone


	define <2 x i64> @test_x86_sse2_psrli_q(<2 x i64> %a0) {			define <2 x i64> @test_x86_sse2_psrli_q(<2 x i64> %a0) {
	; SSE-LABEL: test_x86_sse2_psrli_q:			; SSE-LABEL: test_x86_sse2_psrli_q:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psrlq $7, %xmm0 ## encoding: [0x66,0x0f,0x73,0xd0,0x07]			; SSE-NEXT: psrlq $7, %xmm0 ## encoding: [0x66,0x0f,0x73,0xd0,0x07]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psrli_q:			; AVX2-LABEL: test_x86_sse2_psrli_q:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrlq $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x73,0xd0,0x07]			; AVX2-NEXT: vpsrlq $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x73,0xd0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psrli_q:			; SKX-LABEL: test_x86_sse2_psrli_q:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsrlq $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x73,0xd0,0x07]			; SKX-NEXT: vpsrlq $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x73,0xd0,0x07]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse2.psrli.q(<2 x i64> %a0, i32 7) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse2.psrli.q(<2 x i64> %a0, i32 7) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse2.psrli.q(<2 x i64>, i32) nounwind readnone			declare <2 x i64> @llvm.x86.sse2.psrli.q(<2 x i64>, i32) nounwind readnone


	define <8 x i16> @test_x86_sse2_psrli_w(<8 x i16> %a0) {			define <8 x i16> @test_x86_sse2_psrli_w(<8 x i16> %a0) {
	; SSE-LABEL: test_x86_sse2_psrli_w:			; SSE-LABEL: test_x86_sse2_psrli_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psrlw $7, %xmm0 ## encoding: [0x66,0x0f,0x71,0xd0,0x07]			; SSE-NEXT: psrlw $7, %xmm0 ## encoding: [0x66,0x0f,0x71,0xd0,0x07]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psrli_w:			; AVX2-LABEL: test_x86_sse2_psrli_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsrlw $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x71,0xd0,0x07]			; AVX2-NEXT: vpsrlw $7, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x71,0xd0,0x07]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psrli_w:			; SKX-LABEL: test_x86_sse2_psrli_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsrlw $7, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0x71,0xd0,0x07]			; SKX-NEXT: vpsrlw $7, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x71,0xd0,0x07]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psrli.w(<8 x i16> %a0, i32 7) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psrli.w(<8 x i16> %a0, i32 7) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psrli.w(<8 x i16>, i32) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psrli.w(<8 x i16>, i32) nounwind readnone


	define <16 x i8> @test_x86_sse2_psubs_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_psubs_b(<16 x i8> %a0, <16 x i8> %a1) {
	; SSE-LABEL: test_x86_sse2_psubs_b:			; SSE-LABEL: test_x86_sse2_psubs_b:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psubsb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe8,0xc1]			; SSE-NEXT: psubsb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe8,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psubs_b:			; AVX2-LABEL: test_x86_sse2_psubs_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsubsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe8,0xc1]			; AVX2-NEXT: vpsubsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe8,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psubs_b:			; SKX-LABEL: test_x86_sse2_psubs_b:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsubsb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe8,0xc1]			; SKX-NEXT: vpsubsb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe8,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.psubs.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.psubs.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.psubs.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.psubs.b(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_sse2_psubs_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_psubs_w(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_psubs_w:			; SSE-LABEL: test_x86_sse2_psubs_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psubsw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe9,0xc1]			; SSE-NEXT: psubsw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xe9,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psubs_w:			; AVX2-LABEL: test_x86_sse2_psubs_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsubsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe9,0xc1]			; AVX2-NEXT: vpsubsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xe9,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psubs_w:			; SKX-LABEL: test_x86_sse2_psubs_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsubsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xe9,0xc1]			; SKX-NEXT: vpsubsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xe9,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psubs.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psubs.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psubs.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psubs.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse2_psubus_b(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse2_psubus_b(<16 x i8> %a0, <16 x i8> %a1) {
	; SSE-LABEL: test_x86_sse2_psubus_b:			; SSE-LABEL: test_x86_sse2_psubus_b:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psubusb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xd8,0xc1]			; SSE-NEXT: psubusb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xd8,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psubus_b:			; AVX2-LABEL: test_x86_sse2_psubus_b:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsubusb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd8,0xc1]			; AVX2-NEXT: vpsubusb %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd8,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psubus_b:			; SKX-LABEL: test_x86_sse2_psubus_b:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsubusb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd8,0xc1]			; SKX-NEXT: vpsubusb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd8,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse2.psubus.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse2.psubus.b(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse2.psubus.b(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse2.psubus.b(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_sse2_psubus_w(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse2_psubus_w(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_sse2_psubus_w:			; SSE-LABEL: test_x86_sse2_psubus_w:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: psubusw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xd9,0xc1]			; SSE-NEXT: psubusw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0xd9,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse2_psubus_w:			; AVX2-LABEL: test_x86_sse2_psubus_w:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpsubusw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd9,0xc1]			; AVX2-NEXT: vpsubusw %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0xd9,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_psubus_w:			; SKX-LABEL: test_x86_sse2_psubus_w:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpsubusw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf1,0x7d,0x08,0xd9,0xc1]			; SKX-NEXT: vpsubusw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0xd9,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse2.psubus.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse2.psubus.w(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse2.psubus.w(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse2.psubus.w(<8 x i16>, <8 x i16>) nounwind readnone


	define <2 x double> @test_x86_sse2_sqrt_pd(<2 x double> %a0) {			define <2 x double> @test_x86_sse2_sqrt_pd(<2 x double> %a0) {
	[... 41 lines not shown ...]
	; AVX2-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; AVX2-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; AVX2-NEXT: vmovaps (%eax), %xmm0 ## encoding: [0xc5,0xf8,0x28,0x00]			; AVX2-NEXT: vmovaps (%eax), %xmm0 ## encoding: [0xc5,0xf8,0x28,0x00]
	; AVX2-NEXT: vsqrtsd %xmm0, %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0x51,0xc0]			; AVX2-NEXT: vsqrtsd %xmm0, %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0x51,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_sqrt_sd_vec_load:			; SKX-LABEL: test_x86_sse2_sqrt_sd_vec_load:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; SKX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; SKX-NEXT: vmovaps (%eax), %xmm0 ## encoding: [0x62,0xf1,0x7c,0x08,0x28,0x00]			; SKX-NEXT: vmovaps (%eax), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf8,0x28,0x00]
	; SKX-NEXT: vsqrtsd %xmm0, %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0x51,0xc0]			; SKX-NEXT: vsqrtsd %xmm0, %xmm0, %xmm0 ## encoding: [0xc5,0xfb,0x51,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%a1 = load <2 x double>, <2 x double>* %a0, align 16			%a1 = load <2 x double>, <2 x double>* %a0, align 16
	%res = call <2 x double> @llvm.x86.sse2.sqrt.sd(<2 x double> %a1) ; <<2 x double>> [#uses=1]			%res = call <2 x double> @llvm.x86.sse2.sqrt.sd(<2 x double> %a1) ; <<2 x double>> [#uses=1]
	ret <2 x double> %res			ret <2 x double> %res
	}			}


	[... 13 lines not shown ...]
	; AVX2-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; AVX2-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; AVX2-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; AVX2-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; AVX2-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; AVX2-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_ucomieq_sd:			; SKX-LABEL: test_x86_sse2_ucomieq_sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc1]			; SKX-NEXT: vucomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc1]
	; SKX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]			; SKX-NEXT: setnp %al ## encoding: [0x0f,0x9b,0xc0]
	; SKX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]			; SKX-NEXT: sete %cl ## encoding: [0x0f,0x94,0xc1]
	; SKX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]			; SKX-NEXT: andb %al, %cl ## encoding: [0x20,0xc1]
	; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.ucomieq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.ucomieq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	[... 13 lines not shown ...]
	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2e,0xc1]			; AVX2-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2e,0xc1]
	; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_ucomige_sd:			; SKX-LABEL: test_x86_sse2_ucomige_sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc1]			; SKX-NEXT: vucomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc1]
	; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.ucomige.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.ucomige.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.ucomige.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.ucomige.sd(<2 x double>, <2 x double>) nounwind readnone


	[10 lines not shown]
	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2e,0xc1]			; AVX2-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0xc5,0xf9,0x2e,0xc1]
	; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_ucomigt_sd:			; SKX-LABEL: test_x86_sse2_ucomigt_sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc1]			; SKX-NEXT: vucomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc1]
	; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.ucomigt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.ucomigt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.ucomigt.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.ucomigt.sd(<2 x double>, <2 x double>) nounwind readnone


	[10 lines not shown]
	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vucomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2e,0xc8]			; AVX2-NEXT: vucomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2e,0xc8]
	; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; AVX2-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_ucomile_sd:			; SKX-LABEL: test_x86_sse2_ucomile_sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vucomisd %xmm0, %xmm1 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc8]			; SKX-NEXT: vucomisd %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc8]
	; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]			; SKX-NEXT: setae %al ## encoding: [0x0f,0x93,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.ucomile.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.ucomile.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.ucomile.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.ucomile.sd(<2 x double>, <2 x double>) nounwind readnone


	[10 lines not shown]
	; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; AVX2-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; AVX2-NEXT: vucomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2e,0xc8]			; AVX2-NEXT: vucomisd %xmm0, %xmm1 ## encoding: [0xc5,0xf9,0x2e,0xc8]
	; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; AVX2-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_ucomilt_sd:			; SKX-LABEL: test_x86_sse2_ucomilt_sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]			; SKX-NEXT: xorl %eax, %eax ## encoding: [0x31,0xc0]
	; SKX-NEXT: vucomisd %xmm0, %xmm1 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc8]			; SKX-NEXT: vucomisd %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc8]
	; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]			; SKX-NEXT: seta %al ## encoding: [0x0f,0x97,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.ucomilt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.ucomilt.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	declare i32 @llvm.x86.sse2.ucomilt.sd(<2 x double>, <2 x double>) nounwind readnone			declare i32 @llvm.x86.sse2.ucomilt.sd(<2 x double>, <2 x double>) nounwind readnone


	[13 lines not shown]
	; AVX2-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; AVX2-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; AVX2-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; AVX2-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; AVX2-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; AVX2-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; AVX2-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse2_ucomineq_sd:			; SKX-LABEL: test_x86_sse2_ucomineq_sd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc1]			; SKX-NEXT: vucomisd %xmm1, %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc1]
	; SKX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]			; SKX-NEXT: setp %al ## encoding: [0x0f,0x9a,0xc0]
	; SKX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]			; SKX-NEXT: setne %cl ## encoding: [0x0f,0x95,0xc1]
	; SKX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]			; SKX-NEXT: orb %al, %cl ## encoding: [0x08,0xc1]
	; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]			; SKX-NEXT: movzbl %cl, %eax ## encoding: [0x0f,0xb6,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call i32 @llvm.x86.sse2.ucomineq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse2.ucomineq.sd(<2 x double> %a0, <2 x double> %a1) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	[16 lines not shown]
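
The size effect of each changed check line above can be read directly from the two byte lists. The following is a minimal sketch, not part of the patch, that parses the `encoding: [...]` list printed by `llc --show-mc-encoding` and reports how many bytes the VEX form saves. The helper names `encoding_bytes` and `bytes_saved` are hypothetical, and the sketch assumes every list entry is a `0x` hex literal, as is the case in the lines shown here.

```python
import re

# Matches the byte list on a check line, e.g. "encoding: [0xc5,0xf9,0x2e,0xc1]".
ENCODING_RE = re.compile(r"encoding: \[([^\]]*)\]")


def encoding_bytes(check_line):
    """Return the bytes listed on one FileCheck line as a list of ints."""
    match = ENCODING_RE.search(check_line)
    if match is None:
        raise ValueError("no encoding list on line: " + check_line)
    tokens = [tok.strip() for tok in match.group(1).split(",")]
    # Assumption: only 0x literals appear; reject anything else rather than guess.
    for tok in tokens:
        if not tok.startswith("0x"):
            raise ValueError("unexpected encoding entry: " + tok)
    return [int(tok, 16) for tok in tokens]


def bytes_saved(old_line, new_line):
    """Size difference between the old (EVEX) and new (VEX) check lines."""
    return len(encoding_bytes(old_line)) - len(encoding_bytes(new_line))


# Left and right columns of the vucomisd SKX check line from the diff above.
old = "; SKX-NEXT: vucomisd %xmm1, %xmm0 ## encoding: [0x62,0xf1,0xfd,0x08,0x2e,0xc1]"
new = ("; SKX-NEXT: vucomisd %xmm1, %xmm0 "
       "## EVEX TO VEX Compression encoding: [0xc5,0xf9,0x2e,0xc1]")

print(bytes_saved(old, new))  # 6 bytes before, 4 bytes after
```

For the `vucomisd` lines in this file the EVEX form is 6 bytes and the VEX form is 4 bytes, matching the roughly 2-byte saving described in the summary.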

llvm/trunk/test/CodeGen/X86/sse41-intrinsics-x86.ll

	[121 lines not shown]
	;			;
	; AVX2-LABEL: test_x86_sse41_packusdw:			; AVX2-LABEL: test_x86_sse41_packusdw:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x2b,0xc1]			; AVX2-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x2b,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse41_packusdw:			; SKX-LABEL: test_x86_sse41_packusdw:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x2b,0xc1]			; SKX-NEXT: vpackusdw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x2b,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse41.packusdw(<4 x i32> %a0, <4 x i32> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse41.packusdw(<4 x i32> %a0, <4 x i32> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse41.packusdw(<4 x i32>, <4 x i32>) nounwind readnone			declare <8 x i16> @llvm.x86.sse41.packusdw(<4 x i32>, <4 x i32>) nounwind readnone


	define <16 x i8> @test_x86_sse41_pblendvb(<16 x i8> %a0, <16 x i8> %a1, <16 x i8> %a2) {			define <16 x i8> @test_x86_sse41_pblendvb(<16 x i8> %a0, <16 x i8> %a1, <16 x i8> %a2) {
	[39 lines not shown]
	;			;
	; AVX2-LABEL: test_x86_sse41_pmaxsb:			; AVX2-LABEL: test_x86_sse41_pmaxsb:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaxsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3c,0xc1]			; AVX2-NEXT: vpmaxsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3c,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse41_pmaxsb:			; SKX-LABEL: test_x86_sse41_pmaxsb:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmaxsb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x3c,0xc1]			; SKX-NEXT: vpmaxsb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3c,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse41.pmaxsb(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse41.pmaxsb(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse41.pmaxsb(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse41.pmaxsb(<16 x i8>, <16 x i8>) nounwind readnone


	define <4 x i32> @test_x86_sse41_pmaxsd(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse41_pmaxsd(<4 x i32> %a0, <4 x i32> %a1) {
	; SSE41-LABEL: test_x86_sse41_pmaxsd:			; SSE41-LABEL: test_x86_sse41_pmaxsd:
	; SSE41: ## BB#0:			; SSE41: ## BB#0:
	; SSE41-NEXT: pmaxsd %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x3d,0xc1]			; SSE41-NEXT: pmaxsd %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x3d,0xc1]
	; SSE41-NEXT: retl ## encoding: [0xc3]			; SSE41-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse41_pmaxsd:			; AVX2-LABEL: test_x86_sse41_pmaxsd:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3d,0xc1]			; AVX2-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3d,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse41_pmaxsd:			; SKX-LABEL: test_x86_sse41_pmaxsd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x3d,0xc1]			; SKX-NEXT: vpmaxsd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3d,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse41.pmaxsd(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse41.pmaxsd(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse41.pmaxsd(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse41.pmaxsd(<4 x i32>, <4 x i32>) nounwind readnone


	define <4 x i32> @test_x86_sse41_pmaxud(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse41_pmaxud(<4 x i32> %a0, <4 x i32> %a1) {
	; SSE41-LABEL: test_x86_sse41_pmaxud:			; SSE41-LABEL: test_x86_sse41_pmaxud:
	; SSE41: ## BB#0:			; SSE41: ## BB#0:
	; SSE41-NEXT: pmaxud %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x3f,0xc1]			; SSE41-NEXT: pmaxud %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x3f,0xc1]
	; SSE41-NEXT: retl ## encoding: [0xc3]			; SSE41-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse41_pmaxud:			; AVX2-LABEL: test_x86_sse41_pmaxud:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaxud %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3f,0xc1]			; AVX2-NEXT: vpmaxud %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3f,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse41_pmaxud:			; SKX-LABEL: test_x86_sse41_pmaxud:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmaxud %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x3f,0xc1]			; SKX-NEXT: vpmaxud %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3f,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse41.pmaxud(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse41.pmaxud(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse41.pmaxud(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse41.pmaxud(<4 x i32>, <4 x i32>) nounwind readnone


	define <8 x i16> @test_x86_sse41_pmaxuw(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse41_pmaxuw(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE41-LABEL: test_x86_sse41_pmaxuw:			; SSE41-LABEL: test_x86_sse41_pmaxuw:
	; SSE41: ## BB#0:			; SSE41: ## BB#0:
	; SSE41-NEXT: pmaxuw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x3e,0xc1]			; SSE41-NEXT: pmaxuw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x3e,0xc1]
	; SSE41-NEXT: retl ## encoding: [0xc3]			; SSE41-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse41_pmaxuw:			; AVX2-LABEL: test_x86_sse41_pmaxuw:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaxuw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3e,0xc1]			; AVX2-NEXT: vpmaxuw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3e,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse41_pmaxuw:			; SKX-LABEL: test_x86_sse41_pmaxuw:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmaxuw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x3e,0xc1]			; SKX-NEXT: vpmaxuw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3e,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse41.pmaxuw(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse41.pmaxuw(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse41.pmaxuw(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse41.pmaxuw(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_sse41_pminsb(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_sse41_pminsb(<16 x i8> %a0, <16 x i8> %a1) {
	; SSE41-LABEL: test_x86_sse41_pminsb:			; SSE41-LABEL: test_x86_sse41_pminsb:
	; SSE41: ## BB#0:			; SSE41: ## BB#0:
	; SSE41-NEXT: pminsb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x38,0xc1]			; SSE41-NEXT: pminsb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x38,0xc1]
	; SSE41-NEXT: retl ## encoding: [0xc3]			; SSE41-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse41_pminsb:			; AVX2-LABEL: test_x86_sse41_pminsb:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpminsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x38,0xc1]			; AVX2-NEXT: vpminsb %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x38,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse41_pminsb:			; SKX-LABEL: test_x86_sse41_pminsb:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpminsb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x38,0xc1]			; SKX-NEXT: vpminsb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x38,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.sse41.pminsb(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.sse41.pminsb(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.sse41.pminsb(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.sse41.pminsb(<16 x i8>, <16 x i8>) nounwind readnone


	define <4 x i32> @test_x86_sse41_pminsd(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse41_pminsd(<4 x i32> %a0, <4 x i32> %a1) {
	; SSE41-LABEL: test_x86_sse41_pminsd:			; SSE41-LABEL: test_x86_sse41_pminsd:
	; SSE41: ## BB#0:			; SSE41: ## BB#0:
	; SSE41-NEXT: pminsd %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x39,0xc1]			; SSE41-NEXT: pminsd %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x39,0xc1]
	; SSE41-NEXT: retl ## encoding: [0xc3]			; SSE41-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse41_pminsd:			; AVX2-LABEL: test_x86_sse41_pminsd:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpminsd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x39,0xc1]			; AVX2-NEXT: vpminsd %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x39,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse41_pminsd:			; SKX-LABEL: test_x86_sse41_pminsd:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpminsd %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x39,0xc1]			; SKX-NEXT: vpminsd %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x39,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse41.pminsd(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse41.pminsd(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse41.pminsd(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse41.pminsd(<4 x i32>, <4 x i32>) nounwind readnone


	define <4 x i32> @test_x86_sse41_pminud(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_sse41_pminud(<4 x i32> %a0, <4 x i32> %a1) {
	; SSE41-LABEL: test_x86_sse41_pminud:			; SSE41-LABEL: test_x86_sse41_pminud:
	; SSE41: ## BB#0:			; SSE41: ## BB#0:
	; SSE41-NEXT: pminud %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x3b,0xc1]			; SSE41-NEXT: pminud %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x3b,0xc1]
	; SSE41-NEXT: retl ## encoding: [0xc3]			; SSE41-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse41_pminud:			; AVX2-LABEL: test_x86_sse41_pminud:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpminud %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3b,0xc1]			; AVX2-NEXT: vpminud %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3b,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse41_pminud:			; SKX-LABEL: test_x86_sse41_pminud:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpminud %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x3b,0xc1]			; SKX-NEXT: vpminud %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3b,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.sse41.pminud(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.sse41.pminud(<4 x i32> %a0, <4 x i32> %a1) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.sse41.pminud(<4 x i32>, <4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.sse41.pminud(<4 x i32>, <4 x i32>) nounwind readnone


	define <8 x i16> @test_x86_sse41_pminuw(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_sse41_pminuw(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE41-LABEL: test_x86_sse41_pminuw:			; SSE41-LABEL: test_x86_sse41_pminuw:
	; SSE41: ## BB#0:			; SSE41: ## BB#0:
	; SSE41-NEXT: pminuw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x3a,0xc1]			; SSE41-NEXT: pminuw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x3a,0xc1]
	; SSE41-NEXT: retl ## encoding: [0xc3]			; SSE41-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse41_pminuw:			; AVX2-LABEL: test_x86_sse41_pminuw:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpminuw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3a,0xc1]			; AVX2-NEXT: vpminuw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x3a,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse41_pminuw:			; SKX-LABEL: test_x86_sse41_pminuw:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpminuw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x3a,0xc1]			; SKX-NEXT: vpminuw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x3a,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.sse41.pminuw(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.sse41.pminuw(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.sse41.pminuw(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.sse41.pminuw(<8 x i16>, <8 x i16>) nounwind readnone


	define <2 x i64> @test_x86_sse41_pmuldq(<4 x i32> %a0, <4 x i32> %a1) {			define <2 x i64> @test_x86_sse41_pmuldq(<4 x i32> %a0, <4 x i32> %a1) {
	; SSE41-LABEL: test_x86_sse41_pmuldq:			; SSE41-LABEL: test_x86_sse41_pmuldq:
	; SSE41: ## BB#0:			; SSE41: ## BB#0:
	; SSE41-NEXT: pmuldq %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x28,0xc1]			; SSE41-NEXT: pmuldq %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x28,0xc1]
	; SSE41-NEXT: retl ## encoding: [0xc3]			; SSE41-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_sse41_pmuldq:			; AVX2-LABEL: test_x86_sse41_pmuldq:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmuldq %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x28,0xc1]			; AVX2-NEXT: vpmuldq %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x28,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse41_pmuldq:			; SKX-LABEL: test_x86_sse41_pmuldq:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmuldq %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0xfd,0x08,0x28,0xc1]			; SKX-NEXT: vpmuldq %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x28,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <2 x i64> @llvm.x86.sse41.pmuldq(<4 x i32> %a0, <4 x i32> %a1) ; <<2 x i64>> [#uses=1]			%res = call <2 x i64> @llvm.x86.sse41.pmuldq(<4 x i32> %a0, <4 x i32> %a1) ; <<2 x i64>> [#uses=1]
	ret <2 x i64> %res			ret <2 x i64> %res
	}			}
	declare <2 x i64> @llvm.x86.sse41.pmuldq(<4 x i32>, <4 x i32>) nounwind readnone			declare <2 x i64> @llvm.x86.sse41.pmuldq(<4 x i32>, <4 x i32>) nounwind readnone


	define i32 @test_x86_sse41_ptestc(<2 x i64> %a0, <2 x i64> %a1) {			define i32 @test_x86_sse41_ptestc(<2 x i64> %a0, <2 x i64> %a1) {
	[139 lines not shown]
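
In the SSE4.1 check lines above, the compressed form starts with 0xc4 rather than 0xc5, so the saving is one byte instead of two. The sketch below is an illustration, not code from the patch. It classifies an encoding by its first byte (0x62 for the 4-byte EVEX prefix, 0xc4 for the 3-byte VEX prefix, 0xc5 for the 2-byte VEX prefix) and reads the opcode-map field from a 3-byte VEX prefix. The function names are hypothetical, and the sketch assumes the byte list starts at the prefix, as it does in these check lines.

```python
def prefix_kind(encoding):
    """Return (name, prefix length in bytes) based on the first encoded byte."""
    first = encoding[0]
    if first == 0x62:
        return ("EVEX", 4)
    if first == 0xC4:
        return ("VEX3", 3)
    if first == 0xC5:
        return ("VEX2", 2)
    return ("other", 0)


def vex3_opcode_map(encoding):
    """Low five bits of the second byte of a 3-byte VEX prefix: 1=0F, 2=0F38, 3=0F3A."""
    if encoding[0] != 0xC4:
        raise ValueError("not a 3-byte VEX encoding")
    return encoding[1] & 0x1F


# vpmaxsb %xmm1, %xmm0, %xmm0 before and after the patch (SKX check lines).
evex = [0x62, 0xF2, 0x7D, 0x08, 0x3C, 0xC1]
vex = [0xC4, 0xE2, 0x79, 0x3C, 0xC1]

print(prefix_kind(evex), prefix_kind(vex))
print(vex3_opcode_map(vex))        # map 2 is the 0F38 opcode map
print(len(evex) - len(vex))        # bytes saved for this instruction
```

The 2-byte VEX prefix can only select the 0F opcode map, so instructions in the 0F38 map, such as `vpmaxsb`, need the 3-byte form and shrink from 6 bytes to 5.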

llvm/trunk/test/CodeGen/X86/sse42-intrinsics-x86.ll

	[46 lines not shown]
	; AVX2-NEXT: vpcmpestri $7, (%ecx), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x61,0x01,0x07]			; AVX2-NEXT: vpcmpestri $7, (%ecx), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x61,0x01,0x07]
	; AVX2-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]			; AVX2-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse42_pcmpestri128_load:			; SKX-LABEL: test_x86_sse42_pcmpestri128_load:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: movl {{[0-9]+}}(%esp), %ecx ## encoding: [0x8b,0x4c,0x24,0x08]			; SKX-NEXT: movl {{[0-9]+}}(%esp), %ecx ## encoding: [0x8b,0x4c,0x24,0x08]
	; SKX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]			; SKX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]
	; SKX-NEXT: vmovdqu8 (%eax), %xmm0 ## encoding: [0x62,0xf1,0x7f,0x08,0x6f,0x00]			; SKX-NEXT: vmovdqu (%eax), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x6f,0x00]
	; SKX-NEXT: movl $7, %eax ## encoding: [0xb8,0x07,0x00,0x00,0x00]			; SKX-NEXT: movl $7, %eax ## encoding: [0xb8,0x07,0x00,0x00,0x00]
	; SKX-NEXT: movl $7, %edx ## encoding: [0xba,0x07,0x00,0x00,0x00]			; SKX-NEXT: movl $7, %edx ## encoding: [0xba,0x07,0x00,0x00,0x00]
	; SKX-NEXT: vpcmpestri $7, (%ecx), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x61,0x01,0x07]			; SKX-NEXT: vpcmpestri $7, (%ecx), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x61,0x01,0x07]
	; SKX-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]			; SKX-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%1 = load <16 x i8>, <16 x i8>* %a0			%1 = load <16 x i8>, <16 x i8>* %a0
	%2 = load <16 x i8>, <16 x i8>* %a2			%2 = load <16 x i8>, <16 x i8>* %a2
	%res = call i32 @llvm.x86.sse42.pcmpestri128(<16 x i8> %1, i32 7, <16 x i8> %2, i32 7, i8 7) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse42.pcmpestri128(<16 x i8> %1, i32 7, <16 x i8> %2, i32 7, i8 7) ; <i32> [#uses=1]
	[223 lines not shown]
	; AVX2-NEXT: vpcmpistri $7, (%eax), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x63,0x00,0x07]			; AVX2-NEXT: vpcmpistri $7, (%eax), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x63,0x00,0x07]
	; AVX2-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]			; AVX2-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_sse42_pcmpistri128_load:			; SKX-LABEL: test_x86_sse42_pcmpistri128_load:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08]			; SKX-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08]
	; SKX-NEXT: movl {{[0-9]+}}(%esp), %ecx ## encoding: [0x8b,0x4c,0x24,0x04]			; SKX-NEXT: movl {{[0-9]+}}(%esp), %ecx ## encoding: [0x8b,0x4c,0x24,0x04]
	; SKX-NEXT: vmovdqu8 (%ecx), %xmm0 ## encoding: [0x62,0xf1,0x7f,0x08,0x6f,0x01]			; SKX-NEXT: vmovdqu (%ecx), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x6f,0x01]
	; SKX-NEXT: vpcmpistri $7, (%eax), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x63,0x00,0x07]			; SKX-NEXT: vpcmpistri $7, (%eax), %xmm0 ## encoding: [0xc4,0xe3,0x79,0x63,0x00,0x07]
	; SKX-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]			; SKX-NEXT: movl %ecx, %eax ## encoding: [0x89,0xc8]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%1 = load <16 x i8>, <16 x i8>* %a0			%1 = load <16 x i8>, <16 x i8>* %a0
	%2 = load <16 x i8>, <16 x i8>* %a1			%2 = load <16 x i8>, <16 x i8>* %a1
	%res = call i32 @llvm.x86.sse42.pcmpistri128(<16 x i8> %1, <16 x i8> %2, i8 7) ; <i32> [#uses=1]			%res = call i32 @llvm.x86.sse42.pcmpistri128(<16 x i8> %1, <16 x i8> %2, i8 7) ; <i32> [#uses=1]
	ret i32 %res			ret i32 %res
	}			}
	[134 lines not shown]
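
The `vmovdqu8` to `vmovdqu` change above shows that the compression can also change the printed mnemonic, since the VEX instruction has no element-size suffix. The following is a rough sketch of a table-driven lookup in the spirit of the pass described in the summary. It is not the patch's implementation: the real table maps opcodes rather than mnemonic strings, and `Inst`, `EVEX_TO_VEX`, and `compress` are hypothetical names introduced only for illustration. The eligibility conditions (register indexes 0 to 15, no zmm registers, no mask registers) are taken from the summary.

```python
from dataclasses import dataclass

# Hypothetical mnemonic-level table; entries are taken from the check lines in this diff.
EVEX_TO_VEX = {
    "vmovdqu8": "vmovdqu",   # mnemonic changes when compressed
    "vucomisd": "vucomisd",  # mnemonic stays the same
    "vpmaxsb": "vpmaxsb",
}


@dataclass
class Inst:
    mnemonic: str
    vector_regs: tuple        # indexes of the xmm/ymm registers used
    uses_zmm: bool = False
    uses_mask: bool = False


def compress(inst):
    """Return the VEX mnemonic, or None if the instruction must stay EVEX."""
    if inst.uses_zmm or inst.uses_mask:
        return None
    if any(reg > 15 for reg in inst.vector_regs):
        return None  # registers 16-31 are only reachable with EVEX
    return EVEX_TO_VEX.get(inst.mnemonic)


print(compress(Inst("vmovdqu8", (0,))))                  # vmovdqu
print(compress(Inst("vmovdqu8", (16,))))                 # None
print(compress(Inst("vpmaxsb", (0, 1), uses_mask=True))) # None
```

A mnemonic that is missing from the table is also left in its EVEX form, which mirrors the idea that only instructions with a known VEX equivalent are rewritten.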

llvm/trunk/test/CodeGen/X86/ssse3-intrinsics-x86.ll

	[10 lines not shown]
	;			;
	; AVX2-LABEL: test_x86_ssse3_pabs_b_128:			; AVX2-LABEL: test_x86_ssse3_pabs_b_128:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpabsb %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x1c,0xc0]			; AVX2-NEXT: vpabsb %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x1c,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_ssse3_pabs_b_128:			; SKX-LABEL: test_x86_ssse3_pabs_b_128:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpabsb %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x1c,0xc0]			; SKX-NEXT: vpabsb %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x1c,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.ssse3.pabs.b.128(<16 x i8> %a0) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.ssse3.pabs.b.128(<16 x i8> %a0) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.ssse3.pabs.b.128(<16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.ssse3.pabs.b.128(<16 x i8>) nounwind readnone


	define <4 x i32> @test_x86_ssse3_pabs_d_128(<4 x i32> %a0) {			define <4 x i32> @test_x86_ssse3_pabs_d_128(<4 x i32> %a0) {
	; SSE-LABEL: test_x86_ssse3_pabs_d_128:			; SSE-LABEL: test_x86_ssse3_pabs_d_128:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pabsd %xmm0, %xmm0 ## encoding: [0x66,0x0f,0x38,0x1e,0xc0]			; SSE-NEXT: pabsd %xmm0, %xmm0 ## encoding: [0x66,0x0f,0x38,0x1e,0xc0]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_ssse3_pabs_d_128:			; AVX2-LABEL: test_x86_ssse3_pabs_d_128:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpabsd %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x1e,0xc0]			; AVX2-NEXT: vpabsd %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x1e,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_ssse3_pabs_d_128:			; SKX-LABEL: test_x86_ssse3_pabs_d_128:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpabsd %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x1e,0xc0]			; SKX-NEXT: vpabsd %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x1e,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <4 x i32> @llvm.x86.ssse3.pabs.d.128(<4 x i32> %a0) ; <<4 x i32>> [#uses=1]			%res = call <4 x i32> @llvm.x86.ssse3.pabs.d.128(<4 x i32> %a0) ; <<4 x i32>> [#uses=1]
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}
	declare <4 x i32> @llvm.x86.ssse3.pabs.d.128(<4 x i32>) nounwind readnone			declare <4 x i32> @llvm.x86.ssse3.pabs.d.128(<4 x i32>) nounwind readnone


	define <8 x i16> @test_x86_ssse3_pabs_w_128(<8 x i16> %a0) {			define <8 x i16> @test_x86_ssse3_pabs_w_128(<8 x i16> %a0) {
	; SSE-LABEL: test_x86_ssse3_pabs_w_128:			; SSE-LABEL: test_x86_ssse3_pabs_w_128:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pabsw %xmm0, %xmm0 ## encoding: [0x66,0x0f,0x38,0x1d,0xc0]			; SSE-NEXT: pabsw %xmm0, %xmm0 ## encoding: [0x66,0x0f,0x38,0x1d,0xc0]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_ssse3_pabs_w_128:			; AVX2-LABEL: test_x86_ssse3_pabs_w_128:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpabsw %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x1d,0xc0]			; AVX2-NEXT: vpabsw %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x1d,0xc0]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_ssse3_pabs_w_128:			; SKX-LABEL: test_x86_ssse3_pabs_w_128:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpabsw %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x1d,0xc0]			; SKX-NEXT: vpabsw %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x1d,0xc0]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.ssse3.pabs.w.128(<8 x i16> %a0) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.ssse3.pabs.w.128(<8 x i16> %a0) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.ssse3.pabs.w.128(<8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.ssse3.pabs.w.128(<8 x i16>) nounwind readnone


	define <4 x i32> @test_x86_ssse3_phadd_d_128(<4 x i32> %a0, <4 x i32> %a1) {			define <4 x i32> @test_x86_ssse3_phadd_d_128(<4 x i32> %a0, <4 x i32> %a1) {
	[100 lines not shown]
	;			;
	; AVX2-LABEL: test_x86_ssse3_pmadd_ub_sw_128:			; AVX2-LABEL: test_x86_ssse3_pmadd_ub_sw_128:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmaddubsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x04,0xc1]			; AVX2-NEXT: vpmaddubsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x04,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_ssse3_pmadd_ub_sw_128:			; SKX-LABEL: test_x86_ssse3_pmadd_ub_sw_128:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmaddubsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x04,0xc1]			; SKX-NEXT: vpmaddubsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x04,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8> %a0, <16 x i8> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8> %a0, <16 x i8> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8>, <16 x i8>) nounwind readnone			declare <8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8>, <16 x i8>) nounwind readnone


	define <8 x i16> @test_x86_ssse3_pmul_hr_sw_128(<8 x i16> %a0, <8 x i16> %a1) {			define <8 x i16> @test_x86_ssse3_pmul_hr_sw_128(<8 x i16> %a0, <8 x i16> %a1) {
	; SSE-LABEL: test_x86_ssse3_pmul_hr_sw_128:			; SSE-LABEL: test_x86_ssse3_pmul_hr_sw_128:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pmulhrsw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x0b,0xc1]			; SSE-NEXT: pmulhrsw %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x0b,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_ssse3_pmul_hr_sw_128:			; AVX2-LABEL: test_x86_ssse3_pmul_hr_sw_128:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpmulhrsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x0b,0xc1]			; AVX2-NEXT: vpmulhrsw %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x0b,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_ssse3_pmul_hr_sw_128:			; SKX-LABEL: test_x86_ssse3_pmul_hr_sw_128:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpmulhrsw %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x0b,0xc1]			; SKX-NEXT: vpmulhrsw %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x0b,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <8 x i16> @llvm.x86.ssse3.pmul.hr.sw.128(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]			%res = call <8 x i16> @llvm.x86.ssse3.pmul.hr.sw.128(<8 x i16> %a0, <8 x i16> %a1) ; <<8 x i16>> [#uses=1]
	ret <8 x i16> %res			ret <8 x i16> %res
	}			}
	declare <8 x i16> @llvm.x86.ssse3.pmul.hr.sw.128(<8 x i16>, <8 x i16>) nounwind readnone			declare <8 x i16> @llvm.x86.ssse3.pmul.hr.sw.128(<8 x i16>, <8 x i16>) nounwind readnone


	define <16 x i8> @test_x86_ssse3_pshuf_b_128(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_ssse3_pshuf_b_128(<16 x i8> %a0, <16 x i8> %a1) {
	; SSE-LABEL: test_x86_ssse3_pshuf_b_128:			; SSE-LABEL: test_x86_ssse3_pshuf_b_128:
	; SSE: ## BB#0:			; SSE: ## BB#0:
	; SSE-NEXT: pshufb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x00,0xc1]			; SSE-NEXT: pshufb %xmm1, %xmm0 ## encoding: [0x66,0x0f,0x38,0x00,0xc1]
	; SSE-NEXT: retl ## encoding: [0xc3]			; SSE-NEXT: retl ## encoding: [0xc3]
	;			;
	; AVX2-LABEL: test_x86_ssse3_pshuf_b_128:			; AVX2-LABEL: test_x86_ssse3_pshuf_b_128:
	; AVX2: ## BB#0:			; AVX2: ## BB#0:
	; AVX2-NEXT: vpshufb %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x00,0xc1]			; AVX2-NEXT: vpshufb %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe2,0x79,0x00,0xc1]
	; AVX2-NEXT: retl ## encoding: [0xc3]			; AVX2-NEXT: retl ## encoding: [0xc3]
	;			;
	; SKX-LABEL: test_x86_ssse3_pshuf_b_128:			; SKX-LABEL: test_x86_ssse3_pshuf_b_128:
	; SKX: ## BB#0:			; SKX: ## BB#0:
	; SKX-NEXT: vpshufb %xmm1, %xmm0, %xmm0 ## encoding: [0x62,0xf2,0x7d,0x08,0x00,0xc1]			; SKX-NEXT: vpshufb %xmm1, %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0x00,0xc1]
	; SKX-NEXT: retl ## encoding: [0xc3]			; SKX-NEXT: retl ## encoding: [0xc3]
	%res = call <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]			%res = call <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8> %a0, <16 x i8> %a1) ; <<16 x i8>> [#uses=1]
	ret <16 x i8> %res			ret <16 x i8> %res
	}			}
	declare <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>) nounwind readnone			declare <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>) nounwind readnone


	define <16 x i8> @test_x86_ssse3_psign_b_128(<16 x i8> %a0, <16 x i8> %a1) {			define <16 x i8> @test_x86_ssse3_psign_b_128(<16 x i8> %a0, <16 x i8> %a1) {
	[... 45 lines not shown ...]

llvm/trunk/test/CodeGen/X86/subvector-broadcast.ll

	[... 552 lines not shown ...]
	; X32-AVX2-NEXT: vbroadcastf128 {{.*#+}} ymm0 = mem[0,1,0,1]			; X32-AVX2-NEXT: vbroadcastf128 {{.*#+}} ymm0 = mem[0,1,0,1]
	; X32-AVX2-NEXT: vmovaps %ymm0, %ymm1			; X32-AVX2-NEXT: vmovaps %ymm0, %ymm1
	; X32-AVX2-NEXT: retl			; X32-AVX2-NEXT: retl
	;			;
	; X32-AVX512F-LABEL: test_broadcast_8i16_32i16:			; X32-AVX512F-LABEL: test_broadcast_8i16_32i16:
	; X32-AVX512F: ## BB#0:			; X32-AVX512F: ## BB#0:
	; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512F-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]			; X32-AVX512F-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]
	; X32-AVX512F-NEXT: vmovdqa64 %ymm0, %ymm1			; X32-AVX512F-NEXT: vmovdqa %ymm0, %ymm1
	; X32-AVX512F-NEXT: retl			; X32-AVX512F-NEXT: retl
	;			;
	; X32-AVX512BW-LABEL: test_broadcast_8i16_32i16:			; X32-AVX512BW-LABEL: test_broadcast_8i16_32i16:
	; X32-AVX512BW: ## BB#0:			; X32-AVX512BW: ## BB#0:
	; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512BW-NEXT: vbroadcasti32x4 {{.*#+}} zmm0 = mem[0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]			; X32-AVX512BW-NEXT: vbroadcasti32x4 {{.*#+}} zmm0 = mem[0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]
	; X32-AVX512BW-NEXT: retl			; X32-AVX512BW-NEXT: retl
	;			;
	; X32-AVX512DQ-LABEL: test_broadcast_8i16_32i16:			; X32-AVX512DQ-LABEL: test_broadcast_8i16_32i16:
	; X32-AVX512DQ: ## BB#0:			; X32-AVX512DQ: ## BB#0:
	; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512DQ-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]			; X32-AVX512DQ-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]
	; X32-AVX512DQ-NEXT: vmovdqa64 %ymm0, %ymm1			; X32-AVX512DQ-NEXT: vmovdqa %ymm0, %ymm1
	; X32-AVX512DQ-NEXT: retl			; X32-AVX512DQ-NEXT: retl
	;			;
	; X64-AVX1-LABEL: test_broadcast_8i16_32i16:			; X64-AVX1-LABEL: test_broadcast_8i16_32i16:
	; X64-AVX1: ## BB#0:			; X64-AVX1: ## BB#0:
	; X64-AVX1-NEXT: vbroadcastf128 {{.*#+}} ymm0 = mem[0,1,0,1]			; X64-AVX1-NEXT: vbroadcastf128 {{.*#+}} ymm0 = mem[0,1,0,1]
	; X64-AVX1-NEXT: vmovdqa %ymm0, %ymm1			; X64-AVX1-NEXT: vmovdqa %ymm0, %ymm1
	; X64-AVX1-NEXT: retq			; X64-AVX1-NEXT: retq
	;			;
	; X64-AVX2-LABEL: test_broadcast_8i16_32i16:			; X64-AVX2-LABEL: test_broadcast_8i16_32i16:
	; X64-AVX2: ## BB#0:			; X64-AVX2: ## BB#0:
	; X64-AVX2-NEXT: vbroadcastf128 {{.*#+}} ymm0 = mem[0,1,0,1]			; X64-AVX2-NEXT: vbroadcastf128 {{.*#+}} ymm0 = mem[0,1,0,1]
	; X64-AVX2-NEXT: vmovaps %ymm0, %ymm1			; X64-AVX2-NEXT: vmovaps %ymm0, %ymm1
	; X64-AVX2-NEXT: retq			; X64-AVX2-NEXT: retq
	;			;
	; X64-AVX512F-LABEL: test_broadcast_8i16_32i16:			; X64-AVX512F-LABEL: test_broadcast_8i16_32i16:
	; X64-AVX512F: ## BB#0:			; X64-AVX512F: ## BB#0:
	; X64-AVX512F-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]			; X64-AVX512F-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]
	; X64-AVX512F-NEXT: vmovdqa64 %ymm0, %ymm1			; X64-AVX512F-NEXT: vmovdqa %ymm0, %ymm1
	; X64-AVX512F-NEXT: retq			; X64-AVX512F-NEXT: retq
	;			;
	; X64-AVX512BW-LABEL: test_broadcast_8i16_32i16:			; X64-AVX512BW-LABEL: test_broadcast_8i16_32i16:
	; X64-AVX512BW: ## BB#0:			; X64-AVX512BW: ## BB#0:
	; X64-AVX512BW-NEXT: vbroadcasti32x4 {{.*#+}} zmm0 = mem[0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]			; X64-AVX512BW-NEXT: vbroadcasti32x4 {{.*#+}} zmm0 = mem[0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]
	; X64-AVX512BW-NEXT: retq			; X64-AVX512BW-NEXT: retq
	;			;
	; X64-AVX512DQ-LABEL: test_broadcast_8i16_32i16:			; X64-AVX512DQ-LABEL: test_broadcast_8i16_32i16:
	; X64-AVX512DQ: ## BB#0:			; X64-AVX512DQ: ## BB#0:
	; X64-AVX512DQ-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]			; X64-AVX512DQ-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]
	; X64-AVX512DQ-NEXT: vmovdqa64 %ymm0, %ymm1			; X64-AVX512DQ-NEXT: vmovdqa %ymm0, %ymm1
	; X64-AVX512DQ-NEXT: retq			; X64-AVX512DQ-NEXT: retq
	%1 = load <8 x i16>, <8 x i16> *%p			%1 = load <8 x i16>, <8 x i16> *%p
	%2 = shufflevector <8 x i16> %1, <8 x i16> undef, <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>			%2 = shufflevector <8 x i16> %1, <8 x i16> undef, <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
	ret <32 x i16> %2			ret <32 x i16> %2
	}			}

	define <32 x i16> @test_broadcast_16i16_32i16(<16 x i16> *%p) nounwind {			define <32 x i16> @test_broadcast_16i16_32i16(<16 x i16> *%p) nounwind {
	; X32-AVX-LABEL: test_broadcast_16i16_32i16:			; X32-AVX-LABEL: test_broadcast_16i16_32i16:
	[... 91 lines not shown ...]
	; X32-AVX2-NEXT: vbroadcastf128 {{.*#+}} ymm0 = mem[0,1,0,1]			; X32-AVX2-NEXT: vbroadcastf128 {{.*#+}} ymm0 = mem[0,1,0,1]
	; X32-AVX2-NEXT: vmovaps %ymm0, %ymm1			; X32-AVX2-NEXT: vmovaps %ymm0, %ymm1
	; X32-AVX2-NEXT: retl			; X32-AVX2-NEXT: retl
	;			;
	; X32-AVX512F-LABEL: test_broadcast_16i8_64i8:			; X32-AVX512F-LABEL: test_broadcast_16i8_64i8:
	; X32-AVX512F: ## BB#0:			; X32-AVX512F: ## BB#0:
	; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512F-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]			; X32-AVX512F-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]
	; X32-AVX512F-NEXT: vmovdqa64 %ymm0, %ymm1			; X32-AVX512F-NEXT: vmovdqa %ymm0, %ymm1
	; X32-AVX512F-NEXT: retl			; X32-AVX512F-NEXT: retl
	;			;
	; X32-AVX512BW-LABEL: test_broadcast_16i8_64i8:			; X32-AVX512BW-LABEL: test_broadcast_16i8_64i8:
	; X32-AVX512BW: ## BB#0:			; X32-AVX512BW: ## BB#0:
	; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512BW-NEXT: vbroadcasti32x4 {{.*#+}} zmm0 = mem[0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]			; X32-AVX512BW-NEXT: vbroadcasti32x4 {{.*#+}} zmm0 = mem[0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]
	; X32-AVX512BW-NEXT: retl			; X32-AVX512BW-NEXT: retl
	;			;
	; X32-AVX512DQ-LABEL: test_broadcast_16i8_64i8:			; X32-AVX512DQ-LABEL: test_broadcast_16i8_64i8:
	; X32-AVX512DQ: ## BB#0:			; X32-AVX512DQ: ## BB#0:
	; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512DQ-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]			; X32-AVX512DQ-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]
	; X32-AVX512DQ-NEXT: vmovdqa64 %ymm0, %ymm1			; X32-AVX512DQ-NEXT: vmovdqa %ymm0, %ymm1
	; X32-AVX512DQ-NEXT: retl			; X32-AVX512DQ-NEXT: retl
	;			;
	; X64-AVX1-LABEL: test_broadcast_16i8_64i8:			; X64-AVX1-LABEL: test_broadcast_16i8_64i8:
	; X64-AVX1: ## BB#0:			; X64-AVX1: ## BB#0:
	; X64-AVX1-NEXT: vbroadcastf128 {{.*#+}} ymm0 = mem[0,1,0,1]			; X64-AVX1-NEXT: vbroadcastf128 {{.*#+}} ymm0 = mem[0,1,0,1]
	; X64-AVX1-NEXT: vmovdqa %ymm0, %ymm1			; X64-AVX1-NEXT: vmovdqa %ymm0, %ymm1
	; X64-AVX1-NEXT: retq			; X64-AVX1-NEXT: retq
	;			;
	; X64-AVX2-LABEL: test_broadcast_16i8_64i8:			; X64-AVX2-LABEL: test_broadcast_16i8_64i8:
	; X64-AVX2: ## BB#0:			; X64-AVX2: ## BB#0:
	; X64-AVX2-NEXT: vbroadcastf128 {{.*#+}} ymm0 = mem[0,1,0,1]			; X64-AVX2-NEXT: vbroadcastf128 {{.*#+}} ymm0 = mem[0,1,0,1]
	; X64-AVX2-NEXT: vmovaps %ymm0, %ymm1			; X64-AVX2-NEXT: vmovaps %ymm0, %ymm1
	; X64-AVX2-NEXT: retq			; X64-AVX2-NEXT: retq
	;			;
	; X64-AVX512F-LABEL: test_broadcast_16i8_64i8:			; X64-AVX512F-LABEL: test_broadcast_16i8_64i8:
	; X64-AVX512F: ## BB#0:			; X64-AVX512F: ## BB#0:
	; X64-AVX512F-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]			; X64-AVX512F-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]
	; X64-AVX512F-NEXT: vmovdqa64 %ymm0, %ymm1			; X64-AVX512F-NEXT: vmovdqa %ymm0, %ymm1
	; X64-AVX512F-NEXT: retq			; X64-AVX512F-NEXT: retq
	;			;
	; X64-AVX512BW-LABEL: test_broadcast_16i8_64i8:			; X64-AVX512BW-LABEL: test_broadcast_16i8_64i8:
	; X64-AVX512BW: ## BB#0:			; X64-AVX512BW: ## BB#0:
	; X64-AVX512BW-NEXT: vbroadcasti32x4 {{.*#+}} zmm0 = mem[0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]			; X64-AVX512BW-NEXT: vbroadcasti32x4 {{.*#+}} zmm0 = mem[0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3]
	; X64-AVX512BW-NEXT: retq			; X64-AVX512BW-NEXT: retq
	;			;
	; X64-AVX512DQ-LABEL: test_broadcast_16i8_64i8:			; X64-AVX512DQ-LABEL: test_broadcast_16i8_64i8:
	; X64-AVX512DQ: ## BB#0:			; X64-AVX512DQ: ## BB#0:
	; X64-AVX512DQ-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]			; X64-AVX512DQ-NEXT: vbroadcasti32x4 {{.*#+}} ymm0 = mem[0,1,2,3,0,1,2,3]
	; X64-AVX512DQ-NEXT: vmovdqa64 %ymm0, %ymm1			; X64-AVX512DQ-NEXT: vmovdqa %ymm0, %ymm1
	; X64-AVX512DQ-NEXT: retq			; X64-AVX512DQ-NEXT: retq
	%1 = load <16 x i8>, <16 x i8> *%p			%1 = load <16 x i8>, <16 x i8> *%p
	%2 = shufflevector <16 x i8> %1, <16 x i8> undef, <64 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>			%2 = shufflevector <16 x i8> %1, <16 x i8> undef, <64 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
	ret <64 x i8> %2			ret <64 x i8> %2
	}			}

	define <64 x i8> @test_broadcast_32i8_64i8(<32 x i8> *%p) nounwind {			define <64 x i8> @test_broadcast_32i8_64i8(<32 x i8> *%p) nounwind {
	; X32-AVX-LABEL: test_broadcast_32i8_64i8:			; X32-AVX-LABEL: test_broadcast_32i8_64i8:
	[... 133 lines not shown ...]
	; X32-AVX-NEXT: vmovaps %xmm0, (%eax)			; X32-AVX-NEXT: vmovaps %xmm0, (%eax)
	; X32-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0			; X32-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX-NEXT: retl			; X32-AVX-NEXT: retl
	;			;
	; X32-AVX512F-LABEL: test_broadcast_2i64_4i64_reuse:			; X32-AVX512F-LABEL: test_broadcast_2i64_4i64_reuse:
	; X32-AVX512F: ## BB#0:			; X32-AVX512F: ## BB#0:
	; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512F-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512F-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512F-NEXT: vmovdqa64 %xmm0, (%eax)			; X32-AVX512F-NEXT: vmovdqa %xmm0, (%eax)
	; X32-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512F-NEXT: retl			; X32-AVX512F-NEXT: retl
	;			;
	; X32-AVX512BW-LABEL: test_broadcast_2i64_4i64_reuse:			; X32-AVX512BW-LABEL: test_broadcast_2i64_4i64_reuse:
	; X32-AVX512BW: ## BB#0:			; X32-AVX512BW: ## BB#0:
	; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512BW-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512BW-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512BW-NEXT: vmovdqa64 %xmm0, (%eax)			; X32-AVX512BW-NEXT: vmovdqa %xmm0, (%eax)
	; X32-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512BW-NEXT: retl			; X32-AVX512BW-NEXT: retl
	;			;
	; X32-AVX512DQ-LABEL: test_broadcast_2i64_4i64_reuse:			; X32-AVX512DQ-LABEL: test_broadcast_2i64_4i64_reuse:
	; X32-AVX512DQ: ## BB#0:			; X32-AVX512DQ: ## BB#0:
	; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512DQ-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512DQ-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512DQ-NEXT: vmovdqa64 %xmm0, (%eax)			; X32-AVX512DQ-NEXT: vmovdqa %xmm0, (%eax)
	; X32-AVX512DQ-NEXT: vinserti64x2 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512DQ-NEXT: vinserti64x2 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512DQ-NEXT: retl			; X32-AVX512DQ-NEXT: retl
	;			;
	; X64-AVX-LABEL: test_broadcast_2i64_4i64_reuse:			; X64-AVX-LABEL: test_broadcast_2i64_4i64_reuse:
	; X64-AVX: ## BB#0:			; X64-AVX: ## BB#0:
	; X64-AVX-NEXT: vmovaps (%rdi), %xmm0			; X64-AVX-NEXT: vmovaps (%rdi), %xmm0
	; X64-AVX-NEXT: vmovaps %xmm0, (%rsi)			; X64-AVX-NEXT: vmovaps %xmm0, (%rsi)
	; X64-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0			; X64-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX-NEXT: retq			; X64-AVX-NEXT: retq
	;			;
	; X64-AVX512F-LABEL: test_broadcast_2i64_4i64_reuse:			; X64-AVX512F-LABEL: test_broadcast_2i64_4i64_reuse:
	; X64-AVX512F: ## BB#0:			; X64-AVX512F: ## BB#0:
	; X64-AVX512F-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512F-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512F-NEXT: vmovdqa64 %xmm0, (%rsi)			; X64-AVX512F-NEXT: vmovdqa %xmm0, (%rsi)
	; X64-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512F-NEXT: retq			; X64-AVX512F-NEXT: retq
	;			;
	; X64-AVX512BW-LABEL: test_broadcast_2i64_4i64_reuse:			; X64-AVX512BW-LABEL: test_broadcast_2i64_4i64_reuse:
	; X64-AVX512BW: ## BB#0:			; X64-AVX512BW: ## BB#0:
	; X64-AVX512BW-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512BW-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512BW-NEXT: vmovdqa64 %xmm0, (%rsi)			; X64-AVX512BW-NEXT: vmovdqa %xmm0, (%rsi)
	; X64-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512BW-NEXT: retq			; X64-AVX512BW-NEXT: retq
	;			;
	; X64-AVX512DQ-LABEL: test_broadcast_2i64_4i64_reuse:			; X64-AVX512DQ-LABEL: test_broadcast_2i64_4i64_reuse:
	; X64-AVX512DQ: ## BB#0:			; X64-AVX512DQ: ## BB#0:
	; X64-AVX512DQ-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512DQ-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512DQ-NEXT: vmovdqa64 %xmm0, (%rsi)			; X64-AVX512DQ-NEXT: vmovdqa %xmm0, (%rsi)
	; X64-AVX512DQ-NEXT: vinserti64x2 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512DQ-NEXT: vinserti64x2 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512DQ-NEXT: retq			; X64-AVX512DQ-NEXT: retq
	%1 = load <2 x i64>, <2 x i64>* %p0			%1 = load <2 x i64>, <2 x i64>* %p0
	store <2 x i64> %1, <2 x i64>* %p1			store <2 x i64> %1, <2 x i64>* %p1
	%2 = shufflevector <2 x i64> %1, <2 x i64> undef, <4 x i32> <i32 0, i32 1, i32 0, i32 1>			%2 = shufflevector <2 x i64> %1, <2 x i64> undef, <4 x i32> <i32 0, i32 1, i32 0, i32 1>
	ret <4 x i64> %2			ret <4 x i64> %2
	}			}

	[... 44 lines not shown ...]
	; X32-AVX-NEXT: vmovaps %xmm0, (%eax)			; X32-AVX-NEXT: vmovaps %xmm0, (%eax)
	; X32-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0			; X32-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX-NEXT: retl			; X32-AVX-NEXT: retl
	;			;
	; X32-AVX512-LABEL: test_broadcast_4i32_8i32_reuse:			; X32-AVX512-LABEL: test_broadcast_4i32_8i32_reuse:
	; X32-AVX512: ## BB#0:			; X32-AVX512: ## BB#0:
	; X32-AVX512-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512-NEXT: vmovdqa32 (%ecx), %xmm0			; X32-AVX512-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512-NEXT: vmovdqa32 %xmm0, (%eax)			; X32-AVX512-NEXT: vmovdqa %xmm0, (%eax)
	; X32-AVX512-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512-NEXT: retl			; X32-AVX512-NEXT: retl
	;			;
	; X64-AVX-LABEL: test_broadcast_4i32_8i32_reuse:			; X64-AVX-LABEL: test_broadcast_4i32_8i32_reuse:
	; X64-AVX: ## BB#0:			; X64-AVX: ## BB#0:
	; X64-AVX-NEXT: vmovaps (%rdi), %xmm0			; X64-AVX-NEXT: vmovaps (%rdi), %xmm0
	; X64-AVX-NEXT: vmovaps %xmm0, (%rsi)			; X64-AVX-NEXT: vmovaps %xmm0, (%rsi)
	; X64-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0			; X64-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX-NEXT: retq			; X64-AVX-NEXT: retq
	;			;
	; X64-AVX512-LABEL: test_broadcast_4i32_8i32_reuse:			; X64-AVX512-LABEL: test_broadcast_4i32_8i32_reuse:
	; X64-AVX512: ## BB#0:			; X64-AVX512: ## BB#0:
	; X64-AVX512-NEXT: vmovdqa32 (%rdi), %xmm0			; X64-AVX512-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512-NEXT: vmovdqa32 %xmm0, (%rsi)			; X64-AVX512-NEXT: vmovdqa %xmm0, (%rsi)
	; X64-AVX512-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512-NEXT: retq			; X64-AVX512-NEXT: retq
	%1 = load <4 x i32>, <4 x i32>* %p0			%1 = load <4 x i32>, <4 x i32>* %p0
	store <4 x i32> %1, <4 x i32>* %p1			store <4 x i32> %1, <4 x i32>* %p1
	%2 = shufflevector <4 x i32> %1, <4 x i32> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>			%2 = shufflevector <4 x i32> %1, <4 x i32> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
	ret <8 x i32> %2			ret <8 x i32> %2
	}			}

	define <16 x i16> @test_broadcast_8i16_16i16_reuse(<8 x i16> *%p0, <8 x i16> *%p1) nounwind {			define <16 x i16> @test_broadcast_8i16_16i16_reuse(<8 x i16> *%p0, <8 x i16> *%p1) nounwind {
	; X32-AVX-LABEL: test_broadcast_8i16_16i16_reuse:			; X32-AVX-LABEL: test_broadcast_8i16_16i16_reuse:
	; X32-AVX: ## BB#0:			; X32-AVX: ## BB#0:
	; X32-AVX-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX-NEXT: vmovaps (%ecx), %xmm0			; X32-AVX-NEXT: vmovaps (%ecx), %xmm0
	; X32-AVX-NEXT: vmovaps %xmm0, (%eax)			; X32-AVX-NEXT: vmovaps %xmm0, (%eax)
	; X32-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0			; X32-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX-NEXT: retl			; X32-AVX-NEXT: retl
	;			;
	; X32-AVX512F-LABEL: test_broadcast_8i16_16i16_reuse:			; X32-AVX512F-LABEL: test_broadcast_8i16_16i16_reuse:
	; X32-AVX512F: ## BB#0:			; X32-AVX512F: ## BB#0:
	; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512F-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512F-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512F-NEXT: vmovdqa32 %xmm0, (%eax)			; X32-AVX512F-NEXT: vmovdqa %xmm0, (%eax)
	; X32-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512F-NEXT: retl			; X32-AVX512F-NEXT: retl
	;			;
	; X32-AVX512BW-LABEL: test_broadcast_8i16_16i16_reuse:			; X32-AVX512BW-LABEL: test_broadcast_8i16_16i16_reuse:
	; X32-AVX512BW: ## BB#0:			; X32-AVX512BW: ## BB#0:
	; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512BW-NEXT: vmovdqu16 (%ecx), %xmm0			; X32-AVX512BW-NEXT: vmovdqu (%ecx), %xmm0
	; X32-AVX512BW-NEXT: vmovdqu16 %xmm0, (%eax)			; X32-AVX512BW-NEXT: vmovdqu %xmm0, (%eax)
	; X32-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512BW-NEXT: retl			; X32-AVX512BW-NEXT: retl
	;			;
	; X32-AVX512DQ-LABEL: test_broadcast_8i16_16i16_reuse:			; X32-AVX512DQ-LABEL: test_broadcast_8i16_16i16_reuse:
	; X32-AVX512DQ: ## BB#0:			; X32-AVX512DQ: ## BB#0:
	; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512DQ-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512DQ-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512DQ-NEXT: vmovdqa32 %xmm0, (%eax)			; X32-AVX512DQ-NEXT: vmovdqa %xmm0, (%eax)
	; X32-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512DQ-NEXT: retl			; X32-AVX512DQ-NEXT: retl
	;			;
	; X64-AVX-LABEL: test_broadcast_8i16_16i16_reuse:			; X64-AVX-LABEL: test_broadcast_8i16_16i16_reuse:
	; X64-AVX: ## BB#0:			; X64-AVX: ## BB#0:
	; X64-AVX-NEXT: vmovaps (%rdi), %xmm0			; X64-AVX-NEXT: vmovaps (%rdi), %xmm0
	; X64-AVX-NEXT: vmovaps %xmm0, (%rsi)			; X64-AVX-NEXT: vmovaps %xmm0, (%rsi)
	; X64-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0			; X64-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX-NEXT: retq			; X64-AVX-NEXT: retq
	;			;
	; X64-AVX512F-LABEL: test_broadcast_8i16_16i16_reuse:			; X64-AVX512F-LABEL: test_broadcast_8i16_16i16_reuse:
	; X64-AVX512F: ## BB#0:			; X64-AVX512F: ## BB#0:
	; X64-AVX512F-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512F-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512F-NEXT: vmovdqa32 %xmm0, (%rsi)			; X64-AVX512F-NEXT: vmovdqa %xmm0, (%rsi)
	; X64-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512F-NEXT: retq			; X64-AVX512F-NEXT: retq
	;			;
	; X64-AVX512BW-LABEL: test_broadcast_8i16_16i16_reuse:			; X64-AVX512BW-LABEL: test_broadcast_8i16_16i16_reuse:
	; X64-AVX512BW: ## BB#0:			; X64-AVX512BW: ## BB#0:
	; X64-AVX512BW-NEXT: vmovdqu16 (%rdi), %xmm0			; X64-AVX512BW-NEXT: vmovdqu (%rdi), %xmm0
	; X64-AVX512BW-NEXT: vmovdqu16 %xmm0, (%rsi)			; X64-AVX512BW-NEXT: vmovdqu %xmm0, (%rsi)
	; X64-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512BW-NEXT: retq			; X64-AVX512BW-NEXT: retq
	;			;
	; X64-AVX512DQ-LABEL: test_broadcast_8i16_16i16_reuse:			; X64-AVX512DQ-LABEL: test_broadcast_8i16_16i16_reuse:
	; X64-AVX512DQ: ## BB#0:			; X64-AVX512DQ: ## BB#0:
	; X64-AVX512DQ-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512DQ-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512DQ-NEXT: vmovdqa32 %xmm0, (%rsi)			; X64-AVX512DQ-NEXT: vmovdqa %xmm0, (%rsi)
	; X64-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512DQ-NEXT: retq			; X64-AVX512DQ-NEXT: retq
	%1 = load <8 x i16>, <8 x i16> *%p0			%1 = load <8 x i16>, <8 x i16> *%p0
	store <8 x i16> %1, <8 x i16>* %p1			store <8 x i16> %1, <8 x i16>* %p1
	%2 = shufflevector <8 x i16> %1, <8 x i16> undef, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>			%2 = shufflevector <8 x i16> %1, <8 x i16> undef, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
	ret <16 x i16> %2			ret <16 x i16> %2
	}			}

	define <32 x i8> @test_broadcast_16i8_32i8_reuse(<16 x i8> *%p0, <16 x i8> *%p1) nounwind {			define <32 x i8> @test_broadcast_16i8_32i8_reuse(<16 x i8> *%p0, <16 x i8> *%p1) nounwind {
	; X32-AVX-LABEL: test_broadcast_16i8_32i8_reuse:			; X32-AVX-LABEL: test_broadcast_16i8_32i8_reuse:
	; X32-AVX: ## BB#0:			; X32-AVX: ## BB#0:
	; X32-AVX-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX-NEXT: vmovaps (%ecx), %xmm0			; X32-AVX-NEXT: vmovaps (%ecx), %xmm0
	; X32-AVX-NEXT: vmovaps %xmm0, (%eax)			; X32-AVX-NEXT: vmovaps %xmm0, (%eax)
	; X32-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0			; X32-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX-NEXT: retl			; X32-AVX-NEXT: retl
	;			;
	; X32-AVX512F-LABEL: test_broadcast_16i8_32i8_reuse:			; X32-AVX512F-LABEL: test_broadcast_16i8_32i8_reuse:
	; X32-AVX512F: ## BB#0:			; X32-AVX512F: ## BB#0:
	; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512F-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512F-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512F-NEXT: vmovdqa32 %xmm0, (%eax)			; X32-AVX512F-NEXT: vmovdqa %xmm0, (%eax)
	; X32-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512F-NEXT: retl			; X32-AVX512F-NEXT: retl
	;			;
	; X32-AVX512BW-LABEL: test_broadcast_16i8_32i8_reuse:			; X32-AVX512BW-LABEL: test_broadcast_16i8_32i8_reuse:
	; X32-AVX512BW: ## BB#0:			; X32-AVX512BW: ## BB#0:
	; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512BW-NEXT: vmovdqu8 (%ecx), %xmm0			; X32-AVX512BW-NEXT: vmovdqu (%ecx), %xmm0
	; X32-AVX512BW-NEXT: vmovdqu8 %xmm0, (%eax)			; X32-AVX512BW-NEXT: vmovdqu %xmm0, (%eax)
	; X32-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512BW-NEXT: retl			; X32-AVX512BW-NEXT: retl
	;			;
	; X32-AVX512DQ-LABEL: test_broadcast_16i8_32i8_reuse:			; X32-AVX512DQ-LABEL: test_broadcast_16i8_32i8_reuse:
	; X32-AVX512DQ: ## BB#0:			; X32-AVX512DQ: ## BB#0:
	; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512DQ-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512DQ-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512DQ-NEXT: vmovdqa32 %xmm0, (%eax)			; X32-AVX512DQ-NEXT: vmovdqa %xmm0, (%eax)
	; X32-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512DQ-NEXT: retl			; X32-AVX512DQ-NEXT: retl
	;			;
	; X64-AVX-LABEL: test_broadcast_16i8_32i8_reuse:			; X64-AVX-LABEL: test_broadcast_16i8_32i8_reuse:
	; X64-AVX: ## BB#0:			; X64-AVX: ## BB#0:
	; X64-AVX-NEXT: vmovaps (%rdi), %xmm0			; X64-AVX-NEXT: vmovaps (%rdi), %xmm0
	; X64-AVX-NEXT: vmovaps %xmm0, (%rsi)			; X64-AVX-NEXT: vmovaps %xmm0, (%rsi)
	; X64-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0			; X64-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX-NEXT: retq			; X64-AVX-NEXT: retq
	;			;
	; X64-AVX512F-LABEL: test_broadcast_16i8_32i8_reuse:			; X64-AVX512F-LABEL: test_broadcast_16i8_32i8_reuse:
	; X64-AVX512F: ## BB#0:			; X64-AVX512F: ## BB#0:
	; X64-AVX512F-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512F-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512F-NEXT: vmovdqa32 %xmm0, (%rsi)			; X64-AVX512F-NEXT: vmovdqa %xmm0, (%rsi)
	; X64-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512F-NEXT: retq			; X64-AVX512F-NEXT: retq
	;			;
	; X64-AVX512BW-LABEL: test_broadcast_16i8_32i8_reuse:			; X64-AVX512BW-LABEL: test_broadcast_16i8_32i8_reuse:
	; X64-AVX512BW: ## BB#0:			; X64-AVX512BW: ## BB#0:
	; X64-AVX512BW-NEXT: vmovdqu8 (%rdi), %xmm0			; X64-AVX512BW-NEXT: vmovdqu (%rdi), %xmm0
	; X64-AVX512BW-NEXT: vmovdqu8 %xmm0, (%rsi)			; X64-AVX512BW-NEXT: vmovdqu %xmm0, (%rsi)
	; X64-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512BW-NEXT: retq			; X64-AVX512BW-NEXT: retq
	;			;
	; X64-AVX512DQ-LABEL: test_broadcast_16i8_32i8_reuse:			; X64-AVX512DQ-LABEL: test_broadcast_16i8_32i8_reuse:
	; X64-AVX512DQ: ## BB#0:			; X64-AVX512DQ: ## BB#0:
	; X64-AVX512DQ-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512DQ-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512DQ-NEXT: vmovdqa32 %xmm0, (%rsi)			; X64-AVX512DQ-NEXT: vmovdqa %xmm0, (%rsi)
	; X64-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512DQ-NEXT: retq			; X64-AVX512DQ-NEXT: retq
	%1 = load <16 x i8>, <16 x i8> *%p0			%1 = load <16 x i8>, <16 x i8> *%p0
	store <16 x i8> %1, <16 x i8>* %p1			store <16 x i8> %1, <16 x i8>* %p1
	%2 = shufflevector <16 x i8> %1, <16 x i8> undef, <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>			%2 = shufflevector <16 x i8> %1, <16 x i8> undef, <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
	ret <32 x i8> %2			ret <32 x i8> %2
	}			}

	; X32-AVX-NEXT: vmovaps %xmm1, (%eax)			; X32-AVX-NEXT: vmovaps %xmm1, (%eax)
	; X32-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0			; X32-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX-NEXT: retl			; X32-AVX-NEXT: retl
	;			;
	; X32-AVX512F-LABEL: test_broadcast_4i32_8i32_chain:			; X32-AVX512F-LABEL: test_broadcast_4i32_8i32_chain:
	; X32-AVX512F: ## BB#0:			; X32-AVX512F: ## BB#0:
	; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512F-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512F-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512F-NEXT: vpxord %xmm1, %xmm1, %xmm1			; X32-AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; X32-AVX512F-NEXT: vmovdqa32 %xmm1, (%eax)			; X32-AVX512F-NEXT: vmovdqa %xmm1, (%eax)
	; X32-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512F-NEXT: retl			; X32-AVX512F-NEXT: retl
	;			;
	; X32-AVX512BW-LABEL: test_broadcast_4i32_8i32_chain:			; X32-AVX512BW-LABEL: test_broadcast_4i32_8i32_chain:
	; X32-AVX512BW: ## BB#0:			; X32-AVX512BW: ## BB#0:
	; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512BW-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512BW-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512BW-NEXT: vpxord %xmm1, %xmm1, %xmm1			; X32-AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; X32-AVX512BW-NEXT: vmovdqa32 %xmm1, (%eax)			; X32-AVX512BW-NEXT: vmovdqa %xmm1, (%eax)
	; X32-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512BW-NEXT: retl			; X32-AVX512BW-NEXT: retl
	;			;
	; X32-AVX512DQ-LABEL: test_broadcast_4i32_8i32_chain:			; X32-AVX512DQ-LABEL: test_broadcast_4i32_8i32_chain:
	; X32-AVX512DQ: ## BB#0:			; X32-AVX512DQ: ## BB#0:
	; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512DQ-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512DQ-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512DQ-NEXT: vxorps %xmm1, %xmm1, %xmm1			; X32-AVX512DQ-NEXT: vxorps %xmm1, %xmm1, %xmm1
	; X32-AVX512DQ-NEXT: vmovaps %xmm1, (%eax)			; X32-AVX512DQ-NEXT: vmovaps %xmm1, (%eax)
	; X32-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X32-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX512DQ-NEXT: retl			; X32-AVX512DQ-NEXT: retl
	;			;
	; X64-AVX-LABEL: test_broadcast_4i32_8i32_chain:			; X64-AVX-LABEL: test_broadcast_4i32_8i32_chain:
	; X64-AVX: ## BB#0:			; X64-AVX: ## BB#0:
	; X64-AVX-NEXT: vmovaps (%rdi), %xmm0			; X64-AVX-NEXT: vmovaps (%rdi), %xmm0
	; X64-AVX-NEXT: vxorps %xmm1, %xmm1, %xmm1			; X64-AVX-NEXT: vxorps %xmm1, %xmm1, %xmm1
	; X64-AVX-NEXT: vmovaps %xmm1, (%rsi)			; X64-AVX-NEXT: vmovaps %xmm1, (%rsi)
	; X64-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0			; X64-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX-NEXT: retq			; X64-AVX-NEXT: retq
	;			;
	; X64-AVX512F-LABEL: test_broadcast_4i32_8i32_chain:			; X64-AVX512F-LABEL: test_broadcast_4i32_8i32_chain:
	; X64-AVX512F: ## BB#0:			; X64-AVX512F: ## BB#0:
	; X64-AVX512F-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512F-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512F-NEXT: vpxord %xmm1, %xmm1, %xmm1			; X64-AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; X64-AVX512F-NEXT: vmovdqa32 %xmm1, (%rsi)			; X64-AVX512F-NEXT: vmovdqa %xmm1, (%rsi)
	; X64-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512F-NEXT: retq			; X64-AVX512F-NEXT: retq
	;			;
	; X64-AVX512BW-LABEL: test_broadcast_4i32_8i32_chain:			; X64-AVX512BW-LABEL: test_broadcast_4i32_8i32_chain:
	; X64-AVX512BW: ## BB#0:			; X64-AVX512BW: ## BB#0:
	; X64-AVX512BW-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512BW-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512BW-NEXT: vpxord %xmm1, %xmm1, %xmm1			; X64-AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; X64-AVX512BW-NEXT: vmovdqa32 %xmm1, (%rsi)			; X64-AVX512BW-NEXT: vmovdqa %xmm1, (%rsi)
	; X64-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512BW-NEXT: retq			; X64-AVX512BW-NEXT: retq
	;			;
	; X64-AVX512DQ-LABEL: test_broadcast_4i32_8i32_chain:			; X64-AVX512DQ-LABEL: test_broadcast_4i32_8i32_chain:
	; X64-AVX512DQ: ## BB#0:			; X64-AVX512DQ: ## BB#0:
	; X64-AVX512DQ-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512DQ-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512DQ-NEXT: vxorps %xmm1, %xmm1, %xmm1			; X64-AVX512DQ-NEXT: vxorps %xmm1, %xmm1, %xmm1
	; X64-AVX512DQ-NEXT: vmovaps %xmm1, (%rsi)			; X64-AVX512DQ-NEXT: vmovaps %xmm1, (%rsi)
	; X64-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0			; X64-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX512DQ-NEXT: retq			; X64-AVX512DQ-NEXT: retq
	%1 = load <4 x i32>, <4 x i32>* %p0			%1 = load <4 x i32>, <4 x i32>* %p0
	store <4 x float> zeroinitializer, <4 x float>* %p1			store <4 x float> zeroinitializer, <4 x float>* %p1
	%2 = shufflevector <4 x i32> %1, <4 x i32> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>			%2 = shufflevector <4 x i32> %1, <4 x i32> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
	ret <8 x i32> %2			ret <8 x i32> %2
	; X32-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0			; X32-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
	; X32-AVX-NEXT: vmovaps %ymm0, %ymm1			; X32-AVX-NEXT: vmovaps %ymm0, %ymm1
	; X32-AVX-NEXT: retl			; X32-AVX-NEXT: retl
	;			;
	; X32-AVX512F-LABEL: test_broadcast_4i32_16i32_chain:			; X32-AVX512F-LABEL: test_broadcast_4i32_16i32_chain:
	; X32-AVX512F: ## BB#0:			; X32-AVX512F: ## BB#0:
	; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512F-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512F-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512F-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512F-NEXT: vpxord %xmm1, %xmm1, %xmm1			; X32-AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; X32-AVX512F-NEXT: vmovdqa32 %xmm1, (%eax)			; X32-AVX512F-NEXT: vmovdqa %xmm1, (%eax)
	; X32-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %zmm0, %zmm0			; X32-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %zmm0, %zmm0
	; X32-AVX512F-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0			; X32-AVX512F-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0
	; X32-AVX512F-NEXT: retl			; X32-AVX512F-NEXT: retl
	;			;
	; X32-AVX512BW-LABEL: test_broadcast_4i32_16i32_chain:			; X32-AVX512BW-LABEL: test_broadcast_4i32_16i32_chain:
	; X32-AVX512BW: ## BB#0:			; X32-AVX512BW: ## BB#0:
	; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512BW-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512BW-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512BW-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512BW-NEXT: vpxord %xmm1, %xmm1, %xmm1			; X32-AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; X32-AVX512BW-NEXT: vmovdqa32 %xmm1, (%eax)			; X32-AVX512BW-NEXT: vmovdqa %xmm1, (%eax)
	; X32-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %zmm0, %zmm0			; X32-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %zmm0, %zmm0
	; X32-AVX512BW-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0			; X32-AVX512BW-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0
	; X32-AVX512BW-NEXT: retl			; X32-AVX512BW-NEXT: retl
	;			;
	; X32-AVX512DQ-LABEL: test_broadcast_4i32_16i32_chain:			; X32-AVX512DQ-LABEL: test_broadcast_4i32_16i32_chain:
	; X32-AVX512DQ: ## BB#0:			; X32-AVX512DQ: ## BB#0:
	; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-AVX512DQ-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-AVX512DQ-NEXT: vmovdqa64 (%ecx), %xmm0			; X32-AVX512DQ-NEXT: vmovdqa (%ecx), %xmm0
	; X32-AVX512DQ-NEXT: vxorps %xmm1, %xmm1, %xmm1			; X32-AVX512DQ-NEXT: vxorps %xmm1, %xmm1, %xmm1
	; X32-AVX512DQ-NEXT: vmovaps %xmm1, (%eax)			; X32-AVX512DQ-NEXT: vmovaps %xmm1, (%eax)
	; X32-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %zmm0, %zmm0			; X32-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %zmm0, %zmm0
	; X32-AVX512DQ-NEXT: vinserti32x8 $1, %ymm0, %zmm0, %zmm0			; X32-AVX512DQ-NEXT: vinserti32x8 $1, %ymm0, %zmm0, %zmm0
	; X32-AVX512DQ-NEXT: retl			; X32-AVX512DQ-NEXT: retl
	;			;
	; X64-AVX-LABEL: test_broadcast_4i32_16i32_chain:			; X64-AVX-LABEL: test_broadcast_4i32_16i32_chain:
	; X64-AVX: ## BB#0:			; X64-AVX: ## BB#0:
	; X64-AVX-NEXT: vmovaps (%rdi), %xmm0			; X64-AVX-NEXT: vmovaps (%rdi), %xmm0
	; X64-AVX-NEXT: vxorps %xmm1, %xmm1, %xmm1			; X64-AVX-NEXT: vxorps %xmm1, %xmm1, %xmm1
	; X64-AVX-NEXT: vmovaps %xmm1, (%rsi)			; X64-AVX-NEXT: vmovaps %xmm1, (%rsi)
	; X64-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0			; X64-AVX-NEXT: vinsertf128 $1, %xmm0, %ymm0, %ymm0
	; X64-AVX-NEXT: vmovaps %ymm0, %ymm1			; X64-AVX-NEXT: vmovaps %ymm0, %ymm1
	; X64-AVX-NEXT: retq			; X64-AVX-NEXT: retq
	;			;
	; X64-AVX512F-LABEL: test_broadcast_4i32_16i32_chain:			; X64-AVX512F-LABEL: test_broadcast_4i32_16i32_chain:
	; X64-AVX512F: ## BB#0:			; X64-AVX512F: ## BB#0:
	; X64-AVX512F-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512F-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512F-NEXT: vpxord %xmm1, %xmm1, %xmm1			; X64-AVX512F-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; X64-AVX512F-NEXT: vmovdqa32 %xmm1, (%rsi)			; X64-AVX512F-NEXT: vmovdqa %xmm1, (%rsi)
	; X64-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %zmm0, %zmm0			; X64-AVX512F-NEXT: vinserti32x4 $1, %xmm0, %zmm0, %zmm0
	; X64-AVX512F-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0			; X64-AVX512F-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0
	; X64-AVX512F-NEXT: retq			; X64-AVX512F-NEXT: retq
	;			;
	; X64-AVX512BW-LABEL: test_broadcast_4i32_16i32_chain:			; X64-AVX512BW-LABEL: test_broadcast_4i32_16i32_chain:
	; X64-AVX512BW: ## BB#0:			; X64-AVX512BW: ## BB#0:
	; X64-AVX512BW-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512BW-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512BW-NEXT: vpxord %xmm1, %xmm1, %xmm1			; X64-AVX512BW-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; X64-AVX512BW-NEXT: vmovdqa32 %xmm1, (%rsi)			; X64-AVX512BW-NEXT: vmovdqa %xmm1, (%rsi)
	; X64-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %zmm0, %zmm0			; X64-AVX512BW-NEXT: vinserti32x4 $1, %xmm0, %zmm0, %zmm0
	; X64-AVX512BW-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0			; X64-AVX512BW-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0
	; X64-AVX512BW-NEXT: retq			; X64-AVX512BW-NEXT: retq
	;			;
	; X64-AVX512DQ-LABEL: test_broadcast_4i32_16i32_chain:			; X64-AVX512DQ-LABEL: test_broadcast_4i32_16i32_chain:
	; X64-AVX512DQ: ## BB#0:			; X64-AVX512DQ: ## BB#0:
	; X64-AVX512DQ-NEXT: vmovdqa64 (%rdi), %xmm0			; X64-AVX512DQ-NEXT: vmovdqa (%rdi), %xmm0
	; X64-AVX512DQ-NEXT: vxorps %xmm1, %xmm1, %xmm1			; X64-AVX512DQ-NEXT: vxorps %xmm1, %xmm1, %xmm1
	; X64-AVX512DQ-NEXT: vmovaps %xmm1, (%rsi)			; X64-AVX512DQ-NEXT: vmovaps %xmm1, (%rsi)
	; X64-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %zmm0, %zmm0			; X64-AVX512DQ-NEXT: vinserti32x4 $1, %xmm0, %zmm0, %zmm0
	; X64-AVX512DQ-NEXT: vinserti32x8 $1, %ymm0, %zmm0, %zmm0			; X64-AVX512DQ-NEXT: vinserti32x8 $1, %ymm0, %zmm0, %zmm0
	; X64-AVX512DQ-NEXT: retq			; X64-AVX512DQ-NEXT: retq
	%1 = load <4 x i32>, <4 x i32>* %p0			%1 = load <4 x i32>, <4 x i32>* %p0
	store <4 x float> zeroinitializer, <4 x float>* %p1			store <4 x float> zeroinitializer, <4 x float>* %p1
	%2 = shufflevector <4 x i32> %1, <4 x i32> undef, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>			%2 = shufflevector <4 x i32> %1, <4 x i32> undef, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>

	define void @fallback_broadcast_v4i64_to_v8i64(<4 x i64> %a, <8 x i64> %b) {			define void @fallback_broadcast_v4i64_to_v8i64(<4 x i64> %a, <8 x i64> %b) {
	; X32-AVX512-LABEL: fallback_broadcast_v4i64_to_v8i64:			; X32-AVX512-LABEL: fallback_broadcast_v4i64_to_v8i64:
	; X32-AVX512: ## BB#0: ## %entry			; X32-AVX512: ## BB#0: ## %entry
	; X32-AVX512-NEXT: vpaddq LCPI26_0, %ymm0, %ymm0			; X32-AVX512-NEXT: vpaddq LCPI26_0, %ymm0, %ymm0
	; X32-AVX512-NEXT: vmovdqa64 {{.*#+}} zmm2 = [1,0,2,0,3,0,4,0,1,0,2,0,3,0,4,0]			; X32-AVX512-NEXT: vmovdqa64 {{.*#+}} zmm2 = [1,0,2,0,3,0,4,0,1,0,2,0,3,0,4,0]
	; X32-AVX512-NEXT: vpaddq %zmm2, %zmm1, %zmm1			; X32-AVX512-NEXT: vpaddq %zmm2, %zmm1, %zmm1
	; X32-AVX512-NEXT: vpandq %zmm2, %zmm1, %zmm1			; X32-AVX512-NEXT: vpandq %zmm2, %zmm1, %zmm1
	; X32-AVX512-NEXT: vmovdqu64 %ymm0, _ga4			; X32-AVX512-NEXT: vmovdqu %ymm0, _ga4
	; X32-AVX512-NEXT: vmovdqu64 %zmm1, _gb4			; X32-AVX512-NEXT: vmovdqu64 %zmm1, _gb4
	; X32-AVX512-NEXT: retl			; X32-AVX512-NEXT: retl
	;			;
	; X64-AVX512-LABEL: fallback_broadcast_v4i64_to_v8i64:			; X64-AVX512-LABEL: fallback_broadcast_v4i64_to_v8i64:
	; X64-AVX512: ## BB#0: ## %entry			; X64-AVX512: ## BB#0: ## %entry
	; X64-AVX512-NEXT: vmovdqa64 {{.*#+}} ymm2 = [1,2,3,4]			; X64-AVX512-NEXT: vmovdqa {{.*#+}} ymm2 = [1,2,3,4]
	; X64-AVX512-NEXT: vpaddq %ymm2, %ymm0, %ymm0			; X64-AVX512-NEXT: vpaddq %ymm2, %ymm0, %ymm0
	; X64-AVX512-NEXT: vinserti64x4 $1, %ymm2, %zmm2, %zmm2			; X64-AVX512-NEXT: vinserti64x4 $1, %ymm2, %zmm2, %zmm2
	; X64-AVX512-NEXT: vpaddq %zmm2, %zmm1, %zmm1			; X64-AVX512-NEXT: vpaddq %zmm2, %zmm1, %zmm1
	; X64-AVX512-NEXT: vpandq %zmm2, %zmm1, %zmm1			; X64-AVX512-NEXT: vpandq %zmm2, %zmm1, %zmm1
	; X64-AVX512-NEXT: vmovdqu64 %ymm0, {{.*}}(%rip)			; X64-AVX512-NEXT: vmovdqu %ymm0, {{.*}}(%rip)
	; X64-AVX512-NEXT: vmovdqu64 %zmm1, {{.*}}(%rip)			; X64-AVX512-NEXT: vmovdqu64 %zmm1, {{.*}}(%rip)
	; X64-AVX512-NEXT: retq			; X64-AVX512-NEXT: retq
	entry:			entry:
	%0 = add <4 x i64> %a, <i64 1, i64 2, i64 3, i64 4>			%0 = add <4 x i64> %a, <i64 1, i64 2, i64 3, i64 4>
	%1 = add <8 x i64> %b, <i64 1, i64 2, i64 3, i64 4, i64 1, i64 2, i64 3, i64 4>			%1 = add <8 x i64> %b, <i64 1, i64 2, i64 3, i64 4, i64 1, i64 2, i64 3, i64 4>
	%2 = and <8 x i64> %1, <i64 1, i64 2, i64 3, i64 4, i64 1, i64 2, i64 3, i64 4>			%2 = and <8 x i64> %1, <i64 1, i64 2, i64 3, i64 4, i64 1, i64 2, i64 3, i64 4>
	store <4 x i64> %0, <4 x i64>* @ga4, align 8			store <4 x i64> %0, <4 x i64>* @ga4, align 8
	store <8 x i64> %2, <8 x i64>* @gb4, align 8			store <8 x i64> %2, <8 x i64>* @gb4, align 8

llvm/trunk/test/CodeGen/X86/vec-copysign-avx512.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=x86_64-apple-macosx10.10.0 -mattr=+avx512vl | FileCheck %s --check-prefix=CHECK --check-prefix=AVX512VL			; RUN: llc < %s -mtriple=x86_64-apple-macosx10.10.0 -mattr=+avx512vl | FileCheck %s --check-prefix=CHECK --check-prefix=AVX512VL
	; RUN: llc < %s -mtriple=x86_64-apple-macosx10.10.0 -mattr=+avx512vl,+avx512dq | FileCheck %s --check-prefix=CHECK --check-prefix=AVX512VLDQ			; RUN: llc < %s -mtriple=x86_64-apple-macosx10.10.0 -mattr=+avx512vl,+avx512dq | FileCheck %s --check-prefix=CHECK --check-prefix=AVX512VLDQ

	define <4 x float> @v4f32(<4 x float> %a, <4 x float> %b) nounwind {			define <4 x float> @v4f32(<4 x float> %a, <4 x float> %b) nounwind {
	; AVX512VL-LABEL: v4f32:			; AVX512VL-LABEL: v4f32:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpandd {{.*}}(%rip){1to4}, %xmm1, %xmm1			; AVX512VL-NEXT: vpandd {{.*}}(%rip){1to4}, %xmm1, %xmm1
	; AVX512VL-NEXT: vpandd {{.*}}(%rip){1to4}, %xmm0, %xmm0			; AVX512VL-NEXT: vpandd {{.*}}(%rip){1to4}, %xmm0, %xmm0
	; AVX512VL-NEXT: vporq %xmm1, %xmm0, %xmm0			; AVX512VL-NEXT: vpor %xmm1, %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512VLDQ-LABEL: v4f32:			; AVX512VLDQ-LABEL: v4f32:
	; AVX512VLDQ: ## BB#0:			; AVX512VLDQ: ## BB#0:
	; AVX512VLDQ-NEXT: vandps {{.*}}(%rip){1to4}, %xmm1, %xmm1			; AVX512VLDQ-NEXT: vandps {{.*}}(%rip){1to4}, %xmm1, %xmm1
	; AVX512VLDQ-NEXT: vandps {{.*}}(%rip){1to4}, %xmm0, %xmm0			; AVX512VLDQ-NEXT: vandps {{.*}}(%rip){1to4}, %xmm0, %xmm0
	; AVX512VLDQ-NEXT: vorps %xmm1, %xmm0, %xmm0			; AVX512VLDQ-NEXT: vorps %xmm1, %xmm0, %xmm0
	; AVX512VLDQ-NEXT: retq			; AVX512VLDQ-NEXT: retq
	%tmp = tail call <4 x float> @llvm.copysign.v4f32( <4 x float> %a, <4 x float> %b )			%tmp = tail call <4 x float> @llvm.copysign.v4f32( <4 x float> %a, <4 x float> %b )
	ret <4 x float> %tmp			ret <4 x float> %tmp
	}			}
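
The v4f32 checks above show a pattern that recurs throughout this diff: `vporq` on plain xmm registers becomes `vpor`, while `vpandd` with a `{1to4}` broadcast operand is left unchanged. The following is a rough Python sketch, not code from the patch, of the operand-level conditions involved. It covers the ones named in the summary (registers 0 - 15 only, no zmm registers, no mask k registers) plus embedded broadcast, which the VEX prefix cannot express. The function name `may_compress_to_vex` is hypothetical, and the checks are necessary rather than sufficient: the real pass also requires the opcode to appear in its EVEX-to-VEX table.

```python
import re

def may_compress_to_vex(asm: str) -> bool:
    """Operand-level screen for EVEX -> VEX compression (illustrative only).

    Returns False when an operand uses a feature that VEX cannot encode.
    A True result does not guarantee compression: the opcode must also
    have a VEX counterpart in the pass's lookup table.
    """
    # 512-bit registers exist only under EVEX.
    if re.search(r"%zmm\d+", asm):
        return False
    # Mask registers (k0-k7) are an EVEX-only feature.
    if re.search(r"%k[0-7]", asm):
        return False
    # Embedded broadcast such as {1to4} needs the EVEX broadcast bit.
    if re.search(r"\{1to\d+\}", asm):
        return False
    # VEX can address only xmm0-xmm15 / ymm0-ymm15.
    for match in re.finditer(r"%[xy]mm(\d+)", asm):
        if int(match.group(1)) > 15:
            return False
    return True

print(may_compress_to_vex("vporq %xmm1, %xmm0, %xmm0"))
print(may_compress_to_vex("vpandd (%rip){1to4}, %xmm1, %xmm1"))
```

The screen is deliberately conservative in one direction only. For instance, `vinserti32x4 $1, %xmm0, %ymm0, %ymm0` passes these checks but stays in its EVEX form in the tests in this diff.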

	define <8 x float> @v8f32(<8 x float> %a, <8 x float> %b) nounwind {			define <8 x float> @v8f32(<8 x float> %a, <8 x float> %b) nounwind {
	; AVX512VL-LABEL: v8f32:			; AVX512VL-LABEL: v8f32:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpandd {{.*}}(%rip){1to8}, %ymm1, %ymm1			; AVX512VL-NEXT: vpandd {{.*}}(%rip){1to8}, %ymm1, %ymm1
	; AVX512VL-NEXT: vpandd {{.*}}(%rip){1to8}, %ymm0, %ymm0			; AVX512VL-NEXT: vpandd {{.*}}(%rip){1to8}, %ymm0, %ymm0
	; AVX512VL-NEXT: vporq %ymm1, %ymm0, %ymm0			; AVX512VL-NEXT: vpor %ymm1, %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512VLDQ-LABEL: v8f32:			; AVX512VLDQ-LABEL: v8f32:
	; AVX512VLDQ: ## BB#0:			; AVX512VLDQ: ## BB#0:
	; AVX512VLDQ-NEXT: vandps {{.*}}(%rip){1to8}, %ymm1, %ymm1			; AVX512VLDQ-NEXT: vandps {{.*}}(%rip){1to8}, %ymm1, %ymm1
	; AVX512VLDQ-NEXT: vandps {{.*}}(%rip){1to8}, %ymm0, %ymm0			; AVX512VLDQ-NEXT: vandps {{.*}}(%rip){1to8}, %ymm0, %ymm0
	; AVX512VLDQ-NEXT: vorps %ymm1, %ymm0, %ymm0			; AVX512VLDQ-NEXT: vorps %ymm1, %ymm0, %ymm0
	; AVX512VLDQ-NEXT: retq			; AVX512VLDQ-NEXT: retq
	; AVX512VLDQ-NEXT: retq			; AVX512VLDQ-NEXT: retq
	%tmp = tail call <16 x float> @llvm.copysign.v16f32( <16 x float> %a, <16 x float> %b )			%tmp = tail call <16 x float> @llvm.copysign.v16f32( <16 x float> %a, <16 x float> %b )
	ret <16 x float> %tmp			ret <16 x float> %tmp
	}			}

	define <2 x double> @v2f64(<2 x double> %a, <2 x double> %b) nounwind {			define <2 x double> @v2f64(<2 x double> %a, <2 x double> %b) nounwind {
	; AVX512VL-LABEL: v2f64:			; AVX512VL-LABEL: v2f64:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpandq {{.*}}(%rip), %xmm1, %xmm1			; AVX512VL-NEXT: vpand {{.*}}(%rip), %xmm1, %xmm1
	; AVX512VL-NEXT: vpandq {{.*}}(%rip), %xmm0, %xmm0			; AVX512VL-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm0
	; AVX512VL-NEXT: vporq %xmm1, %xmm0, %xmm0			; AVX512VL-NEXT: vpor %xmm1, %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512VLDQ-LABEL: v2f64:			; AVX512VLDQ-LABEL: v2f64:
	; AVX512VLDQ: ## BB#0:			; AVX512VLDQ: ## BB#0:
	; AVX512VLDQ-NEXT: vandps {{.*}}(%rip), %xmm1, %xmm1			; AVX512VLDQ-NEXT: vandps {{.*}}(%rip), %xmm1, %xmm1
	; AVX512VLDQ-NEXT: vandps {{.*}}(%rip), %xmm0, %xmm0			; AVX512VLDQ-NEXT: vandps {{.*}}(%rip), %xmm0, %xmm0
	; AVX512VLDQ-NEXT: vorps %xmm1, %xmm0, %xmm0			; AVX512VLDQ-NEXT: vorps %xmm1, %xmm0, %xmm0
	; AVX512VLDQ-NEXT: retq			; AVX512VLDQ-NEXT: retq
	%tmp = tail call <2 x double> @llvm.copysign.v2f64( <2 x double> %a, <2 x double> %b )			%tmp = tail call <2 x double> @llvm.copysign.v2f64( <2 x double> %a, <2 x double> %b )
	ret <2 x double> %tmp			ret <2 x double> %tmp
	}			}

	define <4 x double> @v4f64(<4 x double> %a, <4 x double> %b) nounwind {			define <4 x double> @v4f64(<4 x double> %a, <4 x double> %b) nounwind {
	; AVX512VL-LABEL: v4f64:			; AVX512VL-LABEL: v4f64:
	; AVX512VL: ## BB#0:			; AVX512VL: ## BB#0:
	; AVX512VL-NEXT: vpandq {{.*}}(%rip){1to4}, %ymm1, %ymm1			; AVX512VL-NEXT: vpandq {{.*}}(%rip){1to4}, %ymm1, %ymm1
	; AVX512VL-NEXT: vpandq {{.*}}(%rip){1to4}, %ymm0, %ymm0			; AVX512VL-NEXT: vpandq {{.*}}(%rip){1to4}, %ymm0, %ymm0
	; AVX512VL-NEXT: vporq %ymm1, %ymm0, %ymm0			; AVX512VL-NEXT: vpor %ymm1, %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512VLDQ-LABEL: v4f64:			; AVX512VLDQ-LABEL: v4f64:
	; AVX512VLDQ: ## BB#0:			; AVX512VLDQ: ## BB#0:
	; AVX512VLDQ-NEXT: vandpd {{.*}}(%rip){1to4}, %ymm1, %ymm1			; AVX512VLDQ-NEXT: vandpd {{.*}}(%rip){1to4}, %ymm1, %ymm1
	; AVX512VLDQ-NEXT: vandpd {{.*}}(%rip){1to4}, %ymm0, %ymm0			; AVX512VLDQ-NEXT: vandpd {{.*}}(%rip){1to4}, %ymm0, %ymm0
	; AVX512VLDQ-NEXT: vorpd %ymm1, %ymm0, %ymm0			; AVX512VLDQ-NEXT: vorpd %ymm1, %ymm0, %ymm0
	; AVX512VLDQ-NEXT: retq			; AVX512VLDQ-NEXT: retq

llvm/trunk/test/CodeGen/X86/vec_fabs.ll

	define <2 x double> @fabs_v2f64(<2 x double> %p) {			define <2 x double> @fabs_v2f64(<2 x double> %p) {
	; X32_AVX-LABEL: fabs_v2f64:			; X32_AVX-LABEL: fabs_v2f64:
	; X32_AVX: # BB#0:			; X32_AVX: # BB#0:
	; X32_AVX-NEXT: vandps {{\.LCPI.*}}, %xmm0, %xmm0			; X32_AVX-NEXT: vandps {{\.LCPI.*}}, %xmm0, %xmm0
	; X32_AVX-NEXT: retl			; X32_AVX-NEXT: retl
	;			;
	; X32_AVX512VL-LABEL: fabs_v2f64:			; X32_AVX512VL-LABEL: fabs_v2f64:
	; X32_AVX512VL: # BB#0:			; X32_AVX512VL: # BB#0:
	; X32_AVX512VL-NEXT: vpandq {{\.LCPI.*}}, %xmm0, %xmm0			; X32_AVX512VL-NEXT: vpand {{\.LCPI.*}}, %xmm0, %xmm0
	; X32_AVX512VL-NEXT: retl			; X32_AVX512VL-NEXT: retl
	;			;
	; X32_AVX512VLDQ-LABEL: fabs_v2f64:			; X32_AVX512VLDQ-LABEL: fabs_v2f64:
	; X32_AVX512VLDQ: # BB#0:			; X32_AVX512VLDQ: # BB#0:
	; X32_AVX512VLDQ-NEXT: vandps {{\.LCPI.*}}, %xmm0, %xmm0			; X32_AVX512VLDQ-NEXT: vandps {{\.LCPI.*}}, %xmm0, %xmm0
	; X32_AVX512VLDQ-NEXT: retl			; X32_AVX512VLDQ-NEXT: retl
	;			;
	; X64_AVX-LABEL: fabs_v2f64:			; X64_AVX-LABEL: fabs_v2f64:
	; X64_AVX: # BB#0:			; X64_AVX: # BB#0:
	; X64_AVX-NEXT: vandps {{.*}}(%rip), %xmm0, %xmm0			; X64_AVX-NEXT: vandps {{.*}}(%rip), %xmm0, %xmm0
	; X64_AVX-NEXT: retq			; X64_AVX-NEXT: retq
	;			;
	; X64_AVX512VL-LABEL: fabs_v2f64:			; X64_AVX512VL-LABEL: fabs_v2f64:
	; X64_AVX512VL: # BB#0:			; X64_AVX512VL: # BB#0:
	; X64_AVX512VL-NEXT: vpandq {{.*}}(%rip), %xmm0, %xmm0			; X64_AVX512VL-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm0
	; X64_AVX512VL-NEXT: retq			; X64_AVX512VL-NEXT: retq
	;			;
	; X64_AVX512VLDQ-LABEL: fabs_v2f64:			; X64_AVX512VLDQ-LABEL: fabs_v2f64:
	; X64_AVX512VLDQ: # BB#0:			; X64_AVX512VLDQ: # BB#0:
	; X64_AVX512VLDQ-NEXT: vandps {{.*}}(%rip), %xmm0, %xmm0			; X64_AVX512VLDQ-NEXT: vandps {{.*}}(%rip), %xmm0, %xmm0
	; X64_AVX512VLDQ-NEXT: retq			; X64_AVX512VLDQ-NEXT: retq
	%t = call <2 x double> @llvm.fabs.v2f64(<2 x double> %p)			%t = call <2 x double> @llvm.fabs.v2f64(<2 x double> %p)
	ret <2 x double> %t			ret <2 x double> %t

llvm/trunk/test/CodeGen/X86/vec_fp_to_int.ll

	; AVX512VL-NEXT: pushq %rbx			; AVX512VL-NEXT: pushq %rbx
	; AVX512VL-NEXT: subq $24, %rsp			; AVX512VL-NEXT: subq $24, %rsp
	; AVX512VL-NEXT: movq %rsi, %r14			; AVX512VL-NEXT: movq %rsi, %r14
	; AVX512VL-NEXT: movq %rdi, %rbx			; AVX512VL-NEXT: movq %rdi, %rbx
	; AVX512VL-NEXT: movq %rdx, %rdi			; AVX512VL-NEXT: movq %rdx, %rdi
	; AVX512VL-NEXT: movq %rcx, %rsi			; AVX512VL-NEXT: movq %rcx, %rsi
	; AVX512VL-NEXT: callq __fixtfdi			; AVX512VL-NEXT: callq __fixtfdi
	; AVX512VL-NEXT: vmovq %rax, %xmm0			; AVX512VL-NEXT: vmovq %rax, %xmm0
	; AVX512VL-NEXT: vmovdqa64 %xmm0, (%rsp) # 16-byte Spill			; AVX512VL-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
	; AVX512VL-NEXT: movq %rbx, %rdi			; AVX512VL-NEXT: movq %rbx, %rdi
	; AVX512VL-NEXT: movq %r14, %rsi			; AVX512VL-NEXT: movq %r14, %rsi
	; AVX512VL-NEXT: callq __fixtfdi			; AVX512VL-NEXT: callq __fixtfdi
	; AVX512VL-NEXT: vmovq %rax, %xmm0			; AVX512VL-NEXT: vmovq %rax, %xmm0
	; AVX512VL-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload			; AVX512VL-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
	; AVX512VL-NEXT: # xmm0 = xmm0[0],mem[0]			; AVX512VL-NEXT: # xmm0 = xmm0[0],mem[0]
	; AVX512VL-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero			; AVX512VL-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
	; AVX512VL-NEXT: addq $24, %rsp			; AVX512VL-NEXT: addq $24, %rsp
	; AVX512VLDQ-NEXT: pushq %rbx			; AVX512VLDQ-NEXT: pushq %rbx
	; AVX512VLDQ-NEXT: subq $24, %rsp			; AVX512VLDQ-NEXT: subq $24, %rsp
	; AVX512VLDQ-NEXT: movq %rsi, %r14			; AVX512VLDQ-NEXT: movq %rsi, %r14
	; AVX512VLDQ-NEXT: movq %rdi, %rbx			; AVX512VLDQ-NEXT: movq %rdi, %rbx
	; AVX512VLDQ-NEXT: movq %rdx, %rdi			; AVX512VLDQ-NEXT: movq %rdx, %rdi
	; AVX512VLDQ-NEXT: movq %rcx, %rsi			; AVX512VLDQ-NEXT: movq %rcx, %rsi
	; AVX512VLDQ-NEXT: callq __fixtfdi			; AVX512VLDQ-NEXT: callq __fixtfdi
	; AVX512VLDQ-NEXT: vmovq %rax, %xmm0			; AVX512VLDQ-NEXT: vmovq %rax, %xmm0
	; AVX512VLDQ-NEXT: vmovdqa64 %xmm0, (%rsp) # 16-byte Spill			; AVX512VLDQ-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
	; AVX512VLDQ-NEXT: movq %rbx, %rdi			; AVX512VLDQ-NEXT: movq %rbx, %rdi
	; AVX512VLDQ-NEXT: movq %r14, %rsi			; AVX512VLDQ-NEXT: movq %r14, %rsi
	; AVX512VLDQ-NEXT: callq __fixtfdi			; AVX512VLDQ-NEXT: callq __fixtfdi
	; AVX512VLDQ-NEXT: vmovq %rax, %xmm0			; AVX512VLDQ-NEXT: vmovq %rax, %xmm0
	; AVX512VLDQ-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload			; AVX512VLDQ-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
	; AVX512VLDQ-NEXT: # xmm0 = xmm0[0],mem[0]			; AVX512VLDQ-NEXT: # xmm0 = xmm0[0],mem[0]
	; AVX512VLDQ-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero			; AVX512VLDQ-NEXT: vinsertps {{.*#+}} xmm0 = xmm0[0,2],zero,zero
	; AVX512VLDQ-NEXT: addq $24, %rsp			; AVX512VLDQ-NEXT: addq $24, %rsp
	; AVX512VLDQ-NEXT: popq %rbx			; AVX512VLDQ-NEXT: popq %rbx
	; AVX512VLDQ-NEXT: popq %r14			; AVX512VLDQ-NEXT: popq %r14
	; AVX512VLDQ-NEXT: retq			; AVX512VLDQ-NEXT: retq
	%cvt = fptosi <2 x fp128> %a to <2 x i32>			%cvt = fptosi <2 x fp128> %a to <2 x i32>
	%ext = shufflevector <2 x i32> %cvt, <2 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 3>			%ext = shufflevector <2 x i32> %cvt, <2 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
	ret <4 x i32> %ext			ret <4 x i32> %ext
	}			}

llvm/trunk/test/CodeGen/X86/vec_fpext.ll

	; X32-AVX-NEXT: vcvtps2pd (%ecx), %xmm0 # encoding: [0xc5,0xf8,0x5a,0x01]			; X32-AVX-NEXT: vcvtps2pd (%ecx), %xmm0 # encoding: [0xc5,0xf8,0x5a,0x01]
	; X32-AVX-NEXT: vmovups %xmm0, (%eax) # encoding: [0xc5,0xf8,0x11,0x00]			; X32-AVX-NEXT: vmovups %xmm0, (%eax) # encoding: [0xc5,0xf8,0x11,0x00]
	; X32-AVX-NEXT: retl # encoding: [0xc3]			; X32-AVX-NEXT: retl # encoding: [0xc3]
	;			;
	; X32-AVX512VL-LABEL: fpext_frommem:			; X32-AVX512VL-LABEL: fpext_frommem:
	; X32-AVX512VL: # BB#0: # %entry			; X32-AVX512VL: # BB#0: # %entry
	; X32-AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x08]			; X32-AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x08]
	; X32-AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %ecx # encoding: [0x8b,0x4c,0x24,0x04]			; X32-AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %ecx # encoding: [0x8b,0x4c,0x24,0x04]
	; X32-AVX512VL-NEXT: vcvtps2pd (%ecx), %xmm0 # encoding: [0x62,0xf1,0x7c,0x08,0x5a,0x01]			; X32-AVX512VL-NEXT: vcvtps2pd (%ecx), %xmm0 # EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5a,0x01]
	; X32-AVX512VL-NEXT: vmovups %xmm0, (%eax) # encoding: [0x62,0xf1,0x7c,0x08,0x11,0x00]			; X32-AVX512VL-NEXT: vmovups %xmm0, (%eax) # EVEX TO VEX Compression encoding: [0xc5,0xf8,0x11,0x00]
	; X32-AVX512VL-NEXT: retl # encoding: [0xc3]			; X32-AVX512VL-NEXT: retl # encoding: [0xc3]
	;			;
	; X64-SSE-LABEL: fpext_frommem:			; X64-SSE-LABEL: fpext_frommem:
	; X64-SSE: # BB#0: # %entry			; X64-SSE: # BB#0: # %entry
	; X64-SSE-NEXT: cvtps2pd (%rdi), %xmm0 # encoding: [0x0f,0x5a,0x07]			; X64-SSE-NEXT: cvtps2pd (%rdi), %xmm0 # encoding: [0x0f,0x5a,0x07]
	; X64-SSE-NEXT: movups %xmm0, (%rsi) # encoding: [0x0f,0x11,0x06]			; X64-SSE-NEXT: movups %xmm0, (%rsi) # encoding: [0x0f,0x11,0x06]
	; X64-SSE-NEXT: retq # encoding: [0xc3]			; X64-SSE-NEXT: retq # encoding: [0xc3]
	;			;
	; X64-AVX-LABEL: fpext_frommem:			; X64-AVX-LABEL: fpext_frommem:
	; X64-AVX: # BB#0: # %entry			; X64-AVX: # BB#0: # %entry
	; X64-AVX-NEXT: vcvtps2pd (%rdi), %xmm0 # encoding: [0xc5,0xf8,0x5a,0x07]			; X64-AVX-NEXT: vcvtps2pd (%rdi), %xmm0 # encoding: [0xc5,0xf8,0x5a,0x07]
	; X64-AVX-NEXT: vmovups %xmm0, (%rsi) # encoding: [0xc5,0xf8,0x11,0x06]			; X64-AVX-NEXT: vmovups %xmm0, (%rsi) # encoding: [0xc5,0xf8,0x11,0x06]
	; X64-AVX-NEXT: retq # encoding: [0xc3]			; X64-AVX-NEXT: retq # encoding: [0xc3]
	;			;
	; X64-AVX512VL-LABEL: fpext_frommem:			; X64-AVX512VL-LABEL: fpext_frommem:
	; X64-AVX512VL: # BB#0: # %entry			; X64-AVX512VL: # BB#0: # %entry
	; X64-AVX512VL-NEXT: vcvtps2pd (%rdi), %xmm0 # encoding: [0x62,0xf1,0x7c,0x08,0x5a,0x07]			; X64-AVX512VL-NEXT: vcvtps2pd (%rdi), %xmm0 # EVEX TO VEX Compression encoding: [0xc5,0xf8,0x5a,0x07]
	; X64-AVX512VL-NEXT: vmovups %xmm0, (%rsi) # encoding: [0x62,0xf1,0x7c,0x08,0x11,0x06]			; X64-AVX512VL-NEXT: vmovups %xmm0, (%rsi) # EVEX TO VEX Compression encoding: [0xc5,0xf8,0x11,0x06]
	; X64-AVX512VL-NEXT: retq # encoding: [0xc3]			; X64-AVX512VL-NEXT: retq # encoding: [0xc3]
	entry:			entry:
	%0 = load <2 x float>, <2 x float>* %in, align 8			%0 = load <2 x float>, <2 x float>* %in, align 8
	%1 = fpext <2 x float> %0 to <2 x double>			%1 = fpext <2 x float> %0 to <2 x double>
	store <2 x double> %1, <2 x double>* %out, align 1			store <2 x double> %1, <2 x double>* %out, align 1
	ret void			ret void
	}			}
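
In the fpext_frommem checks above, each compressed instruction drops from a 6-byte EVEX encoding (leading byte 0x62) to a 4-byte VEX encoding (leading byte 0xc5). Below is a minimal Python sketch, not part of the patch, that tallies this from the printed encoding strings. The helper names `parse_encoding` and `prefix_kind` are hypothetical. The classification looks only at the first byte, which is enough for the encodings llc prints here but is not a general x86 decoder.

```python
def parse_encoding(text: str) -> bytes:
    """Turn an llc-style '[0x62,0xf1,...]' encoding string into bytes."""
    inner = text.strip().strip("[]")
    return bytes(int(token, 16) for token in inner.split(","))

def prefix_kind(encoding: bytes) -> str:
    """Classify an encoding by its first byte (illustrative, not a decoder)."""
    first = encoding[0]
    if first == 0x62:
        return "EVEX"
    if first == 0xC5:
        return "VEX (2-byte)"
    if first == 0xC4:
        return "VEX (3-byte)"
    return "other"

# Encodings copied from the X32-AVX512VL vcvtps2pd check lines above.
evex = parse_encoding("[0x62,0xf1,0x7c,0x08,0x5a,0x01]")
vex = parse_encoding("[0xc5,0xf8,0x5a,0x01]")

saved = len(evex) - len(vex)
print(prefix_kind(evex), len(evex))
print(prefix_kind(vex), len(vex))
print("bytes saved:", saved)
```

For this pair the difference is 2 bytes, in line with the ~2 bytes per instruction quoted in the summary.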

	; X32-AVX-NEXT: vmovups %ymm0, (%eax) # encoding: [0xc5,0xfc,0x11,0x00]			; X32-AVX-NEXT: vmovups %ymm0, (%eax) # encoding: [0xc5,0xfc,0x11,0x00]
	; X32-AVX-NEXT: vzeroupper # encoding: [0xc5,0xf8,0x77]			; X32-AVX-NEXT: vzeroupper # encoding: [0xc5,0xf8,0x77]
	; X32-AVX-NEXT: retl # encoding: [0xc3]			; X32-AVX-NEXT: retl # encoding: [0xc3]
	;			;
	; X32-AVX512VL-LABEL: fpext_frommem4:			; X32-AVX512VL-LABEL: fpext_frommem4:
	; X32-AVX512VL: # BB#0: # %entry			; X32-AVX512VL: # BB#0: # %entry
	; X32-AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x08]			; X32-AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x08]
	; X32-AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %ecx # encoding: [0x8b,0x4c,0x24,0x04]			; X32-AVX512VL-NEXT: movl {{[0-9]+}}(%esp), %ecx # encoding: [0x8b,0x4c,0x24,0x04]
	; X32-AVX512VL-NEXT: vcvtps2pd (%ecx), %ymm0 # encoding: [0x62,0xf1,0x7c,0x28,0x5a,0x01]			; X32-AVX512VL-NEXT: vcvtps2pd (%ecx), %ymm0 # EVEX TO VEX Compression encoding: [0xc5,0xfc,0x5a,0x01]
	; X32-AVX512VL-NEXT: vmovups %ymm0, (%eax) # encoding: [0x62,0xf1,0x7c,0x28,0x11,0x00]			; X32-AVX512VL-NEXT: vmovups %ymm0, (%eax) # EVEX TO VEX Compression encoding: [0xc5,0xfc,0x11,0x00]
	; X32-AVX512VL-NEXT: retl # encoding: [0xc3]			; X32-AVX512VL-NEXT: retl # encoding: [0xc3]
	;			;
	; X64-SSE-LABEL: fpext_frommem4:			; X64-SSE-LABEL: fpext_frommem4:
	; X64-SSE: # BB#0: # %entry			; X64-SSE: # BB#0: # %entry
	; X64-SSE-NEXT: cvtps2pd (%rdi), %xmm0 # encoding: [0x0f,0x5a,0x07]			; X64-SSE-NEXT: cvtps2pd (%rdi), %xmm0 # encoding: [0x0f,0x5a,0x07]
	; X64-SSE-NEXT: cvtps2pd 8(%rdi), %xmm1 # encoding: [0x0f,0x5a,0x4f,0x08]			; X64-SSE-NEXT: cvtps2pd 8(%rdi), %xmm1 # encoding: [0x0f,0x5a,0x4f,0x08]
	; X64-SSE-NEXT: movups %xmm1, 16(%rsi) # encoding: [0x0f,0x11,0x4e,0x10]			; X64-SSE-NEXT: movups %xmm1, 16(%rsi) # encoding: [0x0f,0x11,0x4e,0x10]
	; X64-SSE-NEXT: movups %xmm0, (%rsi) # encoding: [0x0f,0x11,0x06]			; X64-SSE-NEXT: movups %xmm0, (%rsi) # encoding: [0x0f,0x11,0x06]
	; X64-SSE-NEXT: retq # encoding: [0xc3]			; X64-SSE-NEXT: retq # encoding: [0xc3]
	;			;
	; X64-AVX-LABEL: fpext_frommem4:			; X64-AVX-LABEL: fpext_frommem4:
	; X64-AVX: # BB#0: # %entry			; X64-AVX: # BB#0: # %entry
	; X64-AVX-NEXT: vcvtps2pd (%rdi), %ymm0 # encoding: [0xc5,0xfc,0x5a,0x07]			; X64-AVX-NEXT: vcvtps2pd (%rdi), %ymm0 # encoding: [0xc5,0xfc,0x5a,0x07]
	; X64-AVX-NEXT: vmovups %ymm0, (%rsi) # encoding: [0xc5,0xfc,0x11,0x06]			; X64-AVX-NEXT: vmovups %ymm0, (%rsi) # encoding: [0xc5,0xfc,0x11,0x06]
	; X64-AVX-NEXT: vzeroupper # encoding: [0xc5,0xf8,0x77]			; X64-AVX-NEXT: vzeroupper # encoding: [0xc5,0xf8,0x77]
	; X64-AVX-NEXT: retq # encoding: [0xc3]			; X64-AVX-NEXT: retq # encoding: [0xc3]
	;			;
	; X64-AVX512VL-LABEL: fpext_frommem4:			; X64-AVX512VL-LABEL: fpext_frommem4:
	; X64-AVX512VL: # BB#0: # %entry			; X64-AVX512VL: # BB#0: # %entry
	; X64-AVX512VL-NEXT: vcvtps2pd (%rdi), %ymm0 # encoding: [0x62,0xf1,0x7c,0x28,0x5a,0x07]			; X64-AVX512VL-NEXT: vcvtps2pd (%rdi), %ymm0 # EVEX TO VEX Compression encoding: [0xc5,0xfc,0x5a,0x07]
	; X64-AVX512VL-NEXT: vmovups %ymm0, (%rsi) # encoding: [0x62,0xf1,0x7c,0x28,0x11,0x06]			; X64-AVX512VL-NEXT: vmovups %ymm0, (%rsi) # EVEX TO VEX Compression encoding: [0xc5,0xfc,0x11,0x06]
	; X64-AVX512VL-NEXT: retq # encoding: [0xc3]			; X64-AVX512VL-NEXT: retq # encoding: [0xc3]
	entry:			entry:
	%0 = load <4 x float>, <4 x float>* %in			%0 = load <4 x float>, <4 x float>* %in
	%1 = fpext <4 x float> %0 to <4 x double>			%1 = fpext <4 x float> %0 to <4 x double>
	store <4 x double> %1, <4 x double>* %out, align 1			store <4 x double> %1, <4 x double>* %out, align 1
	ret void			ret void
	}			}

	[... 76 lines not shown ...]
	; X32-AVX: # BB#0: # %entry			; X32-AVX: # BB#0: # %entry
	; X32-AVX-NEXT: vmovaps {{.*#+}} xmm0 = [1.000000e+00,-2.000000e+00]			; X32-AVX-NEXT: vmovaps {{.*#+}} xmm0 = [1.000000e+00,-2.000000e+00]
	; X32-AVX-NEXT: # encoding: [0xc5,0xf8,0x28,0x05,A,A,A,A]			; X32-AVX-NEXT: # encoding: [0xc5,0xf8,0x28,0x05,A,A,A,A]
	; X32-AVX-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}, kind: FK_Data_4			; X32-AVX-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}, kind: FK_Data_4
	; X32-AVX-NEXT: retl # encoding: [0xc3]			; X32-AVX-NEXT: retl # encoding: [0xc3]
	;			;
	; X32-AVX512VL-LABEL: fpext_fromconst:			; X32-AVX512VL-LABEL: fpext_fromconst:
	; X32-AVX512VL: # BB#0: # %entry			; X32-AVX512VL: # BB#0: # %entry
	; X32-AVX512VL-NEXT: vmovaps {{.*#+}} xmm0 = [1.000000e+00,-2.000000e+00]			; X32-AVX512VL-NEXT: vmovaps {{\.LCPI.*}}, %xmm0 # EVEX TO VEX Compression xmm0 = [1.000000e+00,-2.000000e+00]
	; X32-AVX512VL-NEXT: # encoding: [0x62,0xf1,0x7c,0x08,0x28,0x05,A,A,A,A]			; X32-AVX512VL-NEXT: # encoding: [0xc5,0xf8,0x28,0x05,A,A,A,A]
	; X32-AVX512VL-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}, kind: FK_Data_4			; X32-AVX512VL-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}, kind: FK_Data_4
	; X32-AVX512VL-NEXT: retl # encoding: [0xc3]			; X32-AVX512VL-NEXT: retl # encoding: [0xc3]
	;			;
	; X64-SSE-LABEL: fpext_fromconst:			; X64-SSE-LABEL: fpext_fromconst:
	; X64-SSE: # BB#0: # %entry			; X64-SSE: # BB#0: # %entry
	; X64-SSE-NEXT: movaps {{.*#+}} xmm0 = [1.000000e+00,-2.000000e+00]			; X64-SSE-NEXT: movaps {{.*#+}} xmm0 = [1.000000e+00,-2.000000e+00]
	; X64-SSE-NEXT: # encoding: [0x0f,0x28,0x05,A,A,A,A]			; X64-SSE-NEXT: # encoding: [0x0f,0x28,0x05,A,A,A,A]
	; X64-SSE-NEXT: # fixup A - offset: 3, value: {{\.LCPI.*}}-4, kind: reloc_riprel_4byte			; X64-SSE-NEXT: # fixup A - offset: 3, value: {{\.LCPI.*}}-4, kind: reloc_riprel_4byte
	; X64-SSE-NEXT: retq # encoding: [0xc3]			; X64-SSE-NEXT: retq # encoding: [0xc3]
	;			;
	; X64-AVX-LABEL: fpext_fromconst:			; X64-AVX-LABEL: fpext_fromconst:
	; X64-AVX: # BB#0: # %entry			; X64-AVX: # BB#0: # %entry
	; X64-AVX-NEXT: vmovaps {{.*#+}} xmm0 = [1.000000e+00,-2.000000e+00]			; X64-AVX-NEXT: vmovaps {{.*#+}} xmm0 = [1.000000e+00,-2.000000e+00]
	; X64-AVX-NEXT: # encoding: [0xc5,0xf8,0x28,0x05,A,A,A,A]			; X64-AVX-NEXT: # encoding: [0xc5,0xf8,0x28,0x05,A,A,A,A]
	; X64-AVX-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}-4, kind: reloc_riprel_4byte			; X64-AVX-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}-4, kind: reloc_riprel_4byte
	; X64-AVX-NEXT: retq # encoding: [0xc3]			; X64-AVX-NEXT: retq # encoding: [0xc3]
	;			;
	; X64-AVX512VL-LABEL: fpext_fromconst:			; X64-AVX512VL-LABEL: fpext_fromconst:
	; X64-AVX512VL: # BB#0: # %entry			; X64-AVX512VL: # BB#0: # %entry
	; X64-AVX512VL-NEXT: vmovaps {{.*#+}} xmm0 = [1.000000e+00,-2.000000e+00]			; X64-AVX512VL-NEXT: vmovaps {{.*}}(%rip), %xmm0 # EVEX TO VEX Compression xmm0 = [1.000000e+00,-2.000000e+00]
	; X64-AVX512VL-NEXT: # encoding: [0x62,0xf1,0x7c,0x08,0x28,0x05,A,A,A,A]			; X64-AVX512VL-NEXT: # encoding: [0xc5,0xf8,0x28,0x05,A,A,A,A]
	; X64-AVX512VL-NEXT: # fixup A - offset: 6, value: {{\.LCPI.*}}-4, kind: reloc_riprel_4byte			; X64-AVX512VL-NEXT: # fixup A - offset: 4, value: {{\.LCPI.*}}-4, kind: reloc_riprel_4byte
	; X64-AVX512VL-NEXT: retq # encoding: [0xc3]			; X64-AVX512VL-NEXT: retq # encoding: [0xc3]
	entry:			entry:
	%0 = insertelement <2 x float> undef, float 1.0, i32 0			%0 = insertelement <2 x float> undef, float 1.0, i32 0
	%1 = insertelement <2 x float> %0, float -2.0, i32 1			%1 = insertelement <2 x float> %0, float -2.0, i32 1
	%2 = fpext <2 x float> %1 to <2 x double>			%2 = fpext <2 x float> %1 to <2 x double>
	ret <2 x double> %2			ret <2 x double> %2
	}			}

llvm/trunk/test/CodeGen/X86/vec_int_to_fp.ll

	[... 2,588 lines not shown ...]
	; AVX512F-NEXT: vcvtsi2sdq %rax, %xmm1, %xmm1			; AVX512F-NEXT: vcvtsi2sdq %rax, %xmm1, %xmm1
	; AVX512F-NEXT: vmovq %xmm0, %rax			; AVX512F-NEXT: vmovq %xmm0, %rax
	; AVX512F-NEXT: vcvtsi2sdq %rax, %xmm2, %xmm0			; AVX512F-NEXT: vcvtsi2sdq %rax, %xmm2, %xmm0
	; AVX512F-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]			; AVX512F-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; AVX512VL-LABEL: sitofp_load_2i64_to_2f64:			; AVX512VL-LABEL: sitofp_load_2i64_to_2f64:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa64 (%rdi), %xmm0			; AVX512VL-NEXT: vmovdqa (%rdi), %xmm0
	; AVX512VL-NEXT: vpextrq $1, %xmm0, %rax			; AVX512VL-NEXT: vpextrq $1, %xmm0, %rax
	; AVX512VL-NEXT: vcvtsi2sdq %rax, %xmm1, %xmm1			; AVX512VL-NEXT: vcvtsi2sdq %rax, %xmm1, %xmm1
	; AVX512VL-NEXT: vmovq %xmm0, %rax			; AVX512VL-NEXT: vmovq %xmm0, %rax
	; AVX512VL-NEXT: vcvtsi2sdq %rax, %xmm2, %xmm0			; AVX512VL-NEXT: vcvtsi2sdq %rax, %xmm2, %xmm0
	; AVX512VL-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]			; AVX512VL-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512DQ-LABEL: sitofp_load_2i64_to_2f64:			; AVX512DQ-LABEL: sitofp_load_2i64_to_2f64:
	[... 163 lines not shown ...]
	; AVX512F-NEXT: vmovq %xmm0, %rax			; AVX512F-NEXT: vmovq %xmm0, %rax
	; AVX512F-NEXT: vcvtsi2sdq %rax, %xmm3, %xmm0			; AVX512F-NEXT: vcvtsi2sdq %rax, %xmm3, %xmm0
	; AVX512F-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm2[0]			; AVX512F-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm2[0]
	; AVX512F-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0			; AVX512F-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; AVX512VL-LABEL: sitofp_load_4i64_to_4f64:			; AVX512VL-LABEL: sitofp_load_4i64_to_4f64:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa64 (%rdi), %ymm0			; AVX512VL-NEXT: vmovdqa (%rdi), %ymm0
	; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1			; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1
	; AVX512VL-NEXT: vpextrq $1, %xmm1, %rax			; AVX512VL-NEXT: vpextrq $1, %xmm1, %rax
	; AVX512VL-NEXT: vcvtsi2sdq %rax, %xmm2, %xmm2			; AVX512VL-NEXT: vcvtsi2sdq %rax, %xmm2, %xmm2
	; AVX512VL-NEXT: vmovq %xmm1, %rax			; AVX512VL-NEXT: vmovq %xmm1, %rax
	; AVX512VL-NEXT: vcvtsi2sdq %rax, %xmm3, %xmm1			; AVX512VL-NEXT: vcvtsi2sdq %rax, %xmm3, %xmm1
	; AVX512VL-NEXT: vunpcklpd {{.*#+}} xmm1 = xmm1[0],xmm2[0]			; AVX512VL-NEXT: vunpcklpd {{.*#+}} xmm1 = xmm1[0],xmm2[0]
	; AVX512VL-NEXT: vpextrq $1, %xmm0, %rax			; AVX512VL-NEXT: vpextrq $1, %xmm0, %rax
	; AVX512VL-NEXT: vcvtsi2sdq %rax, %xmm3, %xmm2			; AVX512VL-NEXT: vcvtsi2sdq %rax, %xmm3, %xmm2
	[... 122 lines not shown ...]
	; AVX512F-NEXT: vcvtusi2sdq %rax, %xmm1, %xmm1			; AVX512F-NEXT: vcvtusi2sdq %rax, %xmm1, %xmm1
	; AVX512F-NEXT: vmovq %xmm0, %rax			; AVX512F-NEXT: vmovq %xmm0, %rax
	; AVX512F-NEXT: vcvtusi2sdq %rax, %xmm2, %xmm0			; AVX512F-NEXT: vcvtusi2sdq %rax, %xmm2, %xmm0
	; AVX512F-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]			; AVX512F-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; AVX512VL-LABEL: uitofp_load_2i64_to_2f64:			; AVX512VL-LABEL: uitofp_load_2i64_to_2f64:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa64 (%rdi), %xmm0			; AVX512VL-NEXT: vmovdqa (%rdi), %xmm0
	; AVX512VL-NEXT: vpextrq $1, %xmm0, %rax			; AVX512VL-NEXT: vpextrq $1, %xmm0, %rax
	; AVX512VL-NEXT: vcvtusi2sdq %rax, %xmm1, %xmm1			; AVX512VL-NEXT: vcvtusi2sdq %rax, %xmm1, %xmm1
	; AVX512VL-NEXT: vmovq %xmm0, %rax			; AVX512VL-NEXT: vmovq %xmm0, %rax
	; AVX512VL-NEXT: vcvtusi2sdq %rax, %xmm2, %xmm0			; AVX512VL-NEXT: vcvtusi2sdq %rax, %xmm2, %xmm0
	; AVX512VL-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]			; AVX512VL-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm1[0]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512DQ-LABEL: uitofp_load_2i64_to_2f64:			; AVX512DQ-LABEL: uitofp_load_2i64_to_2f64:
	[... 91 lines not shown ...]
	; AVX512F-NEXT: vpmovzxwd {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero			; AVX512F-NEXT: vpmovzxwd {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
	; AVX512F-NEXT: vcvtdq2pd %xmm0, %xmm0			; AVX512F-NEXT: vcvtdq2pd %xmm0, %xmm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; AVX512VL-LABEL: uitofp_load_2i16_to_2f64:			; AVX512VL-LABEL: uitofp_load_2i16_to_2f64:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpmovzxwq {{.*#+}} xmm0 = mem[0],zero,zero,zero,mem[1],zero,zero,zero			; AVX512VL-NEXT: vpmovzxwq {{.*#+}} xmm0 = mem[0],zero,zero,zero,mem[1],zero,zero,zero
	; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]			; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3,4,5,6,7]			; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3,4,5,6,7]
	; AVX512VL-NEXT: vcvtdq2pd %xmm0, %xmm0			; AVX512VL-NEXT: vcvtdq2pd %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512DQ-LABEL: uitofp_load_2i16_to_2f64:			; AVX512DQ-LABEL: uitofp_load_2i16_to_2f64:
	; AVX512DQ: # BB#0:			; AVX512DQ: # BB#0:
	; AVX512DQ-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero			; AVX512DQ-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero
	; AVX512DQ-NEXT: vpmovzxwd {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero			; AVX512DQ-NEXT: vpmovzxwd {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
	; AVX512DQ-NEXT: vcvtdq2pd %xmm0, %xmm0			; AVX512DQ-NEXT: vcvtdq2pd %xmm0, %xmm0
	; AVX512DQ-NEXT: retq			; AVX512DQ-NEXT: retq
	;			;
	; AVX512VLDQ-LABEL: uitofp_load_2i16_to_2f64:			; AVX512VLDQ-LABEL: uitofp_load_2i16_to_2f64:
	; AVX512VLDQ: # BB#0:			; AVX512VLDQ: # BB#0:
	; AVX512VLDQ-NEXT: vpmovzxwq {{.*#+}} xmm0 = mem[0],zero,zero,zero,mem[1],zero,zero,zero			; AVX512VLDQ-NEXT: vpmovzxwq {{.*#+}} xmm0 = mem[0],zero,zero,zero,mem[1],zero,zero,zero
	; AVX512VLDQ-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]			; AVX512VLDQ-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
	; AVX512VLDQ-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VLDQ-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VLDQ-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3,4,5,6,7]			; AVX512VLDQ-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2],xmm1[3,4,5,6,7]
	; AVX512VLDQ-NEXT: vcvtdq2pd %xmm0, %xmm0			; AVX512VLDQ-NEXT: vcvtdq2pd %xmm0, %xmm0
	; AVX512VLDQ-NEXT: retq			; AVX512VLDQ-NEXT: retq
	%ld = load <2 x i16>, <2 x i16> *%a			%ld = load <2 x i16>, <2 x i16> *%a
	%cvt = uitofp <2 x i16> %ld to <2 x double>			%cvt = uitofp <2 x i16> %ld to <2 x double>
	ret <2 x double> %cvt			ret <2 x double> %cvt
	}			}

	[... 135 lines not shown ...]
	; AVX512F-NEXT: vmovq %xmm0, %rax			; AVX512F-NEXT: vmovq %xmm0, %rax
	; AVX512F-NEXT: vcvtusi2sdq %rax, %xmm3, %xmm0			; AVX512F-NEXT: vcvtusi2sdq %rax, %xmm3, %xmm0
	; AVX512F-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm2[0]			; AVX512F-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm0[0],xmm2[0]
	; AVX512F-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0			; AVX512F-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; AVX512VL-LABEL: uitofp_load_4i64_to_4f64:			; AVX512VL-LABEL: uitofp_load_4i64_to_4f64:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa64 (%rdi), %ymm0			; AVX512VL-NEXT: vmovdqa (%rdi), %ymm0
	; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1			; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1
	; AVX512VL-NEXT: vpextrq $1, %xmm1, %rax			; AVX512VL-NEXT: vpextrq $1, %xmm1, %rax
	; AVX512VL-NEXT: vcvtusi2sdq %rax, %xmm2, %xmm2			; AVX512VL-NEXT: vcvtusi2sdq %rax, %xmm2, %xmm2
	; AVX512VL-NEXT: vmovq %xmm1, %rax			; AVX512VL-NEXT: vmovq %xmm1, %rax
	; AVX512VL-NEXT: vcvtusi2sdq %rax, %xmm3, %xmm1			; AVX512VL-NEXT: vcvtusi2sdq %rax, %xmm3, %xmm1
	; AVX512VL-NEXT: vunpcklpd {{.*#+}} xmm1 = xmm1[0],xmm2[0]			; AVX512VL-NEXT: vunpcklpd {{.*#+}} xmm1 = xmm1[0],xmm2[0]
	; AVX512VL-NEXT: vpextrq $1, %xmm0, %rax			; AVX512VL-NEXT: vpextrq $1, %xmm0, %rax
	; AVX512VL-NEXT: vcvtusi2sdq %rax, %xmm3, %xmm2			; AVX512VL-NEXT: vcvtusi2sdq %rax, %xmm3, %xmm2
	[... 214 lines not shown ...]
	; AVX512F-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,1],xmm2[0],xmm1[3]			; AVX512F-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,1],xmm2[0],xmm1[3]
	; AVX512F-NEXT: vpextrq $1, %xmm0, %rax			; AVX512F-NEXT: vpextrq $1, %xmm0, %rax
	; AVX512F-NEXT: vcvtsi2ssq %rax, %xmm3, %xmm0			; AVX512F-NEXT: vcvtsi2ssq %rax, %xmm3, %xmm0
	; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm1[0,1,2],xmm0[0]			; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm1[0,1,2],xmm0[0]
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; AVX512VL-LABEL: sitofp_load_4i64_to_4f32:			; AVX512VL-LABEL: sitofp_load_4i64_to_4f32:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa64 (%rdi), %ymm0			; AVX512VL-NEXT: vmovdqa (%rdi), %ymm0
	; AVX512VL-NEXT: vpextrq $1, %xmm0, %rax			; AVX512VL-NEXT: vpextrq $1, %xmm0, %rax
	; AVX512VL-NEXT: vcvtsi2ssq %rax, %xmm1, %xmm1			; AVX512VL-NEXT: vcvtsi2ssq %rax, %xmm1, %xmm1
	; AVX512VL-NEXT: vmovq %xmm0, %rax			; AVX512VL-NEXT: vmovq %xmm0, %rax
	; AVX512VL-NEXT: vcvtsi2ssq %rax, %xmm2, %xmm2			; AVX512VL-NEXT: vcvtsi2ssq %rax, %xmm2, %xmm2
	; AVX512VL-NEXT: vinsertps {{.*#+}} xmm1 = xmm2[0],xmm1[0],xmm2[2,3]			; AVX512VL-NEXT: vinsertps {{.*#+}} xmm1 = xmm2[0],xmm1[0],xmm2[2,3]
	; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm0			; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm0
	; AVX512VL-NEXT: vmovq %xmm0, %rax			; AVX512VL-NEXT: vmovq %xmm0, %rax
	; AVX512VL-NEXT: vcvtsi2ssq %rax, %xmm3, %xmm2			; AVX512VL-NEXT: vcvtsi2ssq %rax, %xmm3, %xmm2
	[... 570 lines not shown ...]
	; AVX512F-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,1],xmm2[0],xmm1[3]			; AVX512F-NEXT: vinsertps {{.*#+}} xmm1 = xmm1[0,1],xmm2[0],xmm1[3]
	; AVX512F-NEXT: vpextrq $1, %xmm0, %rax			; AVX512F-NEXT: vpextrq $1, %xmm0, %rax
	; AVX512F-NEXT: vcvtusi2ssq %rax, %xmm3, %xmm0			; AVX512F-NEXT: vcvtusi2ssq %rax, %xmm3, %xmm0
	; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm1[0,1,2],xmm0[0]			; AVX512F-NEXT: vinsertps {{.*#+}} xmm0 = xmm1[0,1,2],xmm0[0]
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; AVX512VL-LABEL: uitofp_load_4i64_to_4f32:			; AVX512VL-LABEL: uitofp_load_4i64_to_4f32:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa64 (%rdi), %ymm0			; AVX512VL-NEXT: vmovdqa (%rdi), %ymm0
	; AVX512VL-NEXT: vpextrq $1, %xmm0, %rax			; AVX512VL-NEXT: vpextrq $1, %xmm0, %rax
	; AVX512VL-NEXT: vcvtusi2ssq %rax, %xmm1, %xmm1			; AVX512VL-NEXT: vcvtusi2ssq %rax, %xmm1, %xmm1
	; AVX512VL-NEXT: vmovq %xmm0, %rax			; AVX512VL-NEXT: vmovq %xmm0, %rax
	; AVX512VL-NEXT: vcvtusi2ssq %rax, %xmm2, %xmm2			; AVX512VL-NEXT: vcvtusi2ssq %rax, %xmm2, %xmm2
	; AVX512VL-NEXT: vinsertps {{.*#+}} xmm1 = xmm2[0],xmm1[0],xmm2[2,3]			; AVX512VL-NEXT: vinsertps {{.*#+}} xmm1 = xmm2[0],xmm1[0],xmm2[2,3]
	; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm0			; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm0
	; AVX512VL-NEXT: vmovq %xmm0, %rax			; AVX512VL-NEXT: vmovq %xmm0, %rax
	; AVX512VL-NEXT: vcvtusi2ssq %rax, %xmm3, %xmm2			; AVX512VL-NEXT: vcvtusi2ssq %rax, %xmm3, %xmm2
	[... 802 lines not shown ...]

llvm/trunk/test/CodeGen/X86/vector-half-conversions.ll

	[... 3,004 lines not shown ...]
	; AVX512VL-NEXT: movzwl %dx, %edx			; AVX512VL-NEXT: movzwl %dx, %edx
	; AVX512VL-NEXT: orl %eax, %edx			; AVX512VL-NEXT: orl %eax, %edx
	; AVX512VL-NEXT: shlq $32, %rdx			; AVX512VL-NEXT: shlq $32, %rdx
	; AVX512VL-NEXT: orq %rcx, %rdx			; AVX512VL-NEXT: orq %rcx, %rdx
	; AVX512VL-NEXT: vmovq %rdx, %xmm0			; AVX512VL-NEXT: vmovq %rdx, %xmm0
	; AVX512VL-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7]			; AVX512VL-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7]
	; AVX512VL-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7]			; AVX512VL-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7]
	; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,2]			; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,2]
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpunpckhqdq {{.*#+}} xmm0 = xmm0[1],xmm1[1]			; AVX512VL-NEXT: vpunpckhqdq {{.*#+}} xmm0 = xmm0[1],xmm1[1]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%1 = fptrunc <4 x float> %a0 to <4 x half>			%1 = fptrunc <4 x float> %a0 to <4 x half>
	%2 = bitcast <4 x half> %1 to <4 x i16>			%2 = bitcast <4 x half> %1 to <4 x i16>
	%3 = shufflevector <4 x i16> %2, <4 x i16> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>			%3 = shufflevector <4 x i16> %2, <4 x i16> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
	ret <8 x i16> %3			ret <8 x i16> %3
	}			}

	[... 686 lines not shown ...]
	; AVX512VL-NEXT: movzwl %dx, %edx			; AVX512VL-NEXT: movzwl %dx, %edx
	; AVX512VL-NEXT: orl %eax, %edx			; AVX512VL-NEXT: orl %eax, %edx
	; AVX512VL-NEXT: shlq $32, %rdx			; AVX512VL-NEXT: shlq $32, %rdx
	; AVX512VL-NEXT: orq %rcx, %rdx			; AVX512VL-NEXT: orq %rcx, %rdx
	; AVX512VL-NEXT: vmovq %rdx, %xmm0			; AVX512VL-NEXT: vmovq %rdx, %xmm0
	; AVX512VL-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7]			; AVX512VL-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7]
	; AVX512VL-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7]			; AVX512VL-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7]
	; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]			; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
	; AVX512VL-NEXT: vmovdqa32 %xmm0, (%rdi)			; AVX512VL-NEXT: vmovdqa %xmm0, (%rdi)
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%1 = fptrunc <4 x float> %a0 to <4 x half>			%1 = fptrunc <4 x float> %a0 to <4 x half>
	%2 = bitcast <4 x half> %1 to <4 x i16>			%2 = bitcast <4 x half> %1 to <4 x i16>
	%3 = shufflevector <4 x i16> %2, <4 x i16> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>			%3 = shufflevector <4 x i16> %2, <4 x i16> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
	store <8 x i16> %3, <8 x i16>* %a1			store <8 x i16> %3, <8 x i16>* %a1
	ret void			ret void
	}			}

	[... 97 lines not shown ...]
	; AVX512VL-NEXT: movzwl %dx, %edx			; AVX512VL-NEXT: movzwl %dx, %edx
	; AVX512VL-NEXT: orl %eax, %edx			; AVX512VL-NEXT: orl %eax, %edx
	; AVX512VL-NEXT: shlq $32, %rdx			; AVX512VL-NEXT: shlq $32, %rdx
	; AVX512VL-NEXT: orq %rcx, %rdx			; AVX512VL-NEXT: orq %rcx, %rdx
	; AVX512VL-NEXT: vmovq %rdx, %xmm0			; AVX512VL-NEXT: vmovq %rdx, %xmm0
	; AVX512VL-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7]			; AVX512VL-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7]
	; AVX512VL-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7]			; AVX512VL-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7]
	; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,2]			; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,2]
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpunpckhqdq {{.*#+}} xmm0 = xmm0[1],xmm1[1]			; AVX512VL-NEXT: vpunpckhqdq {{.*#+}} xmm0 = xmm0[1],xmm1[1]
	; AVX512VL-NEXT: vmovdqa32 %xmm0, (%rdi)			; AVX512VL-NEXT: vmovdqa %xmm0, (%rdi)
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%1 = fptrunc <4 x float> %a0 to <4 x half>			%1 = fptrunc <4 x float> %a0 to <4 x half>
	%2 = bitcast <4 x half> %1 to <4 x i16>			%2 = bitcast <4 x half> %1 to <4 x i16>
	%3 = shufflevector <4 x i16> %2, <4 x i16> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>			%3 = shufflevector <4 x i16> %2, <4 x i16> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
	store <8 x i16> %3, <8 x i16>* %a1			store <8 x i16> %3, <8 x i16>* %a1
	ret void			ret void
	}			}

	[... 896 lines not shown ...]
	; AVX512VL-NEXT: movzwl %ax, %eax			; AVX512VL-NEXT: movzwl %ax, %eax
	; AVX512VL-NEXT: orl %ebx, %eax			; AVX512VL-NEXT: orl %ebx, %eax
	; AVX512VL-NEXT: shlq $32, %rax			; AVX512VL-NEXT: shlq $32, %rax
	; AVX512VL-NEXT: orq %r14, %rax			; AVX512VL-NEXT: orq %r14, %rax
	; AVX512VL-NEXT: vmovq %rax, %xmm0			; AVX512VL-NEXT: vmovq %rax, %xmm0
	; AVX512VL-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7]			; AVX512VL-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7]
	; AVX512VL-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7]			; AVX512VL-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7]
	; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,2]			; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,2]
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpunpckhqdq {{.*#+}} xmm0 = xmm0[1],xmm1[1]			; AVX512VL-NEXT: vpunpckhqdq {{.*#+}} xmm0 = xmm0[1],xmm1[1]
	; AVX512VL-NEXT: addq $40, %rsp			; AVX512VL-NEXT: addq $40, %rsp
	; AVX512VL-NEXT: popq %rbx			; AVX512VL-NEXT: popq %rbx
	; AVX512VL-NEXT: popq %r14			; AVX512VL-NEXT: popq %r14
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%1 = fptrunc <4 x double> %a0 to <4 x half>			%1 = fptrunc <4 x double> %a0 to <4 x half>
	%2 = bitcast <4 x half> %1 to <4 x i16>			%2 = bitcast <4 x half> %1 to <4 x i16>
	%3 = shufflevector <4 x i16> %2, <4 x i16> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>			%3 = shufflevector <4 x i16> %2, <4 x i16> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
	[... 614 lines not shown ...]
	; AVX512VL-NEXT: movzwl %ax, %eax			; AVX512VL-NEXT: movzwl %ax, %eax
	; AVX512VL-NEXT: orl %ebp, %eax			; AVX512VL-NEXT: orl %ebp, %eax
	; AVX512VL-NEXT: shlq $32, %rax			; AVX512VL-NEXT: shlq $32, %rax
	; AVX512VL-NEXT: orq %rbx, %rax			; AVX512VL-NEXT: orq %rbx, %rax
	; AVX512VL-NEXT: vmovq %rax, %xmm0			; AVX512VL-NEXT: vmovq %rax, %xmm0
	; AVX512VL-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7]			; AVX512VL-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7]
	; AVX512VL-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7]			; AVX512VL-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7]
	; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]			; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
	; AVX512VL-NEXT: vmovdqa32 %xmm0, (%r14)			; AVX512VL-NEXT: vmovdqa %xmm0, (%r14)
	; AVX512VL-NEXT: addq $32, %rsp			; AVX512VL-NEXT: addq $32, %rsp
	; AVX512VL-NEXT: popq %rbx			; AVX512VL-NEXT: popq %rbx
	; AVX512VL-NEXT: popq %r14			; AVX512VL-NEXT: popq %r14
	; AVX512VL-NEXT: popq %rbp			; AVX512VL-NEXT: popq %rbp
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%1 = fptrunc <4 x double> %a0 to <4 x half>			%1 = fptrunc <4 x double> %a0 to <4 x half>
	%2 = bitcast <4 x half> %1 to <4 x i16>			%2 = bitcast <4 x half> %1 to <4 x i16>
	%3 = shufflevector <4 x i16> %2, <4 x i16> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>			%3 = shufflevector <4 x i16> %2, <4 x i16> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
	[... 154 lines not shown ...]
	; AVX512VL-NEXT: movzwl %ax, %eax			; AVX512VL-NEXT: movzwl %ax, %eax
	; AVX512VL-NEXT: orl %ebp, %eax			; AVX512VL-NEXT: orl %ebp, %eax
	; AVX512VL-NEXT: shlq $32, %rax			; AVX512VL-NEXT: shlq $32, %rax
	; AVX512VL-NEXT: orq %rbx, %rax			; AVX512VL-NEXT: orq %rbx, %rax
	; AVX512VL-NEXT: vmovq %rax, %xmm0			; AVX512VL-NEXT: vmovq %rax, %xmm0
	; AVX512VL-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7]			; AVX512VL-NEXT: vpshuflw {{.*#+}} xmm0 = xmm0[0,2,2,3,4,5,6,7]
	; AVX512VL-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7]			; AVX512VL-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,6,6,7]
	; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,2]			; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,2]
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpunpckhqdq {{.*#+}} xmm0 = xmm0[1],xmm1[1]			; AVX512VL-NEXT: vpunpckhqdq {{.*#+}} xmm0 = xmm0[1],xmm1[1]
	; AVX512VL-NEXT: vmovdqa32 %xmm0, (%r14)			; AVX512VL-NEXT: vmovdqa %xmm0, (%r14)
	; AVX512VL-NEXT: addq $32, %rsp			; AVX512VL-NEXT: addq $32, %rsp
	; AVX512VL-NEXT: popq %rbx			; AVX512VL-NEXT: popq %rbx
	; AVX512VL-NEXT: popq %r14			; AVX512VL-NEXT: popq %r14
	; AVX512VL-NEXT: popq %rbp			; AVX512VL-NEXT: popq %rbp
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%1 = fptrunc <4 x double> %a0 to <4 x half>			%1 = fptrunc <4 x double> %a0 to <4 x half>
	%2 = bitcast <4 x half> %1 to <4 x i16>			%2 = bitcast <4 x half> %1 to <4 x i16>
	%3 = shufflevector <4 x i16> %2, <4 x i16> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>			%3 = shufflevector <4 x i16> %2, <4 x i16> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
	[... 273 lines not shown ...]

llvm/trunk/test/CodeGen/X86/vector-lzcnt-256.ll

	[... 710 lines not shown ...]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VLCD-LABEL: testv32i8:			; AVX512VLCD-LABEL: testv32i8:
	; AVX512VLCD: ## BB#0:			; AVX512VLCD: ## BB#0:
	; AVX512VLCD-NEXT: vextracti32x4 $1, %ymm0, %xmm1			; AVX512VLCD-NEXT: vextracti32x4 $1, %ymm0, %xmm1
	; AVX512VLCD-NEXT: vpmovzxbd {{.*#+}} zmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero,xmm1[4],zero,zero,zero,xmm1[5],zero,zero,zero,xmm1[6],zero,zero,zero,xmm1[7],zero,zero,zero,xmm1[8],zero,zero,zero,xmm1[9],zero,zero,zero,xmm1[10],zero,zero,zero,xmm1[11],zero,zero,zero,xmm1[12],zero,zero,zero,xmm1[13],zero,zero,zero,xmm1[14],zero,zero,zero,xmm1[15],zero,zero,zero			; AVX512VLCD-NEXT: vpmovzxbd {{.*#+}} zmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero,xmm1[4],zero,zero,zero,xmm1[5],zero,zero,zero,xmm1[6],zero,zero,zero,xmm1[7],zero,zero,zero,xmm1[8],zero,zero,zero,xmm1[9],zero,zero,zero,xmm1[10],zero,zero,zero,xmm1[11],zero,zero,zero,xmm1[12],zero,zero,zero,xmm1[13],zero,zero,zero,xmm1[14],zero,zero,zero,xmm1[15],zero,zero,zero
	; AVX512VLCD-NEXT: vplzcntd %zmm1, %zmm1			; AVX512VLCD-NEXT: vplzcntd %zmm1, %zmm1
	; AVX512VLCD-NEXT: vpmovdb %zmm1, %xmm1			; AVX512VLCD-NEXT: vpmovdb %zmm1, %xmm1
	; AVX512VLCD-NEXT: vmovdqa64 {{.*#+}} xmm2 = [24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24]			; AVX512VLCD-NEXT: vmovdqa {{.*#+}} xmm2 = [24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24]
	; AVX512VLCD-NEXT: vpsubb %xmm2, %xmm1, %xmm1			; AVX512VLCD-NEXT: vpsubb %xmm2, %xmm1, %xmm1
	; AVX512VLCD-NEXT: vpmovzxbd {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero,xmm0[8],zero,zero,zero,xmm0[9],zero,zero,zero,xmm0[10],zero,zero,zero,xmm0[11],zero,zero,zero,xmm0[12],zero,zero,zero,xmm0[13],zero,zero,zero,xmm0[14],zero,zero,zero,xmm0[15],zero,zero,zero			; AVX512VLCD-NEXT: vpmovzxbd {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero,xmm0[8],zero,zero,zero,xmm0[9],zero,zero,zero,xmm0[10],zero,zero,zero,xmm0[11],zero,zero,zero,xmm0[12],zero,zero,zero,xmm0[13],zero,zero,zero,xmm0[14],zero,zero,zero,xmm0[15],zero,zero,zero
	; AVX512VLCD-NEXT: vplzcntd %zmm0, %zmm0			; AVX512VLCD-NEXT: vplzcntd %zmm0, %zmm0
	; AVX512VLCD-NEXT: vpmovdb %zmm0, %xmm0			; AVX512VLCD-NEXT: vpmovdb %zmm0, %xmm0
	; AVX512VLCD-NEXT: vpsubb %xmm2, %xmm0, %xmm0			; AVX512VLCD-NEXT: vpsubb %xmm2, %xmm0, %xmm0
	; AVX512VLCD-NEXT: vinserti32x4 $1, %xmm1, %ymm0, %ymm0			; AVX512VLCD-NEXT: vinserti32x4 $1, %xmm1, %ymm0, %ymm0
	; AVX512VLCD-NEXT: retq			; AVX512VLCD-NEXT: retq
	;			;
	[... 72 lines not shown ...]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VLCD-LABEL: testv32i8u:			; AVX512VLCD-LABEL: testv32i8u:
	; AVX512VLCD: ## BB#0:			; AVX512VLCD: ## BB#0:
	; AVX512VLCD-NEXT: vextracti32x4 $1, %ymm0, %xmm1			; AVX512VLCD-NEXT: vextracti32x4 $1, %ymm0, %xmm1
	; AVX512VLCD-NEXT: vpmovzxbd {{.*#+}} zmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero,xmm1[4],zero,zero,zero,xmm1[5],zero,zero,zero,xmm1[6],zero,zero,zero,xmm1[7],zero,zero,zero,xmm1[8],zero,zero,zero,xmm1[9],zero,zero,zero,xmm1[10],zero,zero,zero,xmm1[11],zero,zero,zero,xmm1[12],zero,zero,zero,xmm1[13],zero,zero,zero,xmm1[14],zero,zero,zero,xmm1[15],zero,zero,zero			; AVX512VLCD-NEXT: vpmovzxbd {{.*#+}} zmm1 = xmm1[0],zero,zero,zero,xmm1[1],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[3],zero,zero,zero,xmm1[4],zero,zero,zero,xmm1[5],zero,zero,zero,xmm1[6],zero,zero,zero,xmm1[7],zero,zero,zero,xmm1[8],zero,zero,zero,xmm1[9],zero,zero,zero,xmm1[10],zero,zero,zero,xmm1[11],zero,zero,zero,xmm1[12],zero,zero,zero,xmm1[13],zero,zero,zero,xmm1[14],zero,zero,zero,xmm1[15],zero,zero,zero
	; AVX512VLCD-NEXT: vplzcntd %zmm1, %zmm1			; AVX512VLCD-NEXT: vplzcntd %zmm1, %zmm1
	; AVX512VLCD-NEXT: vpmovdb %zmm1, %xmm1			; AVX512VLCD-NEXT: vpmovdb %zmm1, %xmm1
	; AVX512VLCD-NEXT: vmovdqa64 {{.*#+}} xmm2 = [24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24]			; AVX512VLCD-NEXT: vmovdqa {{.*#+}} xmm2 = [24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24]
	; AVX512VLCD-NEXT: vpsubb %xmm2, %xmm1, %xmm1			; AVX512VLCD-NEXT: vpsubb %xmm2, %xmm1, %xmm1
	; AVX512VLCD-NEXT: vpmovzxbd {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero,xmm0[8],zero,zero,zero,xmm0[9],zero,zero,zero,xmm0[10],zero,zero,zero,xmm0[11],zero,zero,zero,xmm0[12],zero,zero,zero,xmm0[13],zero,zero,zero,xmm0[14],zero,zero,zero,xmm0[15],zero,zero,zero			; AVX512VLCD-NEXT: vpmovzxbd {{.*#+}} zmm0 = xmm0[0],zero,zero,zero,xmm0[1],zero,zero,zero,xmm0[2],zero,zero,zero,xmm0[3],zero,zero,zero,xmm0[4],zero,zero,zero,xmm0[5],zero,zero,zero,xmm0[6],zero,zero,zero,xmm0[7],zero,zero,zero,xmm0[8],zero,zero,zero,xmm0[9],zero,zero,zero,xmm0[10],zero,zero,zero,xmm0[11],zero,zero,zero,xmm0[12],zero,zero,zero,xmm0[13],zero,zero,zero,xmm0[14],zero,zero,zero,xmm0[15],zero,zero,zero
	; AVX512VLCD-NEXT: vplzcntd %zmm0, %zmm0			; AVX512VLCD-NEXT: vplzcntd %zmm0, %zmm0
	; AVX512VLCD-NEXT: vpmovdb %zmm0, %xmm0			; AVX512VLCD-NEXT: vpmovdb %zmm0, %xmm0
	; AVX512VLCD-NEXT: vpsubb %xmm2, %xmm0, %xmm0			; AVX512VLCD-NEXT: vpsubb %xmm2, %xmm0, %xmm0
	; AVX512VLCD-NEXT: vinserti32x4 $1, %xmm1, %ymm0, %ymm0			; AVX512VLCD-NEXT: vinserti32x4 $1, %xmm1, %ymm0, %ymm0
	; AVX512VLCD-NEXT: retq			; AVX512VLCD-NEXT: retq
	;			;
	[237 lines not shown]

llvm/trunk/test/CodeGen/X86/vector-shuffle-128-v16.ll

	[417 lines not shown]
	; AVX1OR2-LABEL: shuffle_v16i8_00_17_02_19_04_21_06_23_08_25_10_27_12_29_14_31:			; AVX1OR2-LABEL: shuffle_v16i8_00_17_02_19_04_21_06_23_08_25_10_27_12_29_14_31:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vmovdqa {{.*#+}} xmm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX1OR2-NEXT: vmovdqa {{.*#+}} xmm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX1OR2-NEXT: vpblendvb %xmm2, %xmm0, %xmm1, %xmm0			; AVX1OR2-NEXT: vpblendvb %xmm2, %xmm0, %xmm1, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i8_00_17_02_19_04_21_06_23_08_25_10_27_12_29_14_31:			; AVX512VL-LABEL: shuffle_v16i8_00_17_02_19_04_21_06_23_08_25_10_27_12_29_14_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} xmm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} xmm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX512VL-NEXT: vpblendvb %xmm2, %xmm0, %xmm1, %xmm0			; AVX512VL-NEXT: vpblendvb %xmm2, %xmm0, %xmm1, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i8> %a, <16 x i8> %b, <16 x i32> <i32 0, i32 17, i32 2, i32 19, i32 4, i32 21, i32 6, i32 23, i32 8, i32 25, i32 10, i32 27, i32 12, i32 29, i32 14, i32 31>			%shuffle = shufflevector <16 x i8> %a, <16 x i8> %b, <16 x i32> <i32 0, i32 17, i32 2, i32 19, i32 4, i32 21, i32 6, i32 23, i32 8, i32 25, i32 10, i32 27, i32 12, i32 29, i32 14, i32 31>
	ret <16 x i8> %shuffle			ret <16 x i8> %shuffle
	}			}

	define <16 x i8> @shuffle_v16i8_00_01_02_19_04_05_06_23_08_09_10_27_12_13_14_31(<16 x i8> %a, <16 x i8> %b) {			define <16 x i8> @shuffle_v16i8_00_01_02_19_04_05_06_23_08_09_10_27_12_13_14_31(<16 x i8> %a, <16 x i8> %b) {
	; SSE2-LABEL: shuffle_v16i8_00_01_02_19_04_05_06_23_08_09_10_27_12_13_14_31:			; SSE2-LABEL: shuffle_v16i8_00_01_02_19_04_05_06_23_08_09_10_27_12_13_14_31:
	[22 lines not shown]
	; AVX1OR2-LABEL: shuffle_v16i8_00_01_02_19_04_05_06_23_08_09_10_27_12_13_14_31:			; AVX1OR2-LABEL: shuffle_v16i8_00_01_02_19_04_05_06_23_08_09_10_27_12_13_14_31:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vmovdqa {{.*#+}} xmm2 = [255,255,255,0,255,255,255,0,255,255,255,0,255,255,255,0]			; AVX1OR2-NEXT: vmovdqa {{.*#+}} xmm2 = [255,255,255,0,255,255,255,0,255,255,255,0,255,255,255,0]
	; AVX1OR2-NEXT: vpblendvb %xmm2, %xmm0, %xmm1, %xmm0			; AVX1OR2-NEXT: vpblendvb %xmm2, %xmm0, %xmm1, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i8_00_01_02_19_04_05_06_23_08_09_10_27_12_13_14_31:			; AVX512VL-LABEL: shuffle_v16i8_00_01_02_19_04_05_06_23_08_09_10_27_12_13_14_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} xmm2 = [255,255,255,0,255,255,255,0,255,255,255,0,255,255,255,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} xmm2 = [255,255,255,0,255,255,255,0,255,255,255,0,255,255,255,0]
	; AVX512VL-NEXT: vpblendvb %xmm2, %xmm0, %xmm1, %xmm0			; AVX512VL-NEXT: vpblendvb %xmm2, %xmm0, %xmm1, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i8> %a, <16 x i8> %b, <16 x i32> <i32 0, i32 1, i32 2, i32 19, i32 4, i32 5, i32 6, i32 23, i32 8, i32 9, i32 10, i32 27, i32 12, i32 13, i32 14, i32 31>			%shuffle = shufflevector <16 x i8> %a, <16 x i8> %b, <16 x i32> <i32 0, i32 1, i32 2, i32 19, i32 4, i32 5, i32 6, i32 23, i32 8, i32 9, i32 10, i32 27, i32 12, i32 13, i32 14, i32 31>
	ret <16 x i8> %shuffle			ret <16 x i8> %shuffle
	}			}

	define <16 x i8> @shuffle_v16i8_00_01_02_zz_04_05_06_zz_08_09_10_zz_12_13_14_zz(<16 x i8> %a) {			define <16 x i8> @shuffle_v16i8_00_01_02_zz_04_05_06_zz_08_09_10_zz_12_13_14_zz(<16 x i8> %a) {
	; SSE-LABEL: shuffle_v16i8_00_01_02_zz_04_05_06_zz_08_09_10_zz_12_13_14_zz:			; SSE-LABEL: shuffle_v16i8_00_01_02_zz_04_05_06_zz_08_09_10_zz_12_13_14_zz:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: andps {{.*}}(%rip), %xmm0			; SSE-NEXT: andps {{.*}}(%rip), %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX1OR2-LABEL: shuffle_v16i8_00_01_02_zz_04_05_06_zz_08_09_10_zz_12_13_14_zz:			; AVX1OR2-LABEL: shuffle_v16i8_00_01_02_zz_04_05_06_zz_08_09_10_zz_12_13_14_zz:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vandps {{.*}}(%rip), %xmm0, %xmm0			; AVX1OR2-NEXT: vandps {{.*}}(%rip), %xmm0, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i8_00_01_02_zz_04_05_06_zz_08_09_10_zz_12_13_14_zz:			; AVX512VL-LABEL: shuffle_v16i8_00_01_02_zz_04_05_06_zz_08_09_10_zz_12_13_14_zz:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpandq {{.*}}(%rip), %xmm0, %xmm0			; AVX512VL-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i8> %a, <16 x i8> zeroinitializer, <16 x i32> <i32 0, i32 1, i32 2, i32 19, i32 4, i32 5, i32 6, i32 23, i32 8, i32 9, i32 10, i32 27, i32 12, i32 13, i32 14, i32 31>			%shuffle = shufflevector <16 x i8> %a, <16 x i8> zeroinitializer, <16 x i32> <i32 0, i32 1, i32 2, i32 19, i32 4, i32 5, i32 6, i32 23, i32 8, i32 9, i32 10, i32 27, i32 12, i32 13, i32 14, i32 31>
	ret <16 x i8> %shuffle			ret <16 x i8> %shuffle
	}			}

	define <16 x i8> @shuffle_v16i8_00_01_02_03_20_05_06_23_08_09_10_11_28_13_14_31(<16 x i8> %a, <16 x i8> %b) {			define <16 x i8> @shuffle_v16i8_00_01_02_03_20_05_06_23_08_09_10_11_28_13_14_31(<16 x i8> %a, <16 x i8> %b) {
	; SSE2-LABEL: shuffle_v16i8_00_01_02_03_20_05_06_23_08_09_10_11_28_13_14_31:			; SSE2-LABEL: shuffle_v16i8_00_01_02_03_20_05_06_23_08_09_10_11_28_13_14_31:
	; SSE2: # BB#0:			; SSE2: # BB#0:
	[21 lines not shown]
	; AVX1OR2-LABEL: shuffle_v16i8_00_01_02_03_20_05_06_23_08_09_10_11_28_13_14_31:			; AVX1OR2-LABEL: shuffle_v16i8_00_01_02_03_20_05_06_23_08_09_10_11_28_13_14_31:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vmovdqa {{.*#+}} xmm2 = [255,255,255,255,0,255,255,0,255,255,255,255,0,255,255,0]			; AVX1OR2-NEXT: vmovdqa {{.*#+}} xmm2 = [255,255,255,255,0,255,255,0,255,255,255,255,0,255,255,0]
	; AVX1OR2-NEXT: vpblendvb %xmm2, %xmm0, %xmm1, %xmm0			; AVX1OR2-NEXT: vpblendvb %xmm2, %xmm0, %xmm1, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i8_00_01_02_03_20_05_06_23_08_09_10_11_28_13_14_31:			; AVX512VL-LABEL: shuffle_v16i8_00_01_02_03_20_05_06_23_08_09_10_11_28_13_14_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} xmm2 = [255,255,255,255,0,255,255,0,255,255,255,255,0,255,255,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} xmm2 = [255,255,255,255,0,255,255,0,255,255,255,255,0,255,255,0]
	; AVX512VL-NEXT: vpblendvb %xmm2, %xmm0, %xmm1, %xmm0			; AVX512VL-NEXT: vpblendvb %xmm2, %xmm0, %xmm1, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i8> %a, <16 x i8> %b, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 20, i32 5, i32 6, i32 23, i32 8, i32 9, i32 10, i32 11, i32 28, i32 13, i32 14, i32 31>			%shuffle = shufflevector <16 x i8> %a, <16 x i8> %b, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 20, i32 5, i32 6, i32 23, i32 8, i32 9, i32 10, i32 11, i32 28, i32 13, i32 14, i32 31>
	ret <16 x i8> %shuffle			ret <16 x i8> %shuffle
	}			}

	define <16 x i8> @shuffle_v16i8_16_17_18_19_04_05_06_07_24_25_10_11_28_13_30_15(<16 x i8> %a, <16 x i8> %b) {			define <16 x i8> @shuffle_v16i8_16_17_18_19_04_05_06_07_24_25_10_11_28_13_30_15(<16 x i8> %a, <16 x i8> %b) {
	; SSE2-LABEL: shuffle_v16i8_16_17_18_19_04_05_06_07_24_25_10_11_28_13_30_15:			; SSE2-LABEL: shuffle_v16i8_16_17_18_19_04_05_06_07_24_25_10_11_28_13_30_15:
	[23 lines not shown]
	; AVX1OR2-LABEL: shuffle_v16i8_16_17_18_19_04_05_06_07_24_25_10_11_28_13_30_15:			; AVX1OR2-LABEL: shuffle_v16i8_16_17_18_19_04_05_06_07_24_25_10_11_28_13_30_15:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vmovdqa {{.*#+}} xmm2 = [255,255,255,255,0,0,0,0,255,255,0,0,255,0,255,0]			; AVX1OR2-NEXT: vmovdqa {{.*#+}} xmm2 = [255,255,255,255,0,0,0,0,255,255,0,0,255,0,255,0]
	; AVX1OR2-NEXT: vpblendvb %xmm2, %xmm1, %xmm0, %xmm0			; AVX1OR2-NEXT: vpblendvb %xmm2, %xmm1, %xmm0, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i8_16_17_18_19_04_05_06_07_24_25_10_11_28_13_30_15:			; AVX512VL-LABEL: shuffle_v16i8_16_17_18_19_04_05_06_07_24_25_10_11_28_13_30_15:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} xmm2 = [255,255,255,255,0,0,0,0,255,255,0,0,255,0,255,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} xmm2 = [255,255,255,255,0,0,0,0,255,255,0,0,255,0,255,0]
	; AVX512VL-NEXT: vpblendvb %xmm2, %xmm1, %xmm0, %xmm0			; AVX512VL-NEXT: vpblendvb %xmm2, %xmm1, %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i8> %a, <16 x i8> %b, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 4, i32 5, i32 6, i32 7, i32 24, i32 25, i32 10, i32 11, i32 28, i32 13, i32 30, i32 15>			%shuffle = shufflevector <16 x i8> %a, <16 x i8> %b, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 4, i32 5, i32 6, i32 7, i32 24, i32 25, i32 10, i32 11, i32 28, i32 13, i32 30, i32 15>
	ret <16 x i8> %shuffle			ret <16 x i8> %shuffle
	}			}

	define <16 x i8> @trunc_v4i32_shuffle(<16 x i8> %a) {			define <16 x i8> @trunc_v4i32_shuffle(<16 x i8> %a) {
	; SSE2-LABEL: trunc_v4i32_shuffle:			; SSE2-LABEL: trunc_v4i32_shuffle:
	[133 lines not shown]
	; AVX1OR2-LABEL: shuffle_v16i8_zz_zz_zz_zz_zz_16_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz:			; AVX1OR2-LABEL: shuffle_v16i8_zz_zz_zz_zz_zz_16_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0			; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX1OR2-NEXT: vpinsrb $5, %edi, %xmm0, %xmm0			; AVX1OR2-NEXT: vpinsrb $5, %edi, %xmm0, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i8_zz_zz_zz_zz_zz_16_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz:			; AVX512VL-LABEL: shuffle_v16i8_zz_zz_zz_zz_zz_16_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm0, %xmm0, %xmm0			; AVX512VL-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX512VL-NEXT: vpinsrb $5, %edi, %xmm0, %xmm0			; AVX512VL-NEXT: vpinsrb $5, %edi, %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%a = insertelement <16 x i8> undef, i8 %i, i32 0			%a = insertelement <16 x i8> undef, i8 %i, i32 0
	%shuffle = shufflevector <16 x i8> zeroinitializer, <16 x i8> %a, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 16, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <16 x i8> zeroinitializer, <16 x i8> %a, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 16, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <16 x i8> %shuffle			ret <16 x i8> %shuffle
	}			}

	define <16 x i8> @shuffle_v16i8_zz_uu_uu_zz_uu_uu_zz_zz_zz_zz_zz_zz_zz_zz_zz_16(i8 %i) {			define <16 x i8> @shuffle_v16i8_zz_uu_uu_zz_uu_uu_zz_zz_zz_zz_zz_zz_zz_zz_zz_16(i8 %i) {
	[20 lines not shown]
	; AVX1OR2-LABEL: shuffle_v16i8_zz_uu_uu_zz_uu_uu_zz_zz_zz_zz_zz_zz_zz_zz_zz_16:			; AVX1OR2-LABEL: shuffle_v16i8_zz_uu_uu_zz_uu_uu_zz_zz_zz_zz_zz_zz_zz_zz_zz_16:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0			; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX1OR2-NEXT: vpinsrb $15, %edi, %xmm0, %xmm0			; AVX1OR2-NEXT: vpinsrb $15, %edi, %xmm0, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i8_zz_uu_uu_zz_uu_uu_zz_zz_zz_zz_zz_zz_zz_zz_zz_16:			; AVX512VL-LABEL: shuffle_v16i8_zz_uu_uu_zz_uu_uu_zz_zz_zz_zz_zz_zz_zz_zz_zz_16:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm0, %xmm0, %xmm0			; AVX512VL-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX512VL-NEXT: vpinsrb $15, %edi, %xmm0, %xmm0			; AVX512VL-NEXT: vpinsrb $15, %edi, %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%a = insertelement <16 x i8> undef, i8 %i, i32 0			%a = insertelement <16 x i8> undef, i8 %i, i32 0
	%shuffle = shufflevector <16 x i8> zeroinitializer, <16 x i8> %a, <16 x i32> <i32 0, i32 undef, i32 undef, i32 3, i32 undef, i32 undef, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 16>			%shuffle = shufflevector <16 x i8> zeroinitializer, <16 x i8> %a, <16 x i32> <i32 0, i32 undef, i32 undef, i32 3, i32 undef, i32 undef, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 16>
	ret <16 x i8> %shuffle			ret <16 x i8> %shuffle
	}			}

	define <16 x i8> @shuffle_v16i8_zz_zz_19_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz(i8 %i) {			define <16 x i8> @shuffle_v16i8_zz_zz_19_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz(i8 %i) {
	[20 lines not shown]
	; AVX1OR2-LABEL: shuffle_v16i8_zz_zz_19_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz:			; AVX1OR2-LABEL: shuffle_v16i8_zz_zz_19_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0			; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX1OR2-NEXT: vpinsrb $2, %edi, %xmm0, %xmm0			; AVX1OR2-NEXT: vpinsrb $2, %edi, %xmm0, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i8_zz_zz_19_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz:			; AVX512VL-LABEL: shuffle_v16i8_zz_zz_19_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm0, %xmm0, %xmm0			; AVX512VL-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX512VL-NEXT: vpinsrb $2, %edi, %xmm0, %xmm0			; AVX512VL-NEXT: vpinsrb $2, %edi, %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%a = insertelement <16 x i8> undef, i8 %i, i32 3			%a = insertelement <16 x i8> undef, i8 %i, i32 3
	%shuffle = shufflevector <16 x i8> zeroinitializer, <16 x i8> %a, <16 x i32> <i32 0, i32 1, i32 19, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>			%shuffle = shufflevector <16 x i8> zeroinitializer, <16 x i8> %a, <16 x i32> <i32 0, i32 1, i32 19, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
	ret <16 x i8> %shuffle			ret <16 x i8> %shuffle
	}			}

	define <16 x i8> @shuffle_v16i8_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_16_uu_18_uu(<16 x i8> %a) {			define <16 x i8> @shuffle_v16i8_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_zz_16_uu_18_uu(<16 x i8> %a) {
	[432 lines not shown]
	; AVX1OR2-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[u,10,2,7],zero,xmm0[14,7,2],zero,xmm0[3,1,14],zero,xmm0[9,11,0]			; AVX1OR2-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[u,10,2,7],zero,xmm0[14,7,2],zero,xmm0[3,1,14],zero,xmm0[9,11,0]
	; AVX1OR2-NEXT: vpor %xmm1, %xmm0, %xmm0			; AVX1OR2-NEXT: vpor %xmm1, %xmm0, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i8_uu_10_02_07_22_14_07_02_18_03_01_14_18_09_11_00:			; AVX512VL-LABEL: shuffle_v16i8_uu_10_02_07_22_14_07_02_18_03_01_14_18_09_11_00:
	; AVX512VL: # BB#0: # %entry			; AVX512VL: # BB#0: # %entry
	; AVX512VL-NEXT: vpshufb {{.*#+}} xmm1 = xmm1[u],zero,zero,zero,xmm1[6],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[2],zero,zero,zero			; AVX512VL-NEXT: vpshufb {{.*#+}} xmm1 = xmm1[u],zero,zero,zero,xmm1[6],zero,zero,zero,xmm1[2],zero,zero,zero,xmm1[2],zero,zero,zero
	; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[u,10,2,7],zero,xmm0[14,7,2],zero,xmm0[3,1,14],zero,xmm0[9,11,0]			; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[u,10,2,7],zero,xmm0[14,7,2],zero,xmm0[3,1,14],zero,xmm0[9,11,0]
	; AVX512VL-NEXT: vporq %xmm1, %xmm0, %xmm0			; AVX512VL-NEXT: vpor %xmm1, %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	entry:			entry:
	%shuffle = shufflevector <16 x i8> %a, <16 x i8> %b, <16 x i32> <i32 undef, i32 10, i32 2, i32 7, i32 22, i32 14, i32 7, i32 2, i32 18, i32 3, i32 1, i32 14, i32 18, i32 9, i32 11, i32 0>			%shuffle = shufflevector <16 x i8> %a, <16 x i8> %b, <16 x i32> <i32 undef, i32 10, i32 2, i32 7, i32 22, i32 14, i32 7, i32 2, i32 18, i32 3, i32 1, i32 14, i32 18, i32 9, i32 11, i32 0>

	ret <16 x i8> %shuffle			ret <16 x i8> %shuffle
	}			}

	define <16 x i8> @stress_test2(<16 x i8> %s.0.0, <16 x i8> %s.0.1, <16 x i8> %s.0.2) {			define <16 x i8> @stress_test2(<16 x i8> %s.0.0, <16 x i8> %s.0.1, <16 x i8> %s.0.2) {
	[20 lines not shown]
	; AVX1OR2: # BB#0: # %entry			; AVX1OR2: # BB#0: # %entry
	; AVX1OR2-NEXT: vxorps %xmm0, %xmm0, %xmm0			; AVX1OR2-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; AVX1OR2-NEXT: vmovaps %xmm0, (%rdi)			; AVX1OR2-NEXT: vmovaps %xmm0, (%rdi)
	; AVX1OR2-NEXT: vmovaps %xmm0, (%rsi)			; AVX1OR2-NEXT: vmovaps %xmm0, (%rsi)
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: constant_gets_selected:			; AVX512VL-LABEL: constant_gets_selected:
	; AVX512VL: # BB#0: # %entry			; AVX512VL: # BB#0: # %entry
	; AVX512VL-NEXT: vpxord %xmm0, %xmm0, %xmm0			; AVX512VL-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX512VL-NEXT: vmovdqa32 %xmm0, (%rdi)			; AVX512VL-NEXT: vmovdqa %xmm0, (%rdi)
	; AVX512VL-NEXT: vmovdqa32 %xmm0, (%rsi)			; AVX512VL-NEXT: vmovdqa %xmm0, (%rsi)
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	entry:			entry:
	%weird_zero = bitcast <4 x i32> zeroinitializer to <16 x i8>			%weird_zero = bitcast <4 x i32> zeroinitializer to <16 x i8>
	%shuffle.i = shufflevector <16 x i8> <i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 0, i8 0, i8 0, i8 0>, <16 x i8> %weird_zero, <16 x i32> <i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27>			%shuffle.i = shufflevector <16 x i8> <i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 0, i8 0, i8 0, i8 0>, <16 x i8> %weird_zero, <16 x i32> <i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27>
	%weirder_zero = bitcast <16 x i8> %shuffle.i to <4 x i32>			%weirder_zero = bitcast <16 x i8> %shuffle.i to <4 x i32>
	store <4 x i32> %weirder_zero, <4 x i32>* %ptr1, align 16			store <4 x i32> %weirder_zero, <4 x i32>* %ptr1, align 16
	store <4 x i32> zeroinitializer, <4 x i32>* %ptr2, align 16			store <4 x i32> zeroinitializer, <4 x i32>* %ptr2, align 16
	ret void			ret void
	[131 lines not shown]
	; AVX1OR2-NEXT: vmovdqa {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>			; AVX1OR2-NEXT: vmovdqa {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>
	; AVX1OR2-NEXT: vpshufb %xmm2, %xmm1, %xmm1			; AVX1OR2-NEXT: vpshufb %xmm2, %xmm1, %xmm1
	; AVX1OR2-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX1OR2-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX1OR2-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]			; AVX1OR2-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: PR12412:			; AVX512VL-LABEL: PR12412:
	; AVX512VL: # BB#0: # %entry			; AVX512VL: # BB#0: # %entry
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>			; AVX512VL-NEXT: vmovdqu {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm1			; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm1
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX512VL-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]			; AVX512VL-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	entry:			entry:
	%0 = shufflevector <16 x i8> %inval1, <16 x i8> %inval2, <16 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14, i32 16, i32 18, i32 20, i32 22, i32 24, i32 26, i32 28, i32 30>			%0 = shufflevector <16 x i8> %inval1, <16 x i8> %inval2, <16 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14, i32 16, i32 18, i32 20, i32 22, i32 24, i32 26, i32 28, i32 30>
	ret <16 x i8> %0			ret <16 x i8> %0
	}			}
	[344 lines not shown]
	; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0			; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX1OR2-NEXT: vpinsrb $0, (%rdi), %xmm0, %xmm0			; AVX1OR2-NEXT: vpinsrb $0, (%rdi), %xmm0, %xmm0
	; AVX1OR2-NEXT: vpinsrb $1, (%rsi), %xmm0, %xmm0			; AVX1OR2-NEXT: vpinsrb $1, (%rsi), %xmm0, %xmm0
	; AVX1OR2-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[1,1,1,1,1,1,1],zero,xmm0[1,1,1,1,1,0,0,0]			; AVX1OR2-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[1,1,1,1,1,1,1],zero,xmm0[1,1,1,1,1,0,0,0]
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: PR31364:			; AVX512VL-LABEL: PR31364:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm0, %xmm0, %xmm0			; AVX512VL-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX512VL-NEXT: vpinsrb $0, (%rdi), %xmm0, %xmm0			; AVX512VL-NEXT: vpinsrb $0, (%rdi), %xmm0, %xmm0
	; AVX512VL-NEXT: vpinsrb $1, (%rsi), %xmm0, %xmm0			; AVX512VL-NEXT: vpinsrb $1, (%rsi), %xmm0, %xmm0
	; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[1,1,1,1,1,1,1],zero,xmm0[1,1,1,1,1,0,0,0]			; AVX512VL-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[1,1,1,1,1,1,1],zero,xmm0[1,1,1,1,1,0,0,0]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%v0 = load i8, i8* %a, align 1			%v0 = load i8, i8* %a, align 1
	%vecins = insertelement <16 x i8> <i8 undef, i8 undef, i8 undef, i8 0, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef>, i8 %v0, i32 0			%vecins = insertelement <16 x i8> <i8 undef, i8 undef, i8 undef, i8 0, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef>, i8 %v0, i32 0
	%v1 = load i8, i8* %b, align 1			%v1 = load i8, i8* %b, align 1
	%vecins2 = insertelement <16 x i8> %vecins, i8 %v1, i32 1			%vecins2 = insertelement <16 x i8> %vecins, i8 %v1, i32 1
	[72 lines not shown]
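
The changed AVX512VL CHECK lines in this file follow one pattern: an EVEX-only mnemonic is replaced by its VEX counterpart when the instruction touches only xmm/ymm registers 0-15 and uses no mask register. The sketch below illustrates that rule on assembly text. It is not the pass from this patch, which works on machine opcodes through a generated table; the names `EVEX_TO_VEX`, `can_use_vex` and `compress` are hypothetical, and the table holds only the mnemonic pairs visible in this diff.

```python
import re

# EVEX -> VEX mnemonic pairs copied from the changed CHECK lines in this diff.
# Illustrative only: the real pass maps opcodes, not mnemonic strings.
EVEX_TO_VEX = {
    "vmovdqa32": "vmovdqa",
    "vmovdqa64": "vmovdqa",
    "vmovdqu8": "vmovdqu",
    "vpandq": "vpand",
    "vporq": "vpor",
    "vpxord": "vpxor",
}

# Matches AT&T-syntax vector and mask registers such as %xmm3, %zmm0, %k1.
_REG = re.compile(r"%([xyz]mm|k)(\d+)")


def can_use_vex(operands: str) -> bool:
    """True if no operand needs EVEX: no zmm, no k mask, register index <= 15."""
    for kind, index in _REG.findall(operands):
        if kind in ("zmm", "k") or int(index) > 15:
            return False
    return True


def compress(line: str) -> str:
    """Return the instruction with its VEX mnemonic when the rule allows it."""
    mnemonic, _, operands = line.strip().partition(" ")
    vex = EVEX_TO_VEX.get(mnemonic)
    if vex is None or not can_use_vex(operands):
        return line.strip()  # leave the EVEX form untouched
    return f"{vex} {operands}".strip()


if __name__ == "__main__":
    for asm in (
        "vpxord %xmm0, %xmm0, %xmm0",
        "vmovdqa32 %xmm0, (%rdi)",
        "vpxord %xmm16, %xmm0, %xmm0",
        "vplzcntd %zmm0, %zmm0",
    ):
        print(f"{asm:32} -> {compress(asm)}")
```

Instructions such as `vplzcntd` and `vpmovdb`, which operate on zmm registers in the tests above, stay in EVEX form under this rule, which matches the unchanged CHECK lines.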

llvm/trunk/test/CodeGen/X86/vector-shuffle-128-v2.ll

	[776 lines not shown]
	; AVX2-LABEL: shuffle_v2i64_z1:			; AVX2-LABEL: shuffle_v2i64_z1:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm1[0,1],xmm0[2,3]			; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm1[0,1],xmm0[2,3]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v2i64_z1:			; AVX512VL-LABEL: shuffle_v2i64_z1:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm1[0,1],xmm0[2,3]			; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm1[0,1],xmm0[2,3]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <2 x i64> %a, <2 x i64> zeroinitializer, <2 x i32> <i32 2, i32 1>			%shuffle = shufflevector <2 x i64> %a, <2 x i64> zeroinitializer, <2 x i32> <i32 2, i32 1>
	ret <2 x i64> %shuffle			ret <2 x i64> %shuffle
	}			}

	define <2 x double> @shuffle_v2f64_0z(<2 x double> %a) {			define <2 x double> @shuffle_v2f64_0z(<2 x double> %a) {
	; SSE-LABEL: shuffle_v2f64_0z:			; SSE-LABEL: shuffle_v2f64_0z:
	[25 lines not shown]
	; AVX2-LABEL: shuffle_v2f64_1z:			; AVX2-LABEL: shuffle_v2f64_1z:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vxorpd %xmm1, %xmm1, %xmm1			; AVX2-NEXT: vxorpd %xmm1, %xmm1, %xmm1
	; AVX2-NEXT: vunpckhpd {{.*#+}} xmm0 = xmm0[1],xmm1[1]			; AVX2-NEXT: vunpckhpd {{.*#+}} xmm0 = xmm0[1],xmm1[1]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v2f64_1z:			; AVX512VL-LABEL: shuffle_v2f64_1z:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vunpckhpd {{.*#+}} xmm0 = xmm0[1],xmm1[1]			; AVX512VL-NEXT: vunpckhpd {{.*#+}} xmm0 = xmm0[1],xmm1[1]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <2 x double> %a, <2 x double> zeroinitializer, <2 x i32> <i32 1, i32 3>			%shuffle = shufflevector <2 x double> %a, <2 x double> zeroinitializer, <2 x i32> <i32 1, i32 3>
	ret <2 x double> %shuffle			ret <2 x double> %shuffle
	}			}

	define <2 x double> @shuffle_v2f64_z0(<2 x double> %a) {			define <2 x double> @shuffle_v2f64_z0(<2 x double> %a) {
	; SSE-LABEL: shuffle_v2f64_z0:			; SSE-LABEL: shuffle_v2f64_z0:
	[12 lines not shown]
	; AVX2-LABEL: shuffle_v2f64_z0:			; AVX2-LABEL: shuffle_v2f64_z0:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vxorpd %xmm1, %xmm1, %xmm1			; AVX2-NEXT: vxorpd %xmm1, %xmm1, %xmm1
	; AVX2-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm1[0],xmm0[0]			; AVX2-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm1[0],xmm0[0]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v2f64_z0:			; AVX512VL-LABEL: shuffle_v2f64_z0:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm1[0],xmm0[0]			; AVX512VL-NEXT: vunpcklpd {{.*#+}} xmm0 = xmm1[0],xmm0[0]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <2 x double> %a, <2 x double> zeroinitializer, <2 x i32> <i32 2, i32 0>			%shuffle = shufflevector <2 x double> %a, <2 x double> zeroinitializer, <2 x i32> <i32 2, i32 0>
	ret <2 x double> %shuffle			ret <2 x double> %shuffle
	}			}

	define <2 x double> @shuffle_v2f64_z1(<2 x double> %a) {			define <2 x double> @shuffle_v2f64_z1(<2 x double> %a) {
	; SSE2-LABEL: shuffle_v2f64_z1:			; SSE2-LABEL: shuffle_v2f64_z1:
	[29 lines not shown]
	; AVX2-LABEL: shuffle_v2f64_z1:			; AVX2-LABEL: shuffle_v2f64_z1:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vxorpd %xmm1, %xmm1, %xmm1			; AVX2-NEXT: vxorpd %xmm1, %xmm1, %xmm1
	; AVX2-NEXT: vblendpd {{.*#+}} xmm0 = xmm1[0],xmm0[1]			; AVX2-NEXT: vblendpd {{.*#+}} xmm0 = xmm1[0],xmm0[1]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v2f64_z1:			; AVX512VL-LABEL: shuffle_v2f64_z1:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vblendpd {{.*#+}} xmm0 = xmm1[0],xmm0[1]			; AVX512VL-NEXT: vblendpd {{.*#+}} xmm0 = xmm1[0],xmm0[1]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <2 x double> %a, <2 x double> zeroinitializer, <2 x i32> <i32 2, i32 1>			%shuffle = shufflevector <2 x double> %a, <2 x double> zeroinitializer, <2 x i32> <i32 2, i32 1>
	ret <2 x double> %shuffle			ret <2 x double> %shuffle
	}			}

	define <2 x double> @shuffle_v2f64_bitcast_1z(<2 x double> %a) {			define <2 x double> @shuffle_v2f64_bitcast_1z(<2 x double> %a) {
	; SSE-LABEL: shuffle_v2f64_bitcast_1z:			; SSE-LABEL: shuffle_v2f64_bitcast_1z:
	[11 lines not shown]
	; AVX2-LABEL: shuffle_v2f64_bitcast_1z:			; AVX2-LABEL: shuffle_v2f64_bitcast_1z:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vxorpd %xmm1, %xmm1, %xmm1			; AVX2-NEXT: vxorpd %xmm1, %xmm1, %xmm1
	; AVX2-NEXT: vshufpd {{.*#+}} xmm0 = xmm0[1],xmm1[0]			; AVX2-NEXT: vshufpd {{.*#+}} xmm0 = xmm0[1],xmm1[0]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v2f64_bitcast_1z:			; AVX512VL-LABEL: shuffle_v2f64_bitcast_1z:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vshufpd {{.*#+}} xmm0 = xmm0[1],xmm1[0]			; AVX512VL-NEXT: vshufpd {{.*#+}} xmm0 = xmm0[1],xmm1[0]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle64 = shufflevector <2 x double> %a, <2 x double> zeroinitializer, <2 x i32> <i32 2, i32 1>			%shuffle64 = shufflevector <2 x double> %a, <2 x double> zeroinitializer, <2 x i32> <i32 2, i32 1>
	%bitcast32 = bitcast <2 x double> %shuffle64 to <4 x float>			%bitcast32 = bitcast <2 x double> %shuffle64 to <4 x float>
	%shuffle32 = shufflevector <4 x float> %bitcast32, <4 x float> undef, <4 x i32> <i32 2, i32 3, i32 0, i32 1>			%shuffle32 = shufflevector <4 x float> %bitcast32, <4 x float> undef, <4 x i32> <i32 2, i32 3, i32 0, i32 1>
	%bitcast64 = bitcast <4 x float> %shuffle32 to <2 x double>			%bitcast64 = bitcast <4 x float> %shuffle32 to <2 x double>
	ret <2 x double> %bitcast64			ret <2 x double> %bitcast64
	}			}
	[29 lines not shown]
	; AVX2-LABEL: shuffle_v2i64_bitcast_z123:			; AVX2-LABEL: shuffle_v2i64_bitcast_z123:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm1[0],xmm0[1,2,3]			; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm1[0],xmm0[1,2,3]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v2i64_bitcast_z123:			; AVX512VL-LABEL: shuffle_v2i64_bitcast_z123:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm1[0],xmm0[1,2,3]			; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm1[0],xmm0[1,2,3]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%bitcast32 = bitcast <2 x i64> %x to <4 x float>			%bitcast32 = bitcast <2 x i64> %x to <4 x float>
	%shuffle32 = shufflevector <4 x float> %bitcast32, <4 x float> <float 1.000000e+00, float undef, float undef, float undef>, <4 x i32> <i32 4, i32 1, i32 2, i32 3>			%shuffle32 = shufflevector <4 x float> %bitcast32, <4 x float> <float 1.000000e+00, float undef, float undef, float undef>, <4 x i32> <i32 4, i32 1, i32 2, i32 3>
	%bitcast64 = bitcast <4 x float> %shuffle32 to <2 x i64>			%bitcast64 = bitcast <4 x float> %shuffle32 to <2 x i64>
	%and = and <2 x i64> %bitcast64, <i64 -4294967296, i64 -1>			%and = and <2 x i64> %bitcast64, <i64 -4294967296, i64 -1>
	ret <2 x i64> %and			ret <2 x i64> %and
	}			}
	[419 lines not shown]

llvm/trunk/test/CodeGen/X86/vector-shuffle-128-v4.ll

	[1,358 lines not shown]
	; AVX2-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,2,3,3]			; AVX2-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,2,3,3]
	; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm1[0],xmm0[1],xmm1[2,3]			; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm1[0],xmm0[1],xmm1[2,3]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v4i32_z6zz:			; AVX512VL-LABEL: shuffle_v4i32_z6zz:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,2,3,3]			; AVX512VL-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,2,3,3]
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm1[0],xmm0[1],xmm1[2,3]			; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm1[0],xmm0[1],xmm1[2,3]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <4 x i32> zeroinitializer, <4 x i32> %a, <4 x i32> <i32 0, i32 6, i32 2, i32 3>			%shuffle = shufflevector <4 x i32> zeroinitializer, <4 x i32> %a, <4 x i32> <i32 0, i32 6, i32 2, i32 3>
	ret <4 x i32> %shuffle			ret <4 x i32> %shuffle
	}			}

	define <4 x i32> @shuffle_v4i32_7012(<4 x i32> %a, <4 x i32> %b) {			define <4 x i32> @shuffle_v4i32_7012(<4 x i32> %a, <4 x i32> %b) {
	; SSE2-LABEL: shuffle_v4i32_7012:			; SSE2-LABEL: shuffle_v4i32_7012:
	[310 lines not shown]
	; AVX2-LABEL: shuffle_v4i32_0z23:			; AVX2-LABEL: shuffle_v4i32_0z23:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3]			; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v4i32_0z23:			; AVX512VL-LABEL: shuffle_v4i32_0z23:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3]			; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <4 x i32> %a, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 4, i32 2, i32 3>			%shuffle = shufflevector <4 x i32> %a, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 4, i32 2, i32 3>
	ret <4 x i32> %shuffle			ret <4 x i32> %shuffle
	}			}

	define <4 x i32> @shuffle_v4i32_01z3(<4 x i32> %a) {			define <4 x i32> @shuffle_v4i32_01z3(<4 x i32> %a) {
	; SSE2-LABEL: shuffle_v4i32_01z3:			; SSE2-LABEL: shuffle_v4i32_01z3:
	[26 lines not shown]
	; AVX2-LABEL: shuffle_v4i32_01z3:			; AVX2-LABEL: shuffle_v4i32_01z3:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0,1],xmm1[2],xmm0[3]			; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0,1],xmm1[2],xmm0[3]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v4i32_01z3:			; AVX512VL-LABEL: shuffle_v4i32_01z3:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0,1],xmm1[2],xmm0[3]			; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0,1],xmm1[2],xmm0[3]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <4 x i32> %a, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 3>			%shuffle = shufflevector <4 x i32> %a, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 4, i32 3>
	ret <4 x i32> %shuffle			ret <4 x i32> %shuffle
	}			}

	define <4 x i32> @shuffle_v4i32_012z(<4 x i32> %a) {			define <4 x i32> @shuffle_v4i32_012z(<4 x i32> %a) {
	; SSE2-LABEL: shuffle_v4i32_012z:			; SSE2-LABEL: shuffle_v4i32_012z:
	[26 lines not shown]
	; AVX2-LABEL: shuffle_v4i32_012z:			; AVX2-LABEL: shuffle_v4i32_012z:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3]			; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v4i32_012z:			; AVX512VL-LABEL: shuffle_v4i32_012z:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3]			; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <4 x i32> %a, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 7>			%shuffle = shufflevector <4 x i32> %a, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 7>
	ret <4 x i32> %shuffle			ret <4 x i32> %shuffle
	}			}

	define <4 x i32> @shuffle_v4i32_0zz3(<4 x i32> %a) {			define <4 x i32> @shuffle_v4i32_0zz3(<4 x i32> %a) {
	; SSE2-LABEL: shuffle_v4i32_0zz3:			; SSE2-LABEL: shuffle_v4i32_0zz3:
	[26 lines not shown]
	; AVX2-LABEL: shuffle_v4i32_0zz3:			; AVX2-LABEL: shuffle_v4i32_0zz3:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1,2],xmm0[3]			; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1,2],xmm0[3]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v4i32_0zz3:			; AVX512VL-LABEL: shuffle_v4i32_0zz3:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1,2],xmm0[3]			; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1,2],xmm0[3]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <4 x i32> %a, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 4, i32 4, i32 3>			%shuffle = shufflevector <4 x i32> %a, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 4, i32 4, i32 3>
	ret <4 x i32> %shuffle			ret <4 x i32> %shuffle
	}			}

	define <4 x i32> @shuffle_v4i32_bitcast_0415(<4 x i32> %a, <4 x i32> %b) {			define <4 x i32> @shuffle_v4i32_bitcast_0415(<4 x i32> %a, <4 x i32> %b) {
	; SSE-LABEL: shuffle_v4i32_bitcast_0415:			; SSE-LABEL: shuffle_v4i32_bitcast_0415:
	[584 lines not shown]

llvm/trunk/test/CodeGen/X86/vector-shuffle-128-v8.ll

	[1,417 lines not shown]
	; AVX1OR2-LABEL: shuffle_v8i16_z8zzzzzz:			; AVX1OR2-LABEL: shuffle_v8i16_z8zzzzzz:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0			; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX1OR2-NEXT: vpinsrw $1, %edi, %xmm0, %xmm0			; AVX1OR2-NEXT: vpinsrw $1, %edi, %xmm0, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i16_z8zzzzzz:			; AVX512VL-LABEL: shuffle_v8i16_z8zzzzzz:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm0, %xmm0, %xmm0			; AVX512VL-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX512VL-NEXT: vpinsrw $1, %edi, %xmm0, %xmm0			; AVX512VL-NEXT: vpinsrw $1, %edi, %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%a = insertelement <8 x i16> undef, i16 %i, i32 0			%a = insertelement <8 x i16> undef, i16 %i, i32 0
	%shuffle = shufflevector <8 x i16> zeroinitializer, <8 x i16> %a, <8 x i32> <i32 2, i32 8, i32 3, i32 7, i32 6, i32 5, i32 4, i32 3>			%shuffle = shufflevector <8 x i16> zeroinitializer, <8 x i16> %a, <8 x i32> <i32 2, i32 8, i32 3, i32 7, i32 6, i32 5, i32 4, i32 3>
	ret <8 x i16> %shuffle			ret <8 x i16> %shuffle
	}			}

	define <8 x i16> @shuffle_v8i16_zzzzz8zz(i16 %i) {			define <8 x i16> @shuffle_v8i16_zzzzz8zz(i16 %i) {
	; SSE-LABEL: shuffle_v8i16_zzzzz8zz:			; SSE-LABEL: shuffle_v8i16_zzzzz8zz:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: pxor %xmm0, %xmm0			; SSE-NEXT: pxor %xmm0, %xmm0
	; SSE-NEXT: pinsrw $5, %edi, %xmm0			; SSE-NEXT: pinsrw $5, %edi, %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX1OR2-LABEL: shuffle_v8i16_zzzzz8zz:			; AVX1OR2-LABEL: shuffle_v8i16_zzzzz8zz:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0			; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX1OR2-NEXT: vpinsrw $5, %edi, %xmm0, %xmm0			; AVX1OR2-NEXT: vpinsrw $5, %edi, %xmm0, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i16_zzzzz8zz:			; AVX512VL-LABEL: shuffle_v8i16_zzzzz8zz:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm0, %xmm0, %xmm0			; AVX512VL-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX512VL-NEXT: vpinsrw $5, %edi, %xmm0, %xmm0			; AVX512VL-NEXT: vpinsrw $5, %edi, %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%a = insertelement <8 x i16> undef, i16 %i, i32 0			%a = insertelement <8 x i16> undef, i16 %i, i32 0
	%shuffle = shufflevector <8 x i16> zeroinitializer, <8 x i16> %a, <8 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 8, i32 0, i32 0>			%shuffle = shufflevector <8 x i16> zeroinitializer, <8 x i16> %a, <8 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 8, i32 0, i32 0>
	ret <8 x i16> %shuffle			ret <8 x i16> %shuffle
	}			}

	define <8 x i16> @shuffle_v8i16_zuuzuuz8(i16 %i) {			define <8 x i16> @shuffle_v8i16_zuuzuuz8(i16 %i) {
	; SSE-LABEL: shuffle_v8i16_zuuzuuz8:			; SSE-LABEL: shuffle_v8i16_zuuzuuz8:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: pxor %xmm0, %xmm0			; SSE-NEXT: pxor %xmm0, %xmm0
	; SSE-NEXT: pinsrw $7, %edi, %xmm0			; SSE-NEXT: pinsrw $7, %edi, %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX1OR2-LABEL: shuffle_v8i16_zuuzuuz8:			; AVX1OR2-LABEL: shuffle_v8i16_zuuzuuz8:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0			; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX1OR2-NEXT: vpinsrw $7, %edi, %xmm0, %xmm0			; AVX1OR2-NEXT: vpinsrw $7, %edi, %xmm0, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i16_zuuzuuz8:			; AVX512VL-LABEL: shuffle_v8i16_zuuzuuz8:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm0, %xmm0, %xmm0			; AVX512VL-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX512VL-NEXT: vpinsrw $7, %edi, %xmm0, %xmm0			; AVX512VL-NEXT: vpinsrw $7, %edi, %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%a = insertelement <8 x i16> undef, i16 %i, i32 0			%a = insertelement <8 x i16> undef, i16 %i, i32 0
	%shuffle = shufflevector <8 x i16> zeroinitializer, <8 x i16> %a, <8 x i32> <i32 0, i32 undef, i32 undef, i32 3, i32 undef, i32 undef, i32 6, i32 8>			%shuffle = shufflevector <8 x i16> zeroinitializer, <8 x i16> %a, <8 x i32> <i32 0, i32 undef, i32 undef, i32 3, i32 undef, i32 undef, i32 6, i32 8>
	ret <8 x i16> %shuffle			ret <8 x i16> %shuffle
	}			}

	define <8 x i16> @shuffle_v8i16_zzBzzzzz(i16 %i) {			define <8 x i16> @shuffle_v8i16_zzBzzzzz(i16 %i) {
	; SSE-LABEL: shuffle_v8i16_zzBzzzzz:			; SSE-LABEL: shuffle_v8i16_zzBzzzzz:
	; SSE: # BB#0:			; SSE: # BB#0:
	; SSE-NEXT: pxor %xmm0, %xmm0			; SSE-NEXT: pxor %xmm0, %xmm0
	; SSE-NEXT: pinsrw $2, %edi, %xmm0			; SSE-NEXT: pinsrw $2, %edi, %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX1OR2-LABEL: shuffle_v8i16_zzBzzzzz:			; AVX1OR2-LABEL: shuffle_v8i16_zzBzzzzz:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0			; AVX1OR2-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX1OR2-NEXT: vpinsrw $2, %edi, %xmm0, %xmm0			; AVX1OR2-NEXT: vpinsrw $2, %edi, %xmm0, %xmm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i16_zzBzzzzz:			; AVX512VL-LABEL: shuffle_v8i16_zzBzzzzz:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm0, %xmm0, %xmm0			; AVX512VL-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX512VL-NEXT: vpinsrw $2, %edi, %xmm0, %xmm0			; AVX512VL-NEXT: vpinsrw $2, %edi, %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%a = insertelement <8 x i16> undef, i16 %i, i32 3			%a = insertelement <8 x i16> undef, i16 %i, i32 3
	%shuffle = shufflevector <8 x i16> zeroinitializer, <8 x i16> %a, <8 x i32> <i32 0, i32 1, i32 11, i32 3, i32 4, i32 5, i32 6, i32 7>			%shuffle = shufflevector <8 x i16> zeroinitializer, <8 x i16> %a, <8 x i32> <i32 0, i32 1, i32 11, i32 3, i32 4, i32 5, i32 6, i32 7>
	ret <8 x i16> %shuffle			ret <8 x i16> %shuffle
	}			}

	define <8 x i16> @shuffle_v8i16_def01234(<8 x i16> %a, <8 x i16> %b) {			define <8 x i16> @shuffle_v8i16_def01234(<8 x i16> %a, <8 x i16> %b) {
	[601 lines not shown]
	; AVX1OR2-LABEL: shuffle_v8i16_0z234567:			; AVX1OR2-LABEL: shuffle_v8i16_0z234567:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX1OR2-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX1OR2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3,4,5,6,7]			; AVX1OR2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3,4,5,6,7]
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i16_0z234567:			; AVX512VL-LABEL: shuffle_v8i16_0z234567:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3,4,5,6,7]			; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3,4,5,6,7]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i16> %a, <8 x i16> zeroinitializer, <8 x i32> <i32 0, i32 9, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>			%shuffle = shufflevector <8 x i16> %a, <8 x i16> zeroinitializer, <8 x i32> <i32 0, i32 9, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
	ret <8 x i16> %shuffle			ret <8 x i16> %shuffle
	}			}

	define <8 x i16> @shuffle_v8i16_0zzzz5z7(<8 x i16> %a) {			define <8 x i16> @shuffle_v8i16_0zzzz5z7(<8 x i16> %a) {
	; SSE2-LABEL: shuffle_v8i16_0zzzz5z7:			; SSE2-LABEL: shuffle_v8i16_0zzzz5z7:
	[15 lines not shown]
	; AVX1OR2-LABEL: shuffle_v8i16_0zzzz5z7:			; AVX1OR2-LABEL: shuffle_v8i16_0zzzz5z7:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX1OR2-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX1OR2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1,2,3,4],xmm0[5],xmm1[6],xmm0[7]			; AVX1OR2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1,2,3,4],xmm0[5],xmm1[6],xmm0[7]
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i16_0zzzz5z7:			; AVX512VL-LABEL: shuffle_v8i16_0zzzz5z7:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1,2,3,4],xmm0[5],xmm1[6],xmm0[7]			; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1,2,3,4],xmm0[5],xmm1[6],xmm0[7]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i16> %a, <8 x i16> zeroinitializer, <8 x i32> <i32 0, i32 8, i32 8, i32 8, i32 8, i32 5, i32 8, i32 7>			%shuffle = shufflevector <8 x i16> %a, <8 x i16> zeroinitializer, <8 x i32> <i32 0, i32 8, i32 8, i32 8, i32 8, i32 5, i32 8, i32 7>
	ret <8 x i16> %shuffle			ret <8 x i16> %shuffle
	}			}

	define <8 x i16> @shuffle_v8i16_0123456z(<8 x i16> %a) {			define <8 x i16> @shuffle_v8i16_0123456z(<8 x i16> %a) {
	; SSE2-LABEL: shuffle_v8i16_0123456z:			; SSE2-LABEL: shuffle_v8i16_0123456z:
	[15 lines not shown]
	; AVX1OR2-LABEL: shuffle_v8i16_0123456z:			; AVX1OR2-LABEL: shuffle_v8i16_0123456z:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX1OR2-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX1OR2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3,4,5,6],xmm1[7]			; AVX1OR2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3,4,5,6],xmm1[7]
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i16_0123456z:			; AVX512VL-LABEL: shuffle_v8i16_0123456z:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3,4,5,6],xmm1[7]			; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3,4,5,6],xmm1[7]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i16> %a, <8 x i16> zeroinitializer, <8 x i32> <i32 0, i32 9, i32 2, i32 3, i32 4, i32 5, i32 6, i32 15>			%shuffle = shufflevector <8 x i16> %a, <8 x i16> zeroinitializer, <8 x i32> <i32 0, i32 9, i32 2, i32 3, i32 4, i32 5, i32 6, i32 15>
	ret <8 x i16> %shuffle			ret <8 x i16> %shuffle
	}			}

	define <8 x i16> @shuffle_v8i16_fu3ucc5u(<8 x i16> %a, <8 x i16> %b) {			define <8 x i16> @shuffle_v8i16_fu3ucc5u(<8 x i16> %a, <8 x i16> %b) {
	; SSE-LABEL: shuffle_v8i16_fu3ucc5u:			; SSE-LABEL: shuffle_v8i16_fu3ucc5u:
	[314 lines not shown]

llvm/trunk/test/CodeGen/X86/vector-shuffle-256-v16.ll

	[164 lines not shown]
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]
	; AVX2-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,0,1,1,4,4,5,5]			; AVX2-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,0,1,1,4,4,5,5]
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
	; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_00_00_00_00_00_00_08_00_00_00_00_00_00_00_00:			; AVX512VL-LABEL: shuffle_v16i16_00_00_00_00_00_00_00_08_00_00_00_00_00_00_00_00:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm1 = [0,0,0,0,0,0,0,8,0,0,0,0,0,0,0,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm1 = [0,0,0,0,0,0,0,8,0,0,0,0,0,0,0,0]
	; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 8, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 8, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_00_00_00_00_00_09_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_00_00_00_00_00_09_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_00_00_00_00_00_09_00_00_00_00_00_00_00_00_00:			; AVX1-LABEL: shuffle_v16i16_00_00_00_00_00_00_09_00_00_00_00_00_00_00_00_00:
	[11 lines not shown]
	; AVX2-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]			; AVX2-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = <0,0,255,255,u,u,u,u,u,u,u,u,u,u,u,u,255,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u>			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = <0,0,255,255,u,u,u,u,u,u,u,u,u,u,u,u,255,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u>
	; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,0,1,0,1,0,1,0,1,0,1,2,3,0,1,16,17,16,17,16,17,16,17,16,17,16,17,16,17,16,17]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,0,1,0,1,0,1,0,1,0,1,2,3,0,1,16,17,16,17,16,17,16,17,16,17,16,17,16,17,16,17]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_00_00_00_00_00_09_00_00_00_00_00_00_00_00_00:			; AVX512VL-LABEL: shuffle_v16i16_00_00_00_00_00_00_09_00_00_00_00_00_00_00_00_00:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm1 = [0,0,0,0,0,0,9,0,0,0,0,0,0,0,0,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm1 = [0,0,0,0,0,0,9,0,0,0,0,0,0,0,0,0]
	; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 9, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 9, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_00_00_00_00_10_00_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_00_00_00_00_10_00_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_00_00_00_00_10_00_00_00_00_00_00_00_00_00_00:			; AVX1-LABEL: shuffle_v16i16_00_00_00_00_00_10_00_00_00_00_00_00_00_00_00_00:
	[10 lines not shown]
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]			; AVX2-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,0,1,0,1,0,1,0,1,4,5,0,1,0,1,16,17,16,17,16,17,16,17,16,17,16,17,16,17,16,17]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,0,1,0,1,0,1,0,1,4,5,0,1,0,1,16,17,16,17,16,17,16,17,16,17,16,17,16,17,16,17]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_00_00_00_00_10_00_00_00_00_00_00_00_00_00_00:			; AVX512VL-LABEL: shuffle_v16i16_00_00_00_00_00_10_00_00_00_00_00_00_00_00_00_00:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm1 = [0,0,0,0,0,10,0,0,0,0,0,0,0,0,0,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm1 = [0,0,0,0,0,10,0,0,0,0,0,0,0,0,0,0]
	; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 10, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 10, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_00_00_00_11_00_00_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_00_00_00_11_00_00_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_00_00_00_11_00_00_00_00_00_00_00_00_00_00_00:			; AVX1-LABEL: shuffle_v16i16_00_00_00_00_11_00_00_00_00_00_00_00_00_00_00_00:
	[10 lines not shown]
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]			; AVX2-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,0,1,0,1,0,1,6,7,0,1,0,1,0,1,16,17,16,17,16,17,16,17,16,17,16,17,16,17,16,17]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,0,1,0,1,0,1,6,7,0,1,0,1,0,1,16,17,16,17,16,17,16,17,16,17,16,17,16,17,16,17]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_00_00_00_11_00_00_00_00_00_00_00_00_00_00_00:			; AVX512VL-LABEL: shuffle_v16i16_00_00_00_00_11_00_00_00_00_00_00_00_00_00_00_00:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm1 = [0,0,0,0,11,0,0,0,0,0,0,0,0,0,0,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm1 = [0,0,0,0,11,0,0,0,0,0,0,0,0,0,0,0]
	; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 11, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 11, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_00_00_12_00_00_00_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_00_00_12_00_00_00_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_00_00_12_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX1-LABEL: shuffle_v16i16_00_00_00_12_00_00_00_00_00_00_00_00_00_00_00_00:
	[9 lines not shown]
	; AVX2-LABEL: shuffle_v16i16_00_00_00_12_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX2-LABEL: shuffle_v16i16_00_00_00_12_00_00_00_00_00_00_00_00_00_00_00_00:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,3,0,1]			; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,3,0,1]
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,0,1,0,1,8,9,0,1,0,1,0,1,0,1,16,17,16,17,16,17,16,17,16,17,16,17,16,17,16,17]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,0,1,0,1,8,9,0,1,0,1,0,1,0,1,16,17,16,17,16,17,16,17,16,17,16,17,16,17,16,17]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_00_00_12_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX512VL-LABEL: shuffle_v16i16_00_00_00_12_00_00_00_00_00_00_00_00_00_00_00_00:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm1 = [0,0,0,12,0,0,0,0,0,0,0,0,0,0,0,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm1 = [0,0,0,12,0,0,0,0,0,0,0,0,0,0,0,0]
	; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 12, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 12, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_00_13_00_00_00_00_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_00_13_00_00_00_00_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_00_13_00_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX1-LABEL: shuffle_v16i16_00_00_13_00_00_00_00_00_00_00_00_00_00_00_00_00:
	[9 lines not shown]
	; AVX2-LABEL: shuffle_v16i16_00_00_13_00_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX2-LABEL: shuffle_v16i16_00_00_13_00_00_00_00_00_00_00_00_00_00_00_00_00:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,3,0,1]			; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,3,0,1]
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,0,1,10,11,0,1,0,1,0,1,0,1,0,1,16,17,16,17,16,17,16,17,16,17,16,17,16,17,16,17]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,0,1,10,11,0,1,0,1,0,1,0,1,0,1,16,17,16,17,16,17,16,17,16,17,16,17,16,17,16,17]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_00_13_00_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX512VL-LABEL: shuffle_v16i16_00_00_13_00_00_00_00_00_00_00_00_00_00_00_00_00:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm1 = [0,0,13,0,0,0,0,0,0,0,0,0,0,0,0,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm1 = [0,0,13,0,0,0,0,0,0,0,0,0,0,0,0,0]
	; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 13, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 13, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_14_00_00_00_00_00_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_14_00_00_00_00_00_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_14_00_00_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX1-LABEL: shuffle_v16i16_00_14_00_00_00_00_00_00_00_00_00_00_00_00_00_00:
	[9 lines not shown]
	; AVX2-LABEL: shuffle_v16i16_00_14_00_00_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX2-LABEL: shuffle_v16i16_00_14_00_00_00_00_00_00_00_00_00_00_00_00_00_00:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,3,0,1]			; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,3,0,1]
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,12,13,0,1,0,1,0,1,0,1,0,1,0,1,16,17,16,17,16,17,16,17,16,17,16,17,16,17,16,17]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,12,13,0,1,0,1,0,1,0,1,0,1,0,1,16,17,16,17,16,17,16,17,16,17,16,17,16,17,16,17]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_14_00_00_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX512VL-LABEL: shuffle_v16i16_00_14_00_00_00_00_00_00_00_00_00_00_00_00_00_00:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm1 = [0,14,0,0,0,0,0,0,0,0,0,0,0,0,0,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm1 = [0,14,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
	; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermw %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 14, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 14, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_15_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_15_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_15_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX1-LABEL: shuffle_v16i16_15_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00:
	[370 lines not shown]
	; AVX2-LABEL: shuffle_v16i16_00_01_02_03_04_05_06_07_08_09_10_11_12_13_14_31:			; AVX2-LABEL: shuffle_v16i16_00_01_02_03_04_05_06_07_08_09_10_11_12_13_14_31:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,0,0]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,0,0]
	; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_01_02_03_04_05_06_07_08_09_10_11_12_13_14_31:			; AVX512VL-LABEL: shuffle_v16i16_00_01_02_03_04_05_06_07_08_09_10_11_12_13_14_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = [255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,0,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,0,0]
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 31>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 31>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_16_01_02_03_04_05_06_07_08_09_10_11_12_13_14_15(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_16_01_02_03_04_05_06_07_08_09_10_11_12_13_14_15(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_16_01_02_03_04_05_06_07_08_09_10_11_12_13_14_15:			; AVX1-LABEL: shuffle_v16i16_16_01_02_03_04_05_06_07_08_09_10_11_12_13_14_15:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vmovaps {{.*#+}} ymm2 = [0,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535]			; AVX1-NEXT: vmovaps {{.*#+}} ymm2 = [0,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535,65535]
	; AVX1-NEXT: vandnps %ymm1, %ymm2, %ymm1			; AVX1-NEXT: vandnps %ymm1, %ymm2, %ymm1
	; AVX1-NEXT: vandps %ymm2, %ymm0, %ymm0			; AVX1-NEXT: vandps %ymm2, %ymm0, %ymm0
	; AVX1-NEXT: vorps %ymm1, %ymm0, %ymm0			; AVX1-NEXT: vorps %ymm1, %ymm0, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v16i16_16_01_02_03_04_05_06_07_08_09_10_11_12_13_14_15:			; AVX2-LABEL: shuffle_v16i16_16_01_02_03_04_05_06_07_08_09_10_11_12_13_14_15:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [0,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [0,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
	; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_16_01_02_03_04_05_06_07_08_09_10_11_12_13_14_15:			; AVX512VL-LABEL: shuffle_v16i16_16_01_02_03_04_05_06_07_08_09_10_11_12_13_14_15:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = [0,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 16, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 16, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_17_02_19_04_21_06_23_24_09_26_11_28_13_30_15(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_17_02_19_04_21_06_23_24_09_26_11_28_13_30_15(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_17_02_19_04_21_06_23_24_09_26_11_28_13_30_15:			; AVX1-LABEL: shuffle_v16i16_00_17_02_19_04_21_06_23_24_09_26_11_28_13_30_15:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vmovaps {{.*#+}} ymm2 = [65535,0,65535,0,65535,0,65535,0,0,65535,0,65535,0,65535,0,65535]			; AVX1-NEXT: vmovaps {{.*#+}} ymm2 = [65535,0,65535,0,65535,0,65535,0,0,65535,0,65535,0,65535,0,65535]
	; AVX1-NEXT: vandnps %ymm1, %ymm2, %ymm1			; AVX1-NEXT: vandnps %ymm1, %ymm2, %ymm1
	; AVX1-NEXT: vandps %ymm2, %ymm0, %ymm0			; AVX1-NEXT: vandps %ymm2, %ymm0, %ymm0
	; AVX1-NEXT: vorps %ymm1, %ymm0, %ymm0			; AVX1-NEXT: vorps %ymm1, %ymm0, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v16i16_00_17_02_19_04_21_06_23_24_09_26_11_28_13_30_15:			; AVX2-LABEL: shuffle_v16i16_00_17_02_19_04_21_06_23_24_09_26_11_28_13_30_15:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,255,0,0,255,255,0,0,255,255,0,0,255,255,0,0,0,0,255,255,0,0,255,255,0,0,255,255,0,0,255,255]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,255,0,0,255,255,0,0,255,255,0,0,255,255,0,0,0,0,255,255,0,0,255,255,0,0,255,255,0,0,255,255]
	; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_17_02_19_04_21_06_23_24_09_26_11_28_13_30_15:			; AVX512VL-LABEL: shuffle_v16i16_00_17_02_19_04_21_06_23_24_09_26_11_28_13_30_15:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = [255,255,0,0,255,255,0,0,255,255,0,0,255,255,0,0,0,0,255,255,0,0,255,255,0,0,255,255,0,0,255,255]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [255,255,0,0,255,255,0,0,255,255,0,0,255,255,0,0,0,0,255,255,0,0,255,255,0,0,255,255,0,0,255,255]
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 17, i32 2, i32 19, i32 4, i32 21, i32 6, i32 23, i32 24, i32 9, i32 26, i32 11, i32 28, i32 13, i32 30, i32 15>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 17, i32 2, i32 19, i32 4, i32 21, i32 6, i32 23, i32 24, i32 9, i32 26, i32 11, i32 28, i32 13, i32 30, i32 15>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_16_01_18_03_20_05_22_07_08_25_10_27_12_29_14_31(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_16_01_18_03_20_05_22_07_08_25_10_27_12_29_14_31(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_16_01_18_03_20_05_22_07_08_25_10_27_12_29_14_31:			; AVX1-LABEL: shuffle_v16i16_16_01_18_03_20_05_22_07_08_25_10_27_12_29_14_31:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vmovaps {{.*#+}} ymm2 = [0,65535,0,65535,0,65535,0,65535,65535,0,65535,0,65535,0,65535,0]			; AVX1-NEXT: vmovaps {{.*#+}} ymm2 = [0,65535,0,65535,0,65535,0,65535,65535,0,65535,0,65535,0,65535,0]
	; AVX1-NEXT: vandnps %ymm1, %ymm2, %ymm1			; AVX1-NEXT: vandnps %ymm1, %ymm2, %ymm1
	; AVX1-NEXT: vandps %ymm2, %ymm0, %ymm0			; AVX1-NEXT: vandps %ymm2, %ymm0, %ymm0
	; AVX1-NEXT: vorps %ymm1, %ymm0, %ymm0			; AVX1-NEXT: vorps %ymm1, %ymm0, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v16i16_16_01_18_03_20_05_22_07_08_25_10_27_12_29_14_31:			; AVX2-LABEL: shuffle_v16i16_16_01_18_03_20_05_22_07_08_25_10_27_12_29_14_31:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [0,0,255,255,0,0,255,255,0,0,255,255,0,0,255,255,255,255,0,0,255,255,0,0,255,255,0,0,255,255,0,0]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [0,0,255,255,0,0,255,255,0,0,255,255,0,0,255,255,255,255,0,0,255,255,0,0,255,255,0,0,255,255,0,0]
	; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_16_01_18_03_20_05_22_07_08_25_10_27_12_29_14_31:			; AVX512VL-LABEL: shuffle_v16i16_16_01_18_03_20_05_22_07_08_25_10_27_12_29_14_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = [0,0,255,255,0,0,255,255,0,0,255,255,0,0,255,255,255,255,0,0,255,255,0,0,255,255,0,0,255,255,0,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,0,255,255,0,0,255,255,0,0,255,255,0,0,255,255,255,255,0,0,255,255,0,0,255,255,0,0,255,255,0,0]
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 16, i32 1, i32 18, i32 3, i32 20, i32 5, i32 22, i32 7, i32 8, i32 25, i32 10, i32 27, i32 12, i32 29, i32 14, i32 31>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 16, i32 1, i32 18, i32 3, i32 20, i32 5, i32 22, i32 7, i32 8, i32 25, i32 10, i32 27, i32 12, i32 29, i32 14, i32 31>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_01_18_19_20_21_06_07_08_09_26_27_12_13_30_31(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_01_18_19_20_21_06_07_08_09_26_27_12_13_30_31(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_01_18_19_20_21_06_07_08_09_26_27_12_13_30_31:			; AVX1-LABEL: shuffle_v16i16_00_01_18_19_20_21_06_07_08_09_26_27_12_13_30_31:
	; AVX2-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,0,0,0,4,4,4,4]			; AVX2-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,0,0,0,4,4,4,4]
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm1 = ymm1[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm1 = ymm1[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]
	; AVX2-NEXT: vpshufd {{.*#+}} ymm1 = ymm1[0,0,1,1,4,4,5,5]			; AVX2-NEXT: vpshufd {{.*#+}} ymm1 = ymm1[0,0,1,1,4,4,5,5]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_16_00_16_00_16_00_16_08_24_08_24_08_24_08_24:			; AVX512VL-LABEL: shuffle_v16i16_00_16_00_16_00_16_00_16_08_24_08_24_08_24_08_24:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,16,0,16,0,16,0,16,8,24,8,24,8,24,8,24]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,16,0,16,0,16,0,16,8,24,8,24,8,24,8,24]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 16, i32 0, i32 16, i32 0, i32 16, i32 0, i32 16, i32 8, i32 24, i32 8, i32 24, i32 8, i32 24, i32 8, i32 24>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 16, i32 0, i32 16, i32 0, i32 16, i32 0, i32 16, i32 8, i32 24, i32 8, i32 24, i32 8, i32 24, i32 8, i32 24>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_16_16_16_16_04_05_06_07_24_24_24_24_12_13_14_15(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_16_16_16_16_04_05_06_07_24_24_24_24_12_13_14_15(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_16_16_16_16_04_05_06_07_24_24_24_24_12_13_14_15:			; AVX1-LABEL: shuffle_v16i16_16_16_16_16_04_05_06_07_24_24_24_24_12_13_14_15:
	; AVX2-LABEL: shuffle_v16i16_16_16_16_16_04_05_06_07_24_24_24_24_12_13_14_15:			; AVX2-LABEL: shuffle_v16i16_16_16_16_16_04_05_06_07_24_24_24_24_12_13_14_15:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm1 = ymm1[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm1 = ymm1[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm1[0,1],ymm0[2,3],ymm1[4,5],ymm0[6,7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm1[0,1],ymm0[2,3],ymm1[4,5],ymm0[6,7]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_16_16_16_16_04_05_06_07_24_24_24_24_12_13_14_15:			; AVX512VL-LABEL: shuffle_v16i16_16_16_16_16_04_05_06_07_24_24_24_24_12_13_14_15:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,0,0,0,20,21,22,23,8,8,8,8,28,29,30,31]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,0,0,0,20,21,22,23,8,8,8,8,28,29,30,31]
	; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 16, i32 16, i32 16, i32 16, i32 4, i32 5, i32 6, i32 7, i32 24, i32 24, i32 24, i32 24, i32 12, i32 13, i32 14, i32 15>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 16, i32 16, i32 16, i32 16, i32 4, i32 5, i32 6, i32 7, i32 24, i32 24, i32 24, i32 24, i32 12, i32 13, i32 14, i32 15>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_19_18_17_16_07_06_05_04_27_26_25_24_15_14_13_12(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_19_18_17_16_07_06_05_04_27_26_25_24_15_14_13_12(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_19_18_17_16_07_06_05_04_27_26_25_24_15_14_13_12:			; AVX1-LABEL: shuffle_v16i16_19_18_17_16_07_06_05_04_27_26_25_24_15_14_13_12:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm1[0,1],ymm0[2,3],ymm1[4,5],ymm0[6,7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm1[0,1],ymm0[2,3],ymm1[4,5],ymm0[6,7]
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[3,2,1,0,4,5,6,7,11,10,9,8,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[3,2,1,0,4,5,6,7,11,10,9,8,12,13,14,15]
	; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,7,6,5,4,8,9,10,11,15,14,13,12]			; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,7,6,5,4,8,9,10,11,15,14,13,12]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_19_18_17_16_07_06_05_04_27_26_25_24_15_14_13_12:			; AVX512VL-LABEL: shuffle_v16i16_19_18_17_16_07_06_05_04_27_26_25_24_15_14_13_12:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [3,2,1,0,23,22,21,20,11,10,9,8,31,30,29,28]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [3,2,1,0,23,22,21,20,11,10,9,8,31,30,29,28]
	; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 19, i32 18, i32 17, i32 16, i32 7, i32 6, i32 5, i32 4, i32 27, i32 26, i32 25, i32 24, i32 15, i32 14, i32 13, i32 12>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 19, i32 18, i32 17, i32 16, i32 7, i32 6, i32 5, i32 4, i32 27, i32 26, i32 25, i32 24, i32 15, i32 14, i32 13, i32 12>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_19_18_17_16_03_02_01_00_27_26_25_24_11_10_09_08(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_19_18_17_16_03_02_01_00_27_26_25_24_11_10_09_08(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_19_18_17_16_03_02_01_00_27_26_25_24_11_10_09_08:			; AVX1-LABEL: shuffle_v16i16_19_18_17_16_03_02_01_00_27_26_25_24_11_10_09_08:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm1 = ymm1[3,2,1,0,4,5,6,7,11,10,9,8,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm1 = ymm1[3,2,1,0,4,5,6,7,11,10,9,8,12,13,14,15]
	; AVX2-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,1,0,1,4,5,4,5]			; AVX2-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,1,0,1,4,5,4,5]
	; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,7,6,5,4,8,9,10,11,15,14,13,12]			; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,7,6,5,4,8,9,10,11,15,14,13,12]
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm1[0,1],ymm0[2,3],ymm1[4,5],ymm0[6,7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm1[0,1],ymm0[2,3],ymm1[4,5],ymm0[6,7]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_19_18_17_16_03_02_01_00_27_26_25_24_11_10_09_08:			; AVX512VL-LABEL: shuffle_v16i16_19_18_17_16_03_02_01_00_27_26_25_24_11_10_09_08:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [3,2,1,0,19,18,17,16,11,10,9,8,27,26,25,24]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [3,2,1,0,19,18,17,16,11,10,9,8,27,26,25,24]
	; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 19, i32 18, i32 17, i32 16, i32 3, i32 2, i32 1, i32 0, i32 27, i32 26, i32 25, i32 24, i32 11, i32 10, i32 9, i32 8>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 19, i32 18, i32 17, i32 16, i32 3, i32 2, i32 1, i32 0, i32 27, i32 26, i32 25, i32 24, i32 11, i32 10, i32 9, i32 8>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_00_00_00_00_00_01_00_08_08_08_08_08_08_09_08(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_00_00_00_00_00_01_00_08_08_08_08_08_08_09_08(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_00_00_00_00_00_01_00_08_08_08_08_08_08_09_08:			; AVX1-LABEL: shuffle_v16i16_00_00_00_00_00_00_01_00_08_08_08_08_08_08_09_08:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[u,u,0,1,u,u,2,3,u,u,4,5,u,u,6,7,u,u,24,25,u,u,26,27,u,u,28,29,u,u,30,31]			; AVX2-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[u,u,0,1,u,u,2,3,u,u,4,5,u,u,6,7,u,u,24,25,u,u,26,27,u,u,28,29,u,u,30,31]
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,u,u,2,3,u,u,4,5,u,u,6,7,u,u,24,25,u,u,26,27,u,u,28,29,u,u,30,31,u,u]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,u,u,2,3,u,u,4,5,u,u,6,7,u,u,24,25,u,u,26,27,u,u,28,29,u,u,30,31,u,u]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_16_01_17_02_18_03_19_12_28_13_29_14_30_15_31:			; AVX512VL-LABEL: shuffle_v16i16_00_16_01_17_02_18_03_19_12_28_13_29_14_30_15_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,16,1,17,2,18,3,19,12,28,13,29,14,30,15,31]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,16,1,17,2,18,3,19,12,28,13,29,14,30,15,31]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 16, i32 1, i32 17, i32 2, i32 18, i32 3, i32 19, i32 12, i32 28, i32 13, i32 29, i32 14, i32 30, i32 15, i32 31>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 16, i32 1, i32 17, i32 2, i32 18, i32 3, i32 19, i32 12, i32 28, i32 13, i32 29, i32 14, i32 30, i32 15, i32 31>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_04_20_05_21_06_22_07_23_08_24_09_25_10_26_11_27(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_04_20_05_21_06_22_07_23_08_24_09_25_10_26_11_27(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_04_20_05_21_06_22_07_23_08_24_09_25_10_26_11_27:			; AVX1-LABEL: shuffle_v16i16_04_20_05_21_06_22_07_23_08_24_09_25_10_26_11_27:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[u,u,8,9,u,u,10,11,u,u,12,13,u,u,14,15,u,u,16,17,u,u,18,19,u,u,20,21,u,u,22,23]			; AVX2-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[u,u,8,9,u,u,10,11,u,u,12,13,u,u,14,15,u,u,16,17,u,u,18,19,u,u,20,21,u,u,22,23]
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[8,9,u,u,10,11,u,u,12,13,u,u,14,15,u,u,16,17,u,u,18,19,u,u,20,21,u,u,22,23,u,u]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[8,9,u,u,10,11,u,u,12,13,u,u,14,15,u,u,16,17,u,u,18,19,u,u,20,21,u,u,22,23,u,u]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_04_20_05_21_06_22_07_23_08_24_09_25_10_26_11_27:			; AVX512VL-LABEL: shuffle_v16i16_04_20_05_21_06_22_07_23_08_24_09_25_10_26_11_27:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [4,20,5,21,6,22,7,23,8,24,9,25,10,26,11,27]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [4,20,5,21,6,22,7,23,8,24,9,25,10,26,11,27]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 4, i32 20, i32 5, i32 21, i32 6, i32 22, i32 7, i32 23, i32 8, i32 24, i32 9, i32 25, i32 10, i32 26, i32 11, i32 27>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 4, i32 20, i32 5, i32 21, i32 6, i32 22, i32 7, i32 23, i32 8, i32 24, i32 9, i32 25, i32 10, i32 26, i32 11, i32 27>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_00_00_00_00_00_01_00_08_09_08_08_08_08_08_08(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_00_00_00_00_00_01_00_08_09_08_08_08_08_08_08(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_00_00_00_00_00_01_00_08_09_08_08_08_08_08_08:			; AVX1-LABEL: shuffle_v16i16_00_00_00_00_00_00_01_00_08_09_08_08_08_08_08_08:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]
	; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,4,4,4,4,8,9,10,11,12,12,12,12]			; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,4,4,4,4,8,9,10,11,12,12,12,12]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_00_00_00_04_04_04_04_16_16_16_16_20_20_20_20:			; AVX512VL-LABEL: shuffle_v16i16_00_00_00_00_04_04_04_04_16_16_16_16_20_20_20_20:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,0,0,0,4,4,4,4,16,16,16,16,20,20,20,20]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,0,0,0,4,4,4,4,16,16,16,16,20,20,20,20]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 4, i32 4, i32 4, i32 4, i32 16, i32 16, i32 16, i32 16, i32 20, i32 20, i32 20, i32 20>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 4, i32 4, i32 4, i32 4, i32 16, i32 16, i32 16, i32 16, i32 20, i32 20, i32 20, i32 20>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_08_08_08_08_12_12_12_12_16_16_16_16_20_20_20_20(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_08_08_08_08_12_12_12_12_16_16_16_16_20_20_20_20(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_08_08_08_08_12_12_12_12_16_16_16_16_20_20_20_20:			; AVX1-LABEL: shuffle_v16i16_08_08_08_08_12_12_12_12_16_16_16_16_20_20_20_20:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vperm2i128 {{.*#+}} ymm0 = ymm0[2,3],ymm1[0,1]			; AVX2-NEXT: vperm2i128 {{.*#+}} ymm0 = ymm0[2,3],ymm1[0,1]
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]
	; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,4,4,4,4,8,9,10,11,12,12,12,12]			; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,4,4,4,4,8,9,10,11,12,12,12,12]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_08_08_08_08_12_12_12_12_16_16_16_16_20_20_20_20:			; AVX512VL-LABEL: shuffle_v16i16_08_08_08_08_12_12_12_12_16_16_16_16_20_20_20_20:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [8,8,8,8,12,12,12,12,16,16,16,16,20,20,20,20]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [8,8,8,8,12,12,12,12,16,16,16,16,20,20,20,20]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 8, i32 8, i32 8, i32 8, i32 12, i32 12, i32 12, i32 12, i32 16, i32 16, i32 16, i32 16, i32 20, i32 20, i32 20, i32 20>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 8, i32 8, i32 8, i32 8, i32 12, i32 12, i32 12, i32 12, i32 16, i32 16, i32 16, i32 16, i32 20, i32 20, i32 20, i32 20>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_08_08_08_08_12_12_12_12_24_24_24_24_28_28_28_28(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_08_08_08_08_12_12_12_12_24_24_24_24_28_28_28_28(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_08_08_08_08_12_12_12_12_24_24_24_24_28_28_28_28:			; AVX1-LABEL: shuffle_v16i16_08_08_08_08_12_12_12_12_24_24_24_24_28_28_28_28:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vperm2i128 {{.*#+}} ymm0 = ymm0[2,3],ymm1[2,3]			; AVX2-NEXT: vperm2i128 {{.*#+}} ymm0 = ymm0[2,3],ymm1[2,3]
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]
	; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,4,4,4,4,8,9,10,11,12,12,12,12]			; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,4,4,4,4,8,9,10,11,12,12,12,12]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_08_08_08_08_12_12_12_12_24_24_24_24_28_28_28_28:			; AVX512VL-LABEL: shuffle_v16i16_08_08_08_08_12_12_12_12_24_24_24_24_28_28_28_28:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [8,8,8,8,12,12,12,12,24,24,24,24,28,28,28,28]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [8,8,8,8,12,12,12,12,24,24,24,24,28,28,28,28]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 8, i32 8, i32 8, i32 8, i32 12, i32 12, i32 12, i32 12, i32 24, i32 24, i32 24, i32 24, i32 28, i32 28, i32 28, i32 28>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 8, i32 8, i32 8, i32 8, i32 12, i32 12, i32 12, i32 12, i32 24, i32 24, i32 24, i32 24, i32 28, i32 28, i32 28, i32 28>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_00_00_00_04_04_04_04_24_24_24_24_28_28_28_28(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_00_00_00_04_04_04_04_24_24_24_24_28_28_28_28(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_00_00_00_04_04_04_04_24_24_24_24_28_28_28_28:			; AVX1-LABEL: shuffle_v16i16_00_00_00_00_04_04_04_04_24_24_24_24_28_28_28_28:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]
	; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,4,4,4,4,8,9,10,11,12,12,12,12]			; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,4,4,4,4,8,9,10,11,12,12,12,12]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_00_00_00_04_04_04_04_24_24_24_24_28_28_28_28:			; AVX512VL-LABEL: shuffle_v16i16_00_00_00_00_04_04_04_04_24_24_24_24_28_28_28_28:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,0,0,0,4,4,4,4,24,24,24,24,28,28,28,28]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,0,0,0,4,4,4,4,24,24,24,24,28,28,28,28]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 4, i32 4, i32 4, i32 4, i32 24, i32 24, i32 24, i32 24, i32 28, i32 28, i32 28, i32 28>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 0, i32 0, i32 0, i32 4, i32 4, i32 4, i32 4, i32 24, i32 24, i32 24, i32 24, i32 28, i32 28, i32 28, i32 28>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_16_01_17_02_18_03_19_04_20_05_21_06_22_07_23(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_16_01_17_02_18_03_19_04_20_05_21_06_22_07_23(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_16_01_17_02_18_03_19_04_20_05_21_06_22_07_23:			; AVX1-LABEL: shuffle_v16i16_00_16_01_17_02_18_03_19_04_20_05_21_06_22_07_23:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpunpckhwd {{.*#+}} xmm2 = xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]			; AVX1-NEXT: vpunpckhwd {{.*#+}} xmm2 = xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]
	; AVX1-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3]			; AVX1-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3]
	; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0			; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v16i16_00_16_01_17_02_18_03_19_04_20_05_21_06_22_07_23:			; AVX2-LABEL: shuffle_v16i16_00_16_01_17_02_18_03_19_04_20_05_21_06_22_07_23:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpunpckhwd {{.*#+}} xmm2 = xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]			; AVX2-NEXT: vpunpckhwd {{.*#+}} xmm2 = xmm0[4],xmm1[4],xmm0[5],xmm1[5],xmm0[6],xmm1[6],xmm0[7],xmm1[7]
	; AVX2-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3]			; AVX2-NEXT: vpunpcklwd {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3]
	; AVX2-NEXT: vinserti128 $1, %xmm2, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm2, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_16_01_17_02_18_03_19_04_20_05_21_06_22_07_23:			; AVX512VL-LABEL: shuffle_v16i16_00_16_01_17_02_18_03_19_04_20_05_21_06_22_07_23:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,16,1,17,2,18,3,19,4,20,5,21,6,22,7,23]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,16,1,17,2,18,3,19,4,20,5,21,6,22,7,23]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 16, i32 1, i32 17, i32 2, i32 18, i32 3, i32 19, i32 4, i32 20, i32 5, i32 21, i32 6, i32 22, i32 7, i32 23>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 16, i32 1, i32 17, i32 2, i32 18, i32 3, i32 19, i32 4, i32 20, i32 5, i32 21, i32 6, i32 22, i32 7, i32 23>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_zz_zz_zz_zz_zz_zz_zz_16_zz_zz_zz_zz_zz_zz_zz_24(<16 x i16> %a) {			define <16 x i16> @shuffle_v16i16_zz_zz_zz_zz_zz_zz_zz_16_zz_zz_zz_zz_zz_zz_zz_24(<16 x i16> %a) {
	; AVX1-LABEL: shuffle_v16i16_zz_zz_zz_zz_zz_zz_zz_16_zz_zz_zz_zz_zz_zz_zz_24:			; AVX1-LABEL: shuffle_v16i16_zz_zz_zz_zz_zz_zz_zz_16_zz_zz_zz_zz_zz_zz_zz_24:
	; AVX2-LABEL: shuffle_v16i16_01_02_03_04_05_06_07_00_17_18_19_20_21_22_23_16:			; AVX2-LABEL: shuffle_v16i16_01_02_03_04_05_06_07_00_17_18_19_20_21_22_23_16:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0
	; AVX2-NEXT: vpalignr {{.*#+}} ymm0 = ymm0[2,3,4,5,6,7,8,9,10,11,12,13,14,15,0,1,18,19,20,21,22,23,24,25,26,27,28,29,30,31,16,17]			; AVX2-NEXT: vpalignr {{.*#+}} ymm0 = ymm0[2,3,4,5,6,7,8,9,10,11,12,13,14,15,0,1,18,19,20,21,22,23,24,25,26,27,28,29,30,31,16,17]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_01_02_03_04_05_06_07_00_17_18_19_20_21_22_23_16:			; AVX512VL-LABEL: shuffle_v16i16_01_02_03_04_05_06_07_00_17_18_19_20_21_22_23_16:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [1,2,3,4,5,6,7,0,17,18,19,20,21,22,23,16]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [1,2,3,4,5,6,7,0,17,18,19,20,21,22,23,16]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 16>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 16>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_07_00_01_02_03_04_05_06_23_16_17_18_19_20_21_22(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_07_00_01_02_03_04_05_06_23_16_17_18_19_20_21_22(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_07_00_01_02_03_04_05_06_23_16_17_18_19_20_21_22:			; AVX1-LABEL: shuffle_v16i16_07_00_01_02_03_04_05_06_23_16_17_18_19_20_21_22:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpalignr {{.*#+}} xmm0 = xmm0[14,15,0,1,2,3,4,5,6,7,8,9,10,11,12,13]			; AVX1-NEXT: vpalignr {{.*#+}} xmm0 = xmm0[14,15,0,1,2,3,4,5,6,7,8,9,10,11,12,13]
	; AVX1-NEXT: vpalignr {{.*#+}} xmm1 = xmm1[14,15,0,1,2,3,4,5,6,7,8,9,10,11,12,13]			; AVX1-NEXT: vpalignr {{.*#+}} xmm1 = xmm1[14,15,0,1,2,3,4,5,6,7,8,9,10,11,12,13]
	; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0			; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v16i16_07_00_01_02_03_04_05_06_23_16_17_18_19_20_21_22:			; AVX2-LABEL: shuffle_v16i16_07_00_01_02_03_04_05_06_23_16_17_18_19_20_21_22:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0
	; AVX2-NEXT: vpalignr {{.*#+}} ymm0 = ymm0[14,15,0,1,2,3,4,5,6,7,8,9,10,11,12,13,30,31,16,17,18,19,20,21,22,23,24,25,26,27,28,29]			; AVX2-NEXT: vpalignr {{.*#+}} ymm0 = ymm0[14,15,0,1,2,3,4,5,6,7,8,9,10,11,12,13,30,31,16,17,18,19,20,21,22,23,24,25,26,27,28,29]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_07_00_01_02_03_04_05_06_23_16_17_18_19_20_21_22:			; AVX512VL-LABEL: shuffle_v16i16_07_00_01_02_03_04_05_06_23_16_17_18_19_20_21_22:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [7,0,1,2,3,4,5,6,23,16,17,18,19,20,21,22]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [7,0,1,2,3,4,5,6,23,16,17,18,19,20,21,22]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 7, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 23, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 7, i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 23, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_01_00_01_02_03_02_11_08_09_08_09_10_11_10_11(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_01_00_01_02_03_02_11_08_09_08_09_10_11_10_11(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_01_00_01_02_03_02_11_08_09_08_09_10_11_10_11:			; AVX1-LABEL: shuffle_v16i16_00_01_00_01_02_03_02_11_08_09_08_09_10_11_10_11:
	; [79 lines not shown]
	; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3],xmm0[4,5,6,7]			; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3],xmm0[4,5,6,7]
	; AVX2-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,3,0,1]			; AVX2-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[2,3,0,1]
	; AVX2-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[2,3,0,1]			; AVX2-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[2,3,0,1]
	; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_04_05_06_07_16_17_18_27_12_13_14_15_24_25_26_27:			; AVX512VL-LABEL: shuffle_v16i16_04_05_06_07_16_17_18_27_12_13_14_15_24_25_26_27:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [4,5,6,7,16,17,18,27,12,13,14,15,24,25,26,27]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [4,5,6,7,16,17,18,27,12,13,14,15,24,25,26,27]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 4, i32 5, i32 6, i32 7, i32 16, i32 17, i32 18, i32 27, i32 12, i32 13, i32 14, i32 15, i32 24, i32 25, i32 26, i32 27>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 4, i32 5, i32 6, i32 7, i32 16, i32 17, i32 18, i32 27, i32 12, i32 13, i32 14, i32 15, i32 24, i32 25, i32 26, i32 27>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_00_00_00_00_00_00_08_08_08_08_08_08_08_08_08(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_00_00_00_00_00_00_08_08_08_08_08_08_08_08_08(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_00_00_00_00_00_00_08_08_08_08_08_08_08_08_08:			; AVX1-LABEL: shuffle_v16i16_00_00_00_00_00_00_00_08_08_08_08_08_08_08_08_08:
	; [327 lines not shown]
	; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm1[0],xmm0[1,2,3,4,5,6,7]			; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm1[0],xmm0[1,2,3,4,5,6,7]
	; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_07_05_06_04_03_01_02_08_15_13_14_12_11_09_10_08:			; AVX512VL-LABEL: shuffle_v16i16_07_05_06_04_03_01_02_08_15_13_14_12_11_09_10_08:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1			; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} xmm2 = [14,15,10,11,12,13,8,9,6,7,2,3,4,5,0,1]			; AVX512VL-NEXT: vmovdqu {{.*#+}} xmm2 = [14,15,10,11,12,13,8,9,6,7,2,3,4,5,0,1]
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm3			; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm3
	; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm1[0],xmm0[1,2,3,4,5,6,7]			; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm1[0],xmm0[1,2,3,4,5,6,7]
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX512VL-NEXT: vinserti32x4 $1, %xmm3, %ymm0, %ymm0			; AVX512VL-NEXT: vinserti32x4 $1, %xmm3, %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 7, i32 5, i32 6, i32 4, i32 3, i32 1, i32 2, i32 8, i32 15, i32 13, i32 14, i32 12, i32 11, i32 9, i32 10, i32 8>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 7, i32 5, i32 6, i32 4, i32 3, i32 1, i32 2, i32 8, i32 15, i32 13, i32 14, i32 12, i32 11, i32 9, i32 10, i32 8>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}
	; [187 lines not shown]
	; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3],xmm0[4,5,6,7]			; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3],xmm0[4,5,6,7]
	; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_02_06_04_00_05_01_07_11_10_14_12_08_13_09_15_11:			; AVX512VL-LABEL: shuffle_v16i16_02_06_04_00_05_01_07_11_10_14_12_08_13_09_15_11:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1			; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} xmm2 = [4,5,12,13,8,9,0,1,10,11,2,3,14,15,6,7]			; AVX512VL-NEXT: vmovdqu {{.*#+}} xmm2 = [4,5,12,13,8,9,0,1,10,11,2,3,14,15,6,7]
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm3			; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm3
	; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3],xmm0[4,5,6,7]			; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3],xmm0[4,5,6,7]
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX512VL-NEXT: vinserti32x4 $1, %xmm3, %ymm0, %ymm0			; AVX512VL-NEXT: vinserti32x4 $1, %xmm3, %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 2, i32 6, i32 4, i32 0, i32 5, i32 1, i32 7, i32 11, i32 10, i32 14, i32 12, i32 8, i32 13, i32 9, i32 15, i32 11>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 2, i32 6, i32 4, i32 0, i32 5, i32 1, i32 7, i32 11, i32 10, i32 14, i32 12, i32 8, i32 13, i32 9, i32 15, i32 11>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}
	; [17 lines not shown]
	; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3],xmm0[4,5,6,7]			; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3],xmm0[4,5,6,7]
	; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_02_00_06_04_05_01_07_11_10_08_14_12_13_09_15_11:			; AVX512VL-LABEL: shuffle_v16i16_02_00_06_04_05_01_07_11_10_08_14_12_13_09_15_11:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1			; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} xmm2 = [4,5,0,1,12,13,8,9,10,11,2,3,14,15,6,7]			; AVX512VL-NEXT: vmovdqu {{.*#+}} xmm2 = [4,5,0,1,12,13,8,9,10,11,2,3,14,15,6,7]
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm3			; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm3
	; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3],xmm0[4,5,6,7]			; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2],xmm1[3],xmm0[4,5,6,7]
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX512VL-NEXT: vinserti32x4 $1, %xmm3, %ymm0, %ymm0			; AVX512VL-NEXT: vinserti32x4 $1, %xmm3, %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 2, i32 0, i32 6, i32 4, i32 5, i32 1, i32 7, i32 11, i32 10, i32 8, i32 14, i32 12, i32 13, i32 9, i32 15, i32 11>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 2, i32 0, i32 6, i32 4, i32 5, i32 1, i32 7, i32 11, i32 10, i32 8, i32 14, i32 12, i32 13, i32 9, i32 15, i32 11>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}
	; [17 lines not shown]
	; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3,4],xmm1[5],xmm0[6,7]			; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3,4],xmm1[5],xmm0[6,7]
	; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_02_06_04_00_01_03_07_13_10_14_12_08_09_11_15_13:			; AVX512VL-LABEL: shuffle_v16i16_02_06_04_00_01_03_07_13_10_14_12_08_09_11_15_13:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1			; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} xmm2 = [4,5,12,13,8,9,0,1,2,3,6,7,14,15,10,11]			; AVX512VL-NEXT: vmovdqu {{.*#+}} xmm2 = [4,5,12,13,8,9,0,1,2,3,6,7,14,15,10,11]
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm3			; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm3
	; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3,4],xmm1[5],xmm0[6,7]			; AVX512VL-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3,4],xmm1[5],xmm0[6,7]
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX512VL-NEXT: vinserti32x4 $1, %xmm3, %ymm0, %ymm0			; AVX512VL-NEXT: vinserti32x4 $1, %xmm3, %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 2, i32 6, i32 4, i32 0, i32 1, i32 3, i32 7, i32 13, i32 10, i32 14, i32 12, i32 8, i32 9, i32 11, i32 15, i32 13>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 2, i32 6, i32 4, i32 0, i32 1, i32 3, i32 7, i32 13, i32 10, i32 14, i32 12, i32 8, i32 9, i32 11, i32 15, i32 13>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}
	; [17 lines not shown]
	; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3]			; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3]
	; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_06_06_07_05_01_06_04_11_14_14_15_13_09_14_12_11:			; AVX512VL-LABEL: shuffle_v16i16_06_06_07_05_01_06_04_11_14_14_15_13_09_14_12_11:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1			; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} xmm2 = [12,13,12,13,14,15,10,11,2,3,12,13,8,9,6,7]			; AVX512VL-NEXT: vmovdqu {{.*#+}} xmm2 = [12,13,12,13,14,15,10,11,2,3,12,13,8,9,6,7]
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm3			; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm3
	; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3]			; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0],xmm1[1],xmm0[2,3]
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX512VL-NEXT: vinserti32x4 $1, %xmm3, %ymm0, %ymm0			; AVX512VL-NEXT: vinserti32x4 $1, %xmm3, %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 6, i32 6, i32 7, i32 5, i32 1, i32 6, i32 4, i32 11, i32 14, i32 14, i32 15, i32 13, i32 9, i32 14, i32 12, i32 11>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 6, i32 6, i32 7, i32 5, i32 1, i32 6, i32 4, i32 11, i32 14, i32 14, i32 15, i32 13, i32 9, i32 14, i32 12, i32 11>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}
	; [463 lines not shown]
	; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0,1],xmm1[2],xmm0[3]			; AVX2-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0,1],xmm1[2],xmm0[3]
	; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_03_07_01_00_02_07_03_13_11_15_09_08_10_15_11_13:			; AVX512VL-LABEL: shuffle_v16i16_03_07_01_00_02_07_03_13_11_15_09_08_10_15_11_13:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1			; AVX512VL-NEXT: vextracti32x4 $1, %ymm0, %xmm1
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} xmm2 = [6,7,14,15,2,3,0,1,4,5,14,15,6,7,10,11]			; AVX512VL-NEXT: vmovdqu {{.*#+}} xmm2 = [6,7,14,15,2,3,0,1,4,5,14,15,6,7,10,11]
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm3			; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm3
	; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0,1],xmm1[2],xmm0[3]			; AVX512VL-NEXT: vpblendd {{.*#+}} xmm0 = xmm0[0,1],xmm1[2],xmm0[3]
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX512VL-NEXT: vinserti32x4 $1, %xmm3, %ymm0, %ymm0			; AVX512VL-NEXT: vinserti32x4 $1, %xmm3, %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 3, i32 7, i32 1, i32 0, i32 2, i32 7, i32 3, i32 13, i32 11, i32 15, i32 9, i32 8, i32 10, i32 15, i32 11, i32 13>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 3, i32 7, i32 1, i32 0, i32 2, i32 7, i32 3, i32 13, i32 11, i32 15, i32 9, i32 8, i32 10, i32 15, i32 11, i32 13>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}
	; [19 lines not shown]
	; AVX2-NEXT: vpshufhw {{.*#+}} xmm1 = xmm1[0,1,2,3,4,4,6,7]			; AVX2-NEXT: vpshufhw {{.*#+}} xmm1 = xmm1[0,1,2,3,4,4,6,7]
	; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm1, %ymm1			; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm1, %ymm1
	; AVX2-NEXT: vpunpcklwd {{.*#+}} ymm0 = ymm0[0,0,1,1,2,2,3,3,8,8,9,9,10,10,11,11]			; AVX2-NEXT: vpunpcklwd {{.*#+}} ymm0 = ymm0[0,0,1,1,2,2,3,3,8,8,9,9,10,10,11,11]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_16_01_17_02_18_03_27_08_24_09_25_10_26_11_27:			; AVX512VL-LABEL: shuffle_v16i16_00_16_01_17_02_18_03_27_08_24_09_25_10_26_11_27:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,16,1,17,2,18,3,27,8,24,9,25,10,26,11,27]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,16,1,17,2,18,3,27,8,24,9,25,10,26,11,27]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 16, i32 1, i32 17, i32 2, i32 18, i32 3, i32 27, i32 8, i32 24, i32 9, i32 25, i32 10, i32 26, i32 11, i32 27>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 16, i32 1, i32 17, i32 2, i32 18, i32 3, i32 27, i32 8, i32 24, i32 9, i32 25, i32 10, i32 26, i32 11, i32 27>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_20_01_21_02_22_03_31_08_28_09_29_10_30_11_31(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_20_01_21_02_22_03_31_08_28_09_29_10_30_11_31(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_20_01_21_02_22_03_31_08_28_09_29_10_30_11_31:			; AVX1-LABEL: shuffle_v16i16_00_20_01_21_02_22_03_31_08_28_09_29_10_30_11_31:
	; [16 lines not shown]
	; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm0			; AVX2-NEXT: vextracti128 $1, %ymm0, %xmm0
	; AVX2-NEXT: vpblendw {{.*#+}} xmm2 = xmm2[0,1,2,3,4,5,6],xmm0[7]			; AVX2-NEXT: vpblendw {{.*#+}} xmm2 = xmm2[0,1,2,3,4,5,6],xmm0[7]
	; AVX2-NEXT: vpshufb %xmm1, %xmm0, %xmm0			; AVX2-NEXT: vpshufb %xmm1, %xmm0, %xmm0
	; AVX2-NEXT: vinserti128 $1, %xmm0, %ymm2, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm0, %ymm2, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_20_01_21_02_22_03_31_08_28_09_29_10_30_11_31:			; AVX512VL-LABEL: shuffle_v16i16_00_20_01_21_02_22_03_31_08_28_09_29_10_30_11_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,20,1,21,2,22,3,31,8,28,9,29,10,30,11,31]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,20,1,21,2,22,3,31,8,28,9,29,10,30,11,31]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 20, i32 1, i32 21, i32 2, i32 22, i32 3, i32 31, i32 8, i32 28, i32 9, i32 29, i32 10, i32 30, i32 11, i32 31>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 20, i32 1, i32 21, i32 2, i32 22, i32 3, i32 31, i32 8, i32 28, i32 9, i32 29, i32 10, i32 30, i32 11, i32 31>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_04_20_05_21_06_22_07_31_12_28_13_29_14_30_15_31(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_04_20_05_21_06_22_07_31_12_28_13_29_14_30_15_31(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_04_20_05_21_06_22_07_31_12_28_13_29_14_30_15_31:			; AVX1-LABEL: shuffle_v16i16_04_20_05_21_06_22_07_31_12_28_13_29_14_30_15_31:
	; [16 lines not shown]
	; AVX2-NEXT: vpshufhw {{.*#+}} xmm1 = xmm1[0,1,2,3,4,4,6,7]			; AVX2-NEXT: vpshufhw {{.*#+}} xmm1 = xmm1[0,1,2,3,4,4,6,7]
	; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm1, %ymm1			; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm1, %ymm1
	; AVX2-NEXT: vpunpckhwd {{.*#+}} ymm0 = ymm0[4,4,5,5,6,6,7,7,12,12,13,13,14,14,15,15]			; AVX2-NEXT: vpunpckhwd {{.*#+}} ymm0 = ymm0[4,4,5,5,6,6,7,7,12,12,13,13,14,14,15,15]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_04_20_05_21_06_22_07_31_12_28_13_29_14_30_15_31:			; AVX512VL-LABEL: shuffle_v16i16_04_20_05_21_06_22_07_31_12_28_13_29_14_30_15_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [4,20,5,21,6,22,7,31,12,28,13,29,14,30,15,31]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [4,20,5,21,6,22,7,31,12,28,13,29,14,30,15,31]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 4, i32 20, i32 5, i32 21, i32 6, i32 22, i32 7, i32 31, i32 12, i32 28, i32 13, i32 29, i32 14, i32 30, i32 15, i32 31>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 4, i32 20, i32 5, i32 21, i32 6, i32 22, i32 7, i32 31, i32 12, i32 28, i32 13, i32 29, i32 14, i32 30, i32 15, i32 31>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_04_16_05_17_06_18_07_27_12_24_13_25_14_26_15_27(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_04_16_05_17_06_18_07_27_12_24_13_25_14_26_15_27(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_04_16_05_17_06_18_07_27_12_24_13_25_14_26_15_27:			; AVX1-LABEL: shuffle_v16i16_04_16_05_17_06_18_07_27_12_24_13_25_14_26_15_27:
	; [16 lines not shown]
	; AVX2-NEXT: vmovdqa {{.*#+}} xmm2 = [8,9,0,1,10,11,2,3,12,13,4,5,14,15,6,7]			; AVX2-NEXT: vmovdqa {{.*#+}} xmm2 = [8,9,0,1,10,11,2,3,12,13,4,5,14,15,6,7]
	; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: vpshufb %xmm2, %xmm1, %xmm1			; AVX2-NEXT: vpshufb %xmm2, %xmm1, %xmm1
	; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_04_16_05_17_06_18_07_27_12_24_13_25_14_26_15_27:			; AVX512VL-LABEL: shuffle_v16i16_04_16_05_17_06_18_07_27_12_24_13_25_14_26_15_27:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [4,16,5,17,6,18,7,27,12,24,13,25,14,26,15,27]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [4,16,5,17,6,18,7,27,12,24,13,25,14,26,15,27]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 4, i32 16, i32 5, i32 17, i32 6, i32 18, i32 7, i32 27, i32 12, i32 24, i32 13, i32 25, i32 14, i32 26, i32 15, i32 27>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 4, i32 16, i32 5, i32 17, i32 6, i32 18, i32 7, i32 27, i32 12, i32 24, i32 13, i32 25, i32 14, i32 26, i32 15, i32 27>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_16_01_17_06_22_07_31_08_24_09_25_14_30_15_31(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_16_01_17_06_22_07_31_08_24_09_25_14_30_15_31(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_16_01_17_06_22_07_31_08_24_09_25_14_30_15_31:			; AVX1-LABEL: shuffle_v16i16_00_16_01_17_06_22_07_31_08_24_09_25_14_30_15_31:
	; [23 lines not shown]
	; AVX2-NEXT: vinserti128 $1, %xmm2, %ymm1, %ymm1			; AVX2-NEXT: vinserti128 $1, %xmm2, %ymm1, %ymm1
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,1,1,3,4,5,6,7,8,9,9,11,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,1,1,3,4,5,6,7,8,9,9,11,12,13,14,15]
	; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,6,5,7,7,8,9,10,11,14,13,15,15]			; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,6,5,7,7,8,9,10,11,14,13,15,15]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_16_01_17_06_22_07_31_08_24_09_25_14_30_15_31:			; AVX512VL-LABEL: shuffle_v16i16_00_16_01_17_06_22_07_31_08_24_09_25_14_30_15_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,16,1,17,6,22,7,31,8,24,9,25,14,30,15,31]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,16,1,17,6,22,7,31,8,24,9,25,14,30,15,31]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 16, i32 1, i32 17, i32 6, i32 22, i32 7, i32 31, i32 8, i32 24, i32 9, i32 25, i32 14, i32 30, i32 15, i32 31>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 16, i32 1, i32 17, i32 6, i32 22, i32 7, i32 31, i32 8, i32 24, i32 9, i32 25, i32 14, i32 30, i32 15, i32 31>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_20_01_21_06_16_07_25_08_28_09_29_14_24_15_25(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_20_01_21_06_16_07_25_08_28_09_29_14_24_15_25(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_20_01_21_06_16_07_25_08_28_09_29_14_24_15_25:			; AVX1-LABEL: shuffle_v16i16_00_20_01_21_06_16_07_25_08_28_09_29_14_24_15_25:
	; [20 lines not shown]
	; AVX2-NEXT: vinserti128 $1, %xmm4, %ymm1, %ymm1			; AVX2-NEXT: vinserti128 $1, %xmm4, %ymm1, %ymm1
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,1,1,3,4,5,6,7,8,9,9,11,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,1,1,3,4,5,6,7,8,9,9,11,12,13,14,15]
	; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,6,5,7,7,8,9,10,11,14,13,15,15]			; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,6,5,7,7,8,9,10,11,14,13,15,15]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_20_01_21_06_16_07_25_08_28_09_29_14_24_15_25:			; AVX512VL-LABEL: shuffle_v16i16_00_20_01_21_06_16_07_25_08_28_09_29_14_24_15_25:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,20,1,21,6,16,7,25,8,28,9,29,14,24,15,25]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,20,1,21,6,16,7,25,8,28,9,29,14,24,15,25]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 20, i32 1, i32 21, i32 6, i32 16, i32 7, i32 25, i32 8, i32 28, i32 9, i32 29, i32 14, i32 24, i32 15, i32 25>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 20, i32 1, i32 21, i32 6, i32 16, i32 7, i32 25, i32 8, i32 28, i32 9, i32 29, i32 14, i32 24, i32 15, i32 25>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_01_00_17_16_03_02_19_26_09_08_25_24_11_10_27_26(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_01_00_17_16_03_02_19_26_09_08_25_24_11_10_27_26(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_01_00_17_16_03_02_19_26_09_08_25_24_11_10_27_26:			; AVX1-LABEL: shuffle_v16i16_01_00_17_16_03_02_19_26_09_08_25_24_11_10_27_26:
	; [19 lines not shown]
	; AVX2-NEXT: vpshufb {{.*#+}} xmm2 = xmm2[0,1,2,3,2,3,0,1,8,9,10,11,6,7,4,5]			; AVX2-NEXT: vpshufb {{.*#+}} xmm2 = xmm2[0,1,2,3,2,3,0,1,8,9,10,11,6,7,4,5]
	; AVX2-NEXT: vinserti128 $1, %xmm2, %ymm1, %ymm1			; AVX2-NEXT: vinserti128 $1, %xmm2, %ymm1, %ymm1
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[2,3,0,1,4,5,6,7,6,7,4,5,4,5,6,7,18,19,16,17,20,21,22,23,22,23,20,21,20,21,22,23]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[2,3,0,1,4,5,6,7,6,7,4,5,4,5,6,7,18,19,16,17,20,21,22,23,22,23,20,21,20,21,22,23]
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_01_00_17_16_03_02_19_26_09_08_25_24_11_10_27_26:			; AVX512VL-LABEL: shuffle_v16i16_01_00_17_16_03_02_19_26_09_08_25_24_11_10_27_26:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [1,0,17,16,3,2,19,26,9,8,25,24,11,10,27,26]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [1,0,17,16,3,2,19,26,9,8,25,24,11,10,27,26]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 1, i32 0, i32 17, i32 16, i32 3, i32 2, i32 19, i32 26, i32 9, i32 8, i32 25, i32 24, i32 11, i32 10, i32 27, i32 26>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 1, i32 0, i32 17, i32 16, i32 3, i32 2, i32 19, i32 26, i32 9, i32 8, i32 25, i32 24, i32 11, i32 10, i32 27, i32 26>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_16_00_17_01_18_02_19_11_24_08_25_09_26_10_27_11(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_16_00_17_01_18_02_19_11_24_08_25_09_26_10_27_11(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_16_00_17_01_18_02_19_11_24_08_25_09_26_10_27_11:			; AVX1-LABEL: shuffle_v16i16_16_00_17_01_18_02_19_11_24_08_25_09_26_10_27_11:
	; [16 lines not shown]
	; AVX2-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,4,6,7]			; AVX2-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,4,6,7]
	; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0
	; AVX2-NEXT: vpunpcklwd {{.*#+}} ymm1 = ymm1[0],ymm0[0],ymm1[1],ymm0[1],ymm1[2],ymm0[2],ymm1[3],ymm0[3],ymm1[8],ymm0[8],ymm1[9],ymm0[9],ymm1[10],ymm0[10],ymm1[11],ymm0[11]			; AVX2-NEXT: vpunpcklwd {{.*#+}} ymm1 = ymm1[0],ymm0[0],ymm1[1],ymm0[1],ymm1[2],ymm0[2],ymm1[3],ymm0[3],ymm1[8],ymm0[8],ymm1[9],ymm0[9],ymm1[10],ymm0[10],ymm1[11],ymm0[11]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm1[0],ymm0[1],ymm1[2],ymm0[3],ymm1[4],ymm0[5],ymm1[6],ymm0[7],ymm1[8],ymm0[9],ymm1[10],ymm0[11],ymm1[12],ymm0[13],ymm1[14],ymm0[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm1[0],ymm0[1],ymm1[2],ymm0[3],ymm1[4],ymm0[5],ymm1[6],ymm0[7],ymm1[8],ymm0[9],ymm1[10],ymm0[11],ymm1[12],ymm0[13],ymm1[14],ymm0[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_16_00_17_01_18_02_19_11_24_08_25_09_26_10_27_11:			; AVX512VL-LABEL: shuffle_v16i16_16_00_17_01_18_02_19_11_24_08_25_09_26_10_27_11:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,16,1,17,2,18,3,27,8,24,9,25,10,26,11,27]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,16,1,17,2,18,3,27,8,24,9,25,10,26,11,27]
	; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 16, i32 0, i32 17, i32 1, i32 18, i32 2, i32 19, i32 11, i32 24, i32 8, i32 25, i32 9, i32 26, i32 10, i32 27, i32 11>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 16, i32 0, i32 17, i32 1, i32 18, i32 2, i32 19, i32 11, i32 24, i32 8, i32 25, i32 9, i32 26, i32 10, i32 27, i32 11>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_20_04_21_05_22_06_23_15_28_12_29_13_30_14_31_15(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_20_04_21_05_22_06_23_15_28_12_29_13_30_14_31_15(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_20_04_21_05_22_06_23_15_28_12_29_13_30_14_31_15:			; AVX1-LABEL: shuffle_v16i16_20_04_21_05_22_06_23_15_28_12_29_13_30_14_31_15:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; [15 unchanged lines collapsed in the diff view]
	; AVX2-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,4,6,7]			; AVX2-NEXT: vpshufhw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,4,6,7]
	; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm3, %ymm0, %ymm0
	; AVX2-NEXT: vpunpckhwd {{.*#+}} ymm1 = ymm1[4],ymm0[4],ymm1[5],ymm0[5],ymm1[6],ymm0[6],ymm1[7],ymm0[7],ymm1[12],ymm0[12],ymm1[13],ymm0[13],ymm1[14],ymm0[14],ymm1[15],ymm0[15]			; AVX2-NEXT: vpunpckhwd {{.*#+}} ymm1 = ymm1[4],ymm0[4],ymm1[5],ymm0[5],ymm1[6],ymm0[6],ymm1[7],ymm0[7],ymm1[12],ymm0[12],ymm1[13],ymm0[13],ymm1[14],ymm0[14],ymm1[15],ymm0[15]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm1[0],ymm0[1],ymm1[2],ymm0[3],ymm1[4],ymm0[5],ymm1[6],ymm0[7],ymm1[8],ymm0[9],ymm1[10],ymm0[11],ymm1[12],ymm0[13],ymm1[14],ymm0[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm1[0],ymm0[1],ymm1[2],ymm0[3],ymm1[4],ymm0[5],ymm1[6],ymm0[7],ymm1[8],ymm0[9],ymm1[10],ymm0[11],ymm1[12],ymm0[13],ymm1[14],ymm0[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_20_04_21_05_22_06_23_15_28_12_29_13_30_14_31_15:			; AVX512VL-LABEL: shuffle_v16i16_20_04_21_05_22_06_23_15_28_12_29_13_30_14_31_15:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [4,20,5,21,6,22,7,31,12,28,13,29,14,30,15,31]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [4,20,5,21,6,22,7,31,12,28,13,29,14,30,15,31]
	; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 20, i32 4, i32 21, i32 5, i32 22, i32 6, i32 23, i32 15, i32 28, i32 12, i32 29, i32 13, i32 30, i32 14, i32 31, i32 15>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 20, i32 4, i32 21, i32 5, i32 22, i32 6, i32 23, i32 15, i32 28, i32 12, i32 29, i32 13, i32 30, i32 14, i32 31, i32 15>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_02_01_03_20_22_21_31_08_10_09_11_28_30_29_31(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_02_01_03_20_22_21_31_08_10_09_11_28_30_29_31(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_02_01_03_20_22_21_31_08_10_09_11_28_30_29_31:			; AVX1-LABEL: shuffle_v16i16_00_02_01_03_20_22_21_31_08_10_09_11_28_30_29_31:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; [19 unchanged lines collapsed in the diff view]
	; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,5,6],xmm1[7]			; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3,4,5,6],xmm1[7]
	; AVX2-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,1,3,4,5,6,7]			; AVX2-NEXT: vpshuflw {{.*#+}} xmm1 = xmm1[0,2,1,3,4,5,6,7]
	; AVX2-NEXT: vpshufhw {{.*#+}} xmm1 = xmm1[0,1,2,3,4,6,5,7]			; AVX2-NEXT: vpshufhw {{.*#+}} xmm1 = xmm1[0,1,2,3,4,6,5,7]
	; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_02_01_03_20_22_21_31_08_10_09_11_28_30_29_31:			; AVX512VL-LABEL: shuffle_v16i16_00_02_01_03_20_22_21_31_08_10_09_11_28_30_29_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,2,1,3,20,22,21,31,8,10,9,11,28,30,29,31]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,2,1,3,20,22,21,31,8,10,9,11,28,30,29,31]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 2, i32 1, i32 3, i32 20, i32 22, i32 21, i32 31, i32 8, i32 10, i32 9, i32 11, i32 28, i32 30, i32 29, i32 31>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 2, i32 1, i32 3, i32 20, i32 22, i32 21, i32 31, i32 8, i32 10, i32 9, i32 11, i32 28, i32 30, i32 29, i32 31>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_04_04_03_18_uu_uu_uu_uu_12_12_11_26_uu_uu_uu_uu(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_04_04_03_18_uu_uu_uu_uu_12_12_11_26_uu_uu_uu_uu(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_04_04_03_18_uu_uu_uu_uu_12_12_11_26_uu_uu_uu_uu:			; AVX1-LABEL: shuffle_v16i16_04_04_03_18_uu_uu_uu_uu_12_12_11_26_uu_uu_uu_uu:
	; [13 unchanged lines collapsed in the diff view]
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0,1],ymm1[2],ymm0[3,4,5,6,7,8,9],ymm1[10],ymm0[11,12,13,14,15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0,1],ymm1[2],ymm0[3,4,5,6,7,8,9],ymm1[10],ymm0[11,12,13,14,15]
	; AVX2-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[2,1,2,3,6,5,6,7]			; AVX2-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[2,1,2,3,6,5,6,7]
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,3,2,4,5,6,7,8,8,11,10,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,3,2,4,5,6,7,8,8,11,10,12,13,14,15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_04_04_03_18_uu_uu_uu_uu_12_12_11_26_uu_uu_uu_uu:			; AVX512VL-LABEL: shuffle_v16i16_04_04_03_18_uu_uu_uu_uu_12_12_11_26_uu_uu_uu_uu:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = <4,4,3,18,u,u,u,u,12,12,11,26,u,u,u,u>			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = <4,4,3,18,u,u,u,u,12,12,11,26,u,u,u,u>
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 4, i32 4, i32 3, i32 18, i32 undef, i32 undef, i32 undef, i32 undef, i32 12, i32 12, i32 11, i32 26, i32 undef, i32 undef, i32 undef, i32 undef>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 4, i32 4, i32 3, i32 18, i32 undef, i32 undef, i32 undef, i32 undef, i32 12, i32 12, i32 11, i32 26, i32 undef, i32 undef, i32 undef, i32 undef>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_03_02_21_uu_uu_uu_uu_08_11_10_29_uu_uu_uu_uu(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_03_02_21_uu_uu_uu_uu_08_11_10_29_uu_uu_uu_uu(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_03_02_21_uu_uu_uu_uu_08_11_10_29_uu_uu_uu_uu:			; AVX1-LABEL: shuffle_v16i16_00_03_02_21_uu_uu_uu_uu_08_11_10_29_uu_uu_uu_uu:
	; [11 unchanged lines collapsed in the diff view]
	; AVX2-LABEL: shuffle_v16i16_00_03_02_21_uu_uu_uu_uu_08_11_10_29_uu_uu_uu_uu:			; AVX2-LABEL: shuffle_v16i16_00_03_02_21_uu_uu_uu_uu_08_11_10_29_uu_uu_uu_uu:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0,1],ymm1[2,3],ymm0[4,5],ymm1[6,7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0,1],ymm1[2,3],ymm0[4,5],ymm1[6,7]
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,6,7,4,5,10,11,0,1,10,11,0,1,2,3,16,17,22,23,20,21,26,27,16,17,26,27,16,17,18,19]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,1,6,7,4,5,10,11,0,1,10,11,0,1,2,3,16,17,22,23,20,21,26,27,16,17,26,27,16,17,18,19]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_03_02_21_uu_uu_uu_uu_08_11_10_29_uu_uu_uu_uu:			; AVX512VL-LABEL: shuffle_v16i16_00_03_02_21_uu_uu_uu_uu_08_11_10_29_uu_uu_uu_uu:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = <0,3,2,21,u,u,u,u,8,11,10,29,u,u,u,u>			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = <0,3,2,21,u,u,u,u,8,11,10,29,u,u,u,u>
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 3, i32 2, i32 21, i32 undef, i32 undef, i32 undef, i32 undef, i32 8, i32 11, i32 10, i32 29, i32 undef, i32 undef, i32 undef, i32 undef>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 3, i32 2, i32 21, i32 undef, i32 undef, i32 undef, i32 undef, i32 8, i32 11, i32 10, i32 29, i32 undef, i32 undef, i32 undef, i32 undef>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_uu_uu_uu_21_uu_uu_uu_uu_uu_uu_uu_29_uu_uu_uu_uu(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_uu_uu_uu_21_uu_uu_uu_uu_uu_uu_uu_29_uu_uu_uu_uu(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_uu_uu_uu_21_uu_uu_uu_uu_uu_uu_uu_29_uu_uu_uu_uu:			; AVX1-LABEL: shuffle_v16i16_uu_uu_uu_21_uu_uu_uu_uu_uu_uu_uu_29_uu_uu_uu_uu:
	; [24 unchanged lines collapsed in the diff view]
	; AVX2-LABEL: shuffle_v16i16_00_01_02_21_uu_uu_uu_uu_08_09_10_29_uu_uu_uu_uu:			; AVX2-LABEL: shuffle_v16i16_00_01_02_21_uu_uu_uu_uu_08_09_10_29_uu_uu_uu_uu:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpshufd {{.*#+}} ymm1 = ymm1[0,2,2,3,4,6,6,7]			; AVX2-NEXT: vpshufd {{.*#+}} ymm1 = ymm1[0,2,2,3,4,6,6,7]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0,1,2],ymm1[3],ymm0[4,5,6,7,8,9,10],ymm1[11],ymm0[12,13,14,15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0,1,2],ymm1[3],ymm0[4,5,6,7,8,9,10],ymm1[11],ymm0[12,13,14,15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_01_02_21_uu_uu_uu_uu_08_09_10_29_uu_uu_uu_uu:			; AVX512VL-LABEL: shuffle_v16i16_00_01_02_21_uu_uu_uu_uu_08_09_10_29_uu_uu_uu_uu:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = <0,1,2,21,u,u,u,u,8,9,10,29,u,u,u,u>			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = <0,1,2,21,u,u,u,u,8,9,10,29,u,u,u,u>
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 1, i32 2, i32 21, i32 undef, i32 undef, i32 undef, i32 undef, i32 8, i32 9, i32 10, i32 29, i32 undef, i32 undef, i32 undef, i32 undef>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 1, i32 2, i32 21, i32 undef, i32 undef, i32 undef, i32 undef, i32 8, i32 9, i32 10, i32 29, i32 undef, i32 undef, i32 undef, i32 undef>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_uu_uu_uu_uu_20_21_22_11_uu_uu_uu_uu_28_29_30_11(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_uu_uu_uu_uu_20_21_22_11_uu_uu_uu_uu_28_29_30_11(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_uu_uu_uu_uu_20_21_22_11_uu_uu_uu_uu_28_29_30_11:			; AVX1-LABEL: shuffle_v16i16_uu_uu_uu_uu_20_21_22_11_uu_uu_uu_uu_28_29_30_11:
	; [9 unchanged lines collapsed in the diff view]
	; AVX2-LABEL: shuffle_v16i16_uu_uu_uu_uu_20_21_22_11_uu_uu_uu_uu_28_29_30_11:			; AVX2-LABEL: shuffle_v16i16_uu_uu_uu_uu_20_21_22_11_uu_uu_uu_uu_28_29_30_11:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,2,2]			; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,2,2]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm1[0,1,2,3,4,5,6],ymm0[7],ymm1[8,9,10,11,12,13,14],ymm0[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm1[0,1,2,3,4,5,6],ymm0[7],ymm1[8,9,10,11,12,13,14],ymm0[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_uu_uu_uu_uu_20_21_22_11_uu_uu_uu_uu_28_29_30_11:			; AVX512VL-LABEL: shuffle_v16i16_uu_uu_uu_uu_20_21_22_11_uu_uu_uu_uu_28_29_30_11:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = <u,u,u,u,4,5,6,27,u,u,u,u,12,13,14,27>			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = <u,u,u,u,4,5,6,27,u,u,u,u,12,13,14,27>
	; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 undef, i32 undef, i32 undef, i32 undef, i32 20, i32 21, i32 22, i32 11, i32 undef, i32 undef, i32 undef, i32 undef, i32 28, i32 29, i32 30, i32 11>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 undef, i32 undef, i32 undef, i32 undef, i32 20, i32 21, i32 22, i32 11, i32 undef, i32 undef, i32 undef, i32 undef, i32 28, i32 29, i32 30, i32 11>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_20_21_22_03_uu_uu_uu_uu_28_29_30_11_uu_uu_uu_uu(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_20_21_22_03_uu_uu_uu_uu_28_29_30_11_uu_uu_uu_uu(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_20_21_22_03_uu_uu_uu_uu_28_29_30_11_uu_uu_uu_uu:			; AVX1-LABEL: shuffle_v16i16_20_21_22_03_uu_uu_uu_uu_28_29_30_11_uu_uu_uu_uu:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; [9 unchanged lines collapsed in the diff view]
	; AVX2-LABEL: shuffle_v16i16_20_21_22_03_uu_uu_uu_uu_28_29_30_11_uu_uu_uu_uu:			; AVX2-LABEL: shuffle_v16i16_20_21_22_03_uu_uu_uu_uu_28_29_30_11_uu_uu_uu_uu:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpshufd {{.*#+}} ymm1 = ymm1[2,3,2,3,6,7,6,7]			; AVX2-NEXT: vpshufd {{.*#+}} ymm1 = ymm1[2,3,2,3,6,7,6,7]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm1[0,1,2],ymm0[3],ymm1[4,5,6,7,8,9,10],ymm0[11],ymm1[12,13,14,15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm1[0,1,2],ymm0[3],ymm1[4,5,6,7,8,9,10],ymm0[11],ymm1[12,13,14,15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_20_21_22_03_uu_uu_uu_uu_28_29_30_11_uu_uu_uu_uu:			; AVX512VL-LABEL: shuffle_v16i16_20_21_22_03_uu_uu_uu_uu_28_29_30_11_uu_uu_uu_uu:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = <4,5,6,19,u,u,u,u,12,13,14,27,u,u,u,u>			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = <4,5,6,19,u,u,u,u,12,13,14,27,u,u,u,u>
	; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 20, i32 21, i32 22, i32 3, i32 undef, i32 undef, i32 undef, i32 undef, i32 28, i32 29, i32 30, i32 11, i32 undef, i32 undef, i32 undef, i32 undef>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 20, i32 21, i32 22, i32 3, i32 undef, i32 undef, i32 undef, i32 undef, i32 28, i32 29, i32 30, i32 11, i32 undef, i32 undef, i32 undef, i32 undef>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_01_02_21_20_21_22_11_08_09_10_29_28_29_30_11(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_01_02_21_20_21_22_11_08_09_10_29_28_29_30_11(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_01_02_21_20_21_22_11_08_09_10_29_28_29_30_11:			; AVX1-LABEL: shuffle_v16i16_00_01_02_21_20_21_22_11_08_09_10_29_28_29_30_11:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; [15 unchanged lines collapsed in the diff view]
	; AVX2-NEXT: vmovdqa {{.*#+}} xmm2 = [0,1,2,3,4,5,10,11,8,9,10,11,12,13,6,7]			; AVX2-NEXT: vmovdqa {{.*#+}} xmm2 = [0,1,2,3,4,5,10,11,8,9,10,11,12,13,6,7]
	; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: vpshufb %xmm2, %xmm1, %xmm1			; AVX2-NEXT: vpshufb %xmm2, %xmm1, %xmm1
	; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_01_02_21_20_21_22_11_08_09_10_29_28_29_30_11:			; AVX512VL-LABEL: shuffle_v16i16_00_01_02_21_20_21_22_11_08_09_10_29_28_29_30_11:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,1,2,21,20,21,22,11,8,9,10,29,28,29,30,11]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,1,2,21,20,21,22,11,8,9,10,29,28,29,30,11]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 1, i32 2, i32 21, i32 20, i32 21, i32 22, i32 11, i32 8, i32 9, i32 10, i32 29, i32 28, i32 29, i32 30, i32 11>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 1, i32 2, i32 21, i32 20, i32 21, i32 22, i32 11, i32 8, i32 9, i32 10, i32 29, i32 28, i32 29, i32 30, i32 11>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_00_17_02_03_20_21_22_15_08_25_10_11_28_29_30_15(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_00_17_02_03_20_21_22_15_08_25_10_11_28_29_30_15(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_00_17_02_03_20_21_22_15_08_25_10_11_28_29_30_15:			; AVX1-LABEL: shuffle_v16i16_00_17_02_03_20_21_22_15_08_25_10_11_28_29_30_15:
	; [9 unchanged lines collapsed in the diff view]
	; AVX2-LABEL: shuffle_v16i16_00_17_02_03_20_21_22_15_08_25_10_11_28_29_30_15:			; AVX2-LABEL: shuffle_v16i16_00_17_02_03_20_21_22_15_08_25_10_11_28_29_30_15:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,3,2,3]			; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,3,2,3]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2,3],ymm1[4,5,6],ymm0[7,8],ymm1[9],ymm0[10,11],ymm1[12,13,14],ymm0[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2,3],ymm1[4,5,6],ymm0[7,8],ymm1[9],ymm0[10,11],ymm1[12,13,14],ymm0[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_00_17_02_03_20_21_22_15_08_25_10_11_28_29_30_15:			; AVX512VL-LABEL: shuffle_v16i16_00_17_02_03_20_21_22_15_08_25_10_11_28_29_30_15:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,17,2,3,20,21,22,15,8,25,10,11,28,29,30,15]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,17,2,3,20,21,22,15,8,25,10,11,28,29,30,15]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 17, i32 2, i32 3, i32 20, i32 21, i32 22, i32 15, i32 8, i32 25, i32 10, i32 11, i32 28, i32 29, i32 30, i32 15>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 0, i32 17, i32 2, i32 3, i32 20, i32 21, i32 22, i32 15, i32 8, i32 25, i32 10, i32 11, i32 28, i32 29, i32 30, i32 15>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_uu_uu_uu_01_uu_05_07_25_uu_uu_uu_09_uu_13_15_25(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_uu_uu_uu_01_uu_05_07_25_uu_uu_uu_09_uu_13_15_25(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_uu_uu_uu_01_uu_05_07_25_uu_uu_uu_09_uu_13_15_25:			; AVX1-LABEL: shuffle_v16i16_uu_uu_uu_01_uu_05_07_25_uu_uu_uu_09_uu_13_15_25:
	; [16 unchanged lines collapsed in the diff view]
	; AVX2-NEXT: vpbroadcastd %xmm1, %ymm1			; AVX2-NEXT: vpbroadcastd %xmm1, %ymm1
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,1,2,1,4,5,6,7,8,9,10,9,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,1,2,1,4,5,6,7,8,9,10,9,12,13,14,15]
	; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,4,5,7,7,8,9,10,11,12,13,15,15]			; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,4,5,7,7,8,9,10,11,12,13,15,15]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0,1,2,3,4,5,6],ymm1[7],ymm0[8,9,10,11,12,13,14],ymm1[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0,1,2,3,4,5,6],ymm1[7],ymm0[8,9,10,11,12,13,14],ymm1[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_uu_uu_uu_01_uu_05_07_25_uu_uu_uu_09_uu_13_15_25:			; AVX512VL-LABEL: shuffle_v16i16_uu_uu_uu_01_uu_05_07_25_uu_uu_uu_09_uu_13_15_25:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = <u,u,u,1,u,5,7,25,u,u,u,9,u,13,15,25>			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = <u,u,u,1,u,5,7,25,u,u,u,9,u,13,15,25>
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 undef, i32 undef, i32 undef, i32 1, i32 undef, i32 5, i32 7, i32 25, i32 undef, i32 undef, i32 undef, i32 9, i32 undef, i32 13, i32 15, i32 25>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 undef, i32 undef, i32 undef, i32 1, i32 undef, i32 5, i32 7, i32 25, i32 undef, i32 undef, i32 undef, i32 9, i32 undef, i32 13, i32 15, i32 25>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_uu_uu_04_uu_16_18_20_uu_uu_uu_12_uu_24_26_28_uu(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_uu_uu_04_uu_16_18_20_uu_uu_uu_12_uu_24_26_28_uu(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_uu_uu_04_uu_16_18_20_uu_uu_uu_12_uu_24_26_28_uu:			; AVX1-LABEL: shuffle_v16i16_uu_uu_04_uu_16_18_20_uu_uu_uu_12_uu_24_26_28_uu:
	; [14 unchanged lines collapsed in the diff view]
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[0,1,4,5,4,5,6,7,0,1,4,5,8,9,4,5,16,17,20,21,20,21,22,23,16,17,20,21,24,25,20,21]			; AVX2-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[0,1,4,5,4,5,6,7,0,1,4,5,8,9,4,5,16,17,20,21,20,21,22,23,16,17,20,21,24,25,20,21]
	; AVX2-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,2,2,3,4,6,6,7]			; AVX2-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,2,2,3,4,6,6,7]
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0,1],ymm1[2,3],ymm0[4,5],ymm1[6,7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0,1],ymm1[2,3],ymm0[4,5],ymm1[6,7]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_uu_uu_04_uu_16_18_20_uu_uu_uu_12_uu_24_26_28_uu:			; AVX512VL-LABEL: shuffle_v16i16_uu_uu_04_uu_16_18_20_uu_uu_uu_12_uu_24_26_28_uu:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = <u,u,20,u,0,2,4,u,u,u,28,u,8,10,12,u>			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = <u,u,20,u,0,2,4,u,u,u,28,u,8,10,12,u>
	; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 undef, i32 undef, i32 4, i32 undef, i32 16, i32 18, i32 20, i32 undef, i32 undef, i32 undef, i32 12, i32 undef, i32 24, i32 26, i32 28, i32 undef>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 undef, i32 undef, i32 4, i32 undef, i32 16, i32 18, i32 20, i32 undef, i32 undef, i32 undef, i32 12, i32 undef, i32 24, i32 26, i32 28, i32 undef>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_21_22_23_00_01_02_03_12_29_30_31_08_09_10_11_12(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_21_22_23_00_01_02_03_12_29_30_31_08_09_10_11_12(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_21_22_23_00_01_02_03_12_29_30_31_08_09_10_11_12:			; AVX1-LABEL: shuffle_v16i16_21_22_23_00_01_02_03_12_29_30_31_08_09_10_11_12:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; [12 unchanged lines collapsed in the diff view]
	; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3],xmm1[4],xmm0[5,6,7]			; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3],xmm1[4],xmm0[5,6,7]
	; AVX2-NEXT: vpalignr {{.*#+}} xmm0 = xmm0[10,11,12,13,14,15,0,1,2,3,4,5,6,7,8,9]			; AVX2-NEXT: vpalignr {{.*#+}} xmm0 = xmm0[10,11,12,13,14,15,0,1,2,3,4,5,6,7,8,9]
	; AVX2-NEXT: vpalignr {{.*#+}} xmm1 = xmm1[10,11,12,13,14,15,0,1,2,3,4,5,6,7,8,9]			; AVX2-NEXT: vpalignr {{.*#+}} xmm1 = xmm1[10,11,12,13,14,15,0,1,2,3,4,5,6,7,8,9]
	; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_21_22_23_00_01_02_03_12_29_30_31_08_09_10_11_12:			; AVX512VL-LABEL: shuffle_v16i16_21_22_23_00_01_02_03_12_29_30_31_08_09_10_11_12:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [21,22,23,0,1,2,3,12,29,30,31,8,9,10,11,12]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [21,22,23,0,1,2,3,12,29,30,31,8,9,10,11,12]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 21, i32 22, i32 23, i32 0, i32 1, i32 2, i32 3, i32 12, i32 29, i32 30, i32 31, i32 8, i32 9, i32 10, i32 11, i32 12>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 21, i32 22, i32 23, i32 0, i32 1, i32 2, i32 3, i32 12, i32 29, i32 30, i32 31, i32 8, i32 9, i32 10, i32 11, i32 12>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_uu_22_uu_uu_01_02_03_uu_uu_30_uu_uu_09_10_11_uu(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_uu_22_uu_uu_01_02_03_uu_uu_30_uu_uu_09_10_11_uu(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_uu_22_uu_uu_01_02_03_uu_uu_30_uu_uu_09_10_11_uu:			; AVX1-LABEL: shuffle_v16i16_uu_22_uu_uu_01_02_03_uu_uu_30_uu_uu_09_10_11_uu:
	; [98 unchanged lines collapsed in the diff view]
	; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1],xmm1[2],xmm0[3,4,5,6,7]			; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1],xmm1[2],xmm0[3,4,5,6,7]
	; AVX2-NEXT: vpalignr {{.*#+}} xmm0 = xmm0[6,7,8,9,10,11,12,13,14,15,0,1,2,3,4,5]			; AVX2-NEXT: vpalignr {{.*#+}} xmm0 = xmm0[6,7,8,9,10,11,12,13,14,15,0,1,2,3,4,5]
	; AVX2-NEXT: vpalignr {{.*#+}} xmm1 = xmm1[6,7,8,9,10,11,12,13,14,15,0,1,2,3,4,5]			; AVX2-NEXT: vpalignr {{.*#+}} xmm1 = xmm1[6,7,8,9,10,11,12,13,14,15,0,1,2,3,4,5]
	; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_19_20_21_22_23_00_01_10_27_28_29_30_31_08_09_10:			; AVX512VL-LABEL: shuffle_v16i16_19_20_21_22_23_00_01_10_27_28_29_30_31_08_09_10:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [3,4,5,6,7,16,17,26,11,12,13,14,15,24,25,26]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [3,4,5,6,7,16,17,26,11,12,13,14,15,24,25,26]
	; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 19, i32 20, i32 21, i32 22, i32 23, i32 0, i32 1, i32 10, i32 27, i32 28, i32 29, i32 30, i32 31, i32 8, i32 9, i32 10>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 19, i32 20, i32 21, i32 22, i32 23, i32 0, i32 1, i32 10, i32 27, i32 28, i32 29, i32 30, i32 31, i32 8, i32 9, i32 10>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_uu_20_21_22_uu_uu_01_uu_uu_28_29_30_uu_uu_09_uu(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_uu_20_21_22_uu_uu_01_uu_uu_28_29_30_uu_uu_09_uu(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_uu_20_21_22_uu_uu_01_uu_uu_28_29_30_uu_uu_09_uu:			; AVX1-LABEL: shuffle_v16i16_uu_20_21_22_uu_uu_01_uu_uu_28_29_30_uu_uu_09_uu:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; [97 unchanged lines collapsed in the diff view]
	; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1],xmm1[2],xmm0[3,4,5,6,7]			; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1],xmm1[2],xmm0[3,4,5,6,7]
	; AVX2-NEXT: vpalignr {{.*#+}} xmm0 = xmm0[6,7,8,9,10,11,12,13,14,15,0,1,2,3,4,5]			; AVX2-NEXT: vpalignr {{.*#+}} xmm0 = xmm0[6,7,8,9,10,11,12,13,14,15,0,1,2,3,4,5]
	; AVX2-NEXT: vpalignr {{.*#+}} xmm1 = xmm1[6,7,8,9,10,11,12,13,14,15,0,1,2,3,4,5]			; AVX2-NEXT: vpalignr {{.*#+}} xmm1 = xmm1[6,7,8,9,10,11,12,13,14,15,0,1,2,3,4,5]
	; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_03_04_05_06_07_16_17_26_11_12_13_14_15_24_25_26:			; AVX512VL-LABEL: shuffle_v16i16_03_04_05_06_07_16_17_26_11_12_13_14_15_24_25_26:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [3,4,5,6,7,16,17,26,11,12,13,14,15,24,25,26]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [3,4,5,6,7,16,17,26,11,12,13,14,15,24,25,26]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 3, i32 4, i32 5, i32 6, i32 7, i32 16, i32 17, i32 26, i32 11, i32 12, i32 13, i32 14, i32 15, i32 24, i32 25, i32 26>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 3, i32 4, i32 5, i32 6, i32 7, i32 16, i32 17, i32 26, i32 11, i32 12, i32 13, i32 14, i32 15, i32 24, i32 25, i32 26>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_uu_04_05_06_uu_uu_17_uu_uu_12_13_14_uu_uu_25_uu(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_uu_04_05_06_uu_uu_17_uu_uu_12_13_14_uu_uu_25_uu(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_uu_04_05_06_uu_uu_17_uu_uu_12_13_14_uu_uu_25_uu:			; AVX1-LABEL: shuffle_v16i16_uu_04_05_06_uu_uu_17_uu_uu_12_13_14_uu_uu_25_uu:
	; [31 unchanged lines collapsed in the diff view]
	; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3],xmm1[4],xmm0[5,6,7]			; AVX2-NEXT: vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3],xmm1[4],xmm0[5,6,7]
	; AVX2-NEXT: vpalignr {{.*#+}} xmm0 = xmm0[10,11,12,13,14,15,0,1,2,3,4,5,6,7,8,9]			; AVX2-NEXT: vpalignr {{.*#+}} xmm0 = xmm0[10,11,12,13,14,15,0,1,2,3,4,5,6,7,8,9]
	; AVX2-NEXT: vpalignr {{.*#+}} xmm1 = xmm1[10,11,12,13,14,15,0,1,2,3,4,5,6,7,8,9]			; AVX2-NEXT: vpalignr {{.*#+}} xmm1 = xmm1[10,11,12,13,14,15,0,1,2,3,4,5,6,7,8,9]
	; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0			; AVX2-NEXT: vinserti128 $1, %xmm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_05_06_07_16_17_18_19_28_13_14_15_24_25_26_27_28:			; AVX512VL-LABEL: shuffle_v16i16_05_06_07_16_17_18_19_28_13_14_15_24_25_26_27_28:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [21,22,23,0,1,2,3,12,29,30,31,8,9,10,11,12]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [21,22,23,0,1,2,3,12,29,30,31,8,9,10,11,12]
	; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 5, i32 6, i32 7, i32 16, i32 17, i32 18, i32 19, i32 28, i32 13, i32 14, i32 15, i32 24, i32 25, i32 26, i32 27, i32 28>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 5, i32 6, i32 7, i32 16, i32 17, i32 18, i32 19, i32 28, i32 13, i32 14, i32 15, i32 24, i32 25, i32 26, i32 27, i32 28>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_uu_06_uu_uu_17_18_19_uu_uu_14_uu_uu_25_26_27_uu(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_uu_06_uu_uu_17_18_19_uu_uu_14_uu_uu_25_26_27_uu(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_uu_06_uu_uu_17_18_19_uu_uu_14_uu_uu_25_26_27_uu:			; AVX1-LABEL: shuffle_v16i16_uu_06_uu_uu_17_18_19_uu_uu_14_uu_uu_25_26_27_uu:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	Show All 29 Lines
	; AVX2-LABEL: shuffle_v16i16_23_uu_03_uu_20_20_05_uu_31_uu_11_uu_28_28_13_uu:			; AVX2-LABEL: shuffle_v16i16_23_uu_03_uu_20_20_05_uu_31_uu_11_uu_28_28_13_uu:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0,1,2,3],ymm1[4],ymm0[5,6],ymm1[7],ymm0[8,9,10,11],ymm1[12],ymm0[13,14],ymm1[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0,1,2,3],ymm1[4],ymm0[5,6],ymm1[7],ymm0[8,9,10,11],ymm1[12],ymm0[13,14],ymm1[15]
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[14,15,14,15,6,7,6,7,8,9,8,9,10,11,14,15,30,31,30,31,22,23,22,23,24,25,24,25,26,27,30,31]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[14,15,14,15,6,7,6,7,8,9,8,9,10,11,14,15,30,31,30,31,22,23,22,23,24,25,24,25,26,27,30,31]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_23_uu_03_uu_20_20_05_uu_31_uu_11_uu_28_28_13_uu:			; AVX512VL-LABEL: shuffle_v16i16_23_uu_03_uu_20_20_05_uu_31_uu_11_uu_28_28_13_uu:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = <7,u,19,u,4,4,21,u,15,u,27,u,12,12,29,u>			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = <7,u,19,u,4,4,21,u,15,u,27,u,12,12,29,u>
	; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 23, i32 undef, i32 3, i32 undef, i32 20, i32 20, i32 5, i32 undef, i32 31, i32 undef, i32 11, i32 undef, i32 28, i32 28, i32 13, i32 undef>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 23, i32 undef, i32 3, i32 undef, i32 20, i32 20, i32 5, i32 undef, i32 31, i32 undef, i32 11, i32 undef, i32 28, i32 28, i32 13, i32 undef>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @shuffle_v16i16_u_u_u_u_u_u_u_u_0_16_1_17_2_18_3_19(<16 x i16> %a, <16 x i16> %b) {			define <16 x i16> @shuffle_v16i16_u_u_u_u_u_u_u_u_0_16_1_17_2_18_3_19(<16 x i16> %a, <16 x i16> %b) {
	; AVX1-LABEL: shuffle_v16i16_u_u_u_u_u_u_u_u_0_16_1_17_2_18_3_19:			; AVX1-LABEL: shuffle_v16i16_u_u_u_u_u_u_u_u_0_16_1_17_2_18_3_19:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	Show All 125 Lines
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[0,1,4,5,4,5,6,7,0,1,0,1,12,13,2,3,16,17,20,21,20,21,22,23,16,17,16,17,28,29,18,19]			; AVX2-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[0,1,4,5,4,5,6,7,0,1,0,1,12,13,2,3,16,17,20,21,20,21,22,23,16,17,16,17,28,29,18,19]
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[4,5,2,3,6,7,6,7,0,1,2,3,2,3,14,15,20,21,18,19,22,23,22,23,16,17,18,19,18,19,30,31]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[4,5,2,3,6,7,6,7,0,1,2,3,2,3,14,15,20,21,18,19,22,23,22,23,16,17,18,19,18,19,30,31]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_02_18_03_19_00_16_01_17_10_26_11_27_08_24_09_25:			; AVX512VL-LABEL: shuffle_v16i16_02_18_03_19_00_16_01_17_10_26_11_27_08_24_09_25:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [2,18,3,19,0,16,1,17,10,26,11,27,8,24,9,25]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [2,18,3,19,0,16,1,17,10,26,11,27,8,24,9,25]
	; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%1 = shufflevector <16 x i16> %a0, <16 x i16> %a1, <16 x i32> <i32 2, i32 18, i32 3, i32 19, i32 0, i32 16, i32 1, i32 17, i32 10, i32 26, i32 11, i32 27, i32 8, i32 24, i32 9, i32 25>			%1 = shufflevector <16 x i16> %a0, <16 x i16> %a1, <16 x i32> <i32 2, i32 18, i32 3, i32 19, i32 0, i32 16, i32 1, i32 17, i32 10, i32 26, i32 11, i32 27, i32 8, i32 24, i32 9, i32 25>
	ret <16 x i16> %1			ret <16 x i16> %1
	}			}

	define <16 x i16> @shuffle_v16i16_02_18_03_19_10_26_11_27_00_16_01_17_08_24_09_25(<16 x i16> %a0, <16 x i16> %a1) {			define <16 x i16> @shuffle_v16i16_02_18_03_19_10_26_11_27_00_16_01_17_08_24_09_25(<16 x i16> %a0, <16 x i16> %a1) {
	; AVX1-LABEL: shuffle_v16i16_02_18_03_19_10_26_11_27_00_16_01_17_08_24_09_25:			; AVX1-LABEL: shuffle_v16i16_02_18_03_19_10_26_11_27_00_16_01_17_08_24_09_25:
	Show All 20 Lines
	; AVX2-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[0,1,4,5,4,5,6,7,0,1,0,1,12,13,2,3,16,17,20,21,20,21,22,23,16,17,16,17,28,29,18,19]			; AVX2-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[0,1,4,5,4,5,6,7,0,1,0,1,12,13,2,3,16,17,20,21,20,21,22,23,16,17,16,17,28,29,18,19]
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[4,5,2,3,6,7,6,7,0,1,2,3,2,3,14,15,20,21,18,19,22,23,22,23,16,17,18,19,18,19,30,31]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[4,5,2,3,6,7,6,7,0,1,2,3,2,3,14,15,20,21,18,19,22,23,22,23,16,17,18,19,18,19,30,31]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7],ymm0[8],ymm1[9],ymm0[10],ymm1[11],ymm0[12],ymm1[13],ymm0[14],ymm1[15]
	; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3]			; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,1,3]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v16i16_02_18_03_19_10_26_11_27_00_16_01_17_08_24_09_25:			; AVX512VL-LABEL: shuffle_v16i16_02_18_03_19_10_26_11_27_00_16_01_17_08_24_09_25:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [2,18,3,19,0,16,1,17,10,26,11,27,8,24,9,25]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [2,18,3,19,0,16,1,17,10,26,11,27,8,24,9,25]
	; AVX512VL-NEXT: vpermi2w %ymm1, %ymm0, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm1, %ymm0, %ymm2
	; AVX512VL-NEXT: vpermq {{.*#+}} ymm0 = ymm2[0,2,1,3]			; AVX512VL-NEXT: vpermq {{.*#+}} ymm0 = ymm2[0,2,1,3]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%1 = shufflevector <16 x i16> %a0, <16 x i16> %a1, <16 x i32> <i32 2, i32 18, i32 3, i32 19, i32 0, i32 16, i32 1, i32 17, i32 10, i32 26, i32 11, i32 27, i32 8, i32 24, i32 9, i32 25>			%1 = shufflevector <16 x i16> %a0, <16 x i16> %a1, <16 x i32> <i32 2, i32 18, i32 3, i32 19, i32 0, i32 16, i32 1, i32 17, i32 10, i32 26, i32 11, i32 27, i32 8, i32 24, i32 9, i32 25>
	%2 = bitcast <16 x i16> %1 to <4 x i64>			%2 = bitcast <16 x i16> %1 to <4 x i64>
	%3 = shufflevector <4 x i64> %2, <4 x i64> undef, <4 x i32> <i32 0, i32 2, i32 1, i32 3>			%3 = shufflevector <4 x i64> %2, <4 x i64> undef, <4 x i32> <i32 0, i32 2, i32 1, i32 3>
	%4 = bitcast <4 x i64> %3 to <16 x i16>			%4 = bitcast <4 x i64> %3 to <16 x i16>
	ret <16 x i16> %4			ret <16 x i16> %4
	Show All 73 Lines
	; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,5,5,6,7,8,9,10,11,13,13,14,15]			; AVX2-NEXT: vpshufhw {{.*#+}} ymm0 = ymm0[0,1,2,3,5,5,6,7,8,9,10,11,13,13,14,15]
	; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm2[0],ymm0[1,2],ymm2[3],ymm0[4],ymm2[5,6,7,8],ymm0[9,10],ymm2[11],ymm0[12],ymm2[13,14,15]			; AVX2-NEXT: vpblendw {{.*#+}} ymm0 = ymm2[0],ymm0[1,2],ymm2[3],ymm0[4],ymm2[5,6,7,8],ymm0[9,10],ymm2[11],ymm0[12],ymm2[13,14,15]
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,255,255,255,0,0,255,255,255,255,255,255,0,0,255,255,0,0,0,0,255,255,255,255,0,0,0,0,0,0,255,255]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,255,255,255,0,0,255,255,255,255,255,255,0,0,255,255,0,0,0,0,255,255,255,255,0,0,0,0,0,0,255,255]
	; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: PR24935:			; AVX512VL-LABEL: PR24935:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu16 {{.*#+}} ymm2 = [11,10,17,13,10,7,27,0,17,25,0,12,29,20,16,8]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [11,10,17,13,10,7,27,0,17,25,0,12,29,20,16,8]
	; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2w %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 27, i32 26, i32 1, i32 29, i32 26, i32 23, i32 11, i32 16, i32 1, i32 9, i32 16, i32 28, i32 13, i32 4, i32 0, i32 24>			%shuffle = shufflevector <16 x i16> %a, <16 x i16> %b, <16 x i32> <i32 27, i32 26, i32 1, i32 29, i32 26, i32 23, i32 11, i32 16, i32 1, i32 9, i32 16, i32 28, i32 13, i32 4, i32 0, i32 24>
	ret <16 x i16> %shuffle			ret <16 x i16> %shuffle
	}			}

	define <16 x i16> @insert_dup_mem_v16i16_i32(i32* %ptr) {			define <16 x i16> @insert_dup_mem_v16i16_i32(i32* %ptr) {
	; AVX1-LABEL: insert_dup_mem_v16i16_i32:			; AVX1-LABEL: insert_dup_mem_v16i16_i32:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	Show All 82 Lines

llvm/trunk/test/CodeGen/X86/vector-shuffle-256-v32.ll

	Show All 311 Lines
	; AVX2-NEXT: vpbroadcastb %xmm0, %xmm0			; AVX2-NEXT: vpbroadcastb %xmm0, %xmm0
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
	; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_16_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX512VL-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_16_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]			; AVX512VL-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]
	; AVX512VL-NEXT: vpxord %ymm2, %ymm2, %ymm2			; AVX512VL-NEXT: vpxor %ymm2, %ymm2, %ymm2
	; AVX512VL-NEXT: vpshufb %ymm2, %ymm1, %ymm1			; AVX512VL-NEXT: vpshufb %ymm2, %ymm1, %ymm1
	; AVX512VL-NEXT: vpbroadcastb %xmm0, %xmm0			; AVX512VL-NEXT: vpbroadcastb %xmm0, %xmm0
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 16, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 16, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_17_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00(<32 x i8> %a, <32 x i8> %b) {			define <32 x i8> @shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_17_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00(<32 x i8> %a, <32 x i8> %b) {
	; AVX1-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_17_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX1-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_17_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00:
	Show All 12 Lines
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = <0,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u>			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = <0,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u>
	; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_17_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX512VL-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_17_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]			; AVX512VL-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = <0,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u>			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = <0,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u>
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX512VL-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16]			; AVX512VL-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 17, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 17, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_18_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00(<32 x i8> %a, <32 x i8> %b) {			define <32 x i8> @shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_18_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00(<32 x i8> %a, <32 x i8> %b) {
	Show All 13 Lines
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = <0,0,255,255,u,u,u,u,u,u,u,u,u,u,u,u,255,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u>			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = <0,0,255,255,u,u,u,u,u,u,u,u,u,u,u,u,255,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u>
	; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_18_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX512VL-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_18_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]			; AVX512VL-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = <0,0,255,255,u,u,u,u,u,u,u,u,u,u,u,u,255,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u>			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = <0,0,255,255,u,u,u,u,u,u,u,u,u,u,u,u,255,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u>
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX512VL-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16]			; AVX512VL-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 18, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 18, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_19_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00(<32 x i8> %a, <32 x i8> %b) {			define <32 x i8> @shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_19_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00(<32 x i8> %a, <32 x i8> %b) {
	Show All 13 Lines
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = <0,0,255,255,u,u,u,u,u,u,u,u,u,u,u,u,255,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u>			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = <0,0,255,255,u,u,u,u,u,u,u,u,u,u,u,u,255,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u>
	; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16]			; AVX2-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_19_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00:			; AVX512VL-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_19_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]			; AVX512VL-NEXT: vperm2i128 {{.*#+}} ymm1 = ymm0[2,3,0,1]
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = <0,0,255,255,u,u,u,u,u,u,u,u,u,u,u,u,255,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u>			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = <0,0,255,255,u,u,u,u,u,u,u,u,u,u,u,u,255,255,u,u,u,u,u,u,u,u,u,u,u,u,u,u>
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX512VL-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16]			; AVX512VL-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 19, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 19, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_20_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00(<32 x i8> %a, <32 x i8> %b) {			define <32 x i8> @shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_20_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00(<32 x i8> %a, <32 x i8> %b) {
	Show All 323 Lines
	; AVX2-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_16_16_16_16_16_16_16_16_16_16_16_16_16_16_16_16:			; AVX2-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_16_16_16_16_16_16_16_16_16_16_16_16_16_16_16_16:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpxor %ymm1, %ymm1, %ymm1			; AVX2-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; AVX2-NEXT: vpshufb %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpshufb %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_16_16_16_16_16_16_16_16_16_16_16_16_16_16_16_16:			; AVX512VL-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_00_16_16_16_16_16_16_16_16_16_16_16_16_16_16_16_16:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %ymm1, %ymm1, %ymm1			; AVX512VL-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; AVX512VL-NEXT: vpshufb %ymm1, %ymm0, %ymm0			; AVX512VL-NEXT: vpshufb %ymm1, %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_15_15_15_15_15_15_15_15_15_15_15_15_15_15_15_15_31_31_31_31_31_31_31_31_31_31_31_31_31_31_31_31(<32 x i8> %a, <32 x i8> %b) {			define <32 x i8> @shuffle_v32i8_15_15_15_15_15_15_15_15_15_15_15_15_15_15_15_15_31_31_31_31_31_31_31_31_31_31_31_31_31_31_31_31(<32 x i8> %a, <32 x i8> %b) {
	; AVX1-LABEL: shuffle_v32i8_15_15_15_15_15_15_15_15_15_15_15_15_15_15_15_15_31_31_31_31_31_31_31_31_31_31_31_31_31_31_31_31:			; AVX1-LABEL: shuffle_v32i8_15_15_15_15_15_15_15_15_15_15_15_15_15_15_15_15_31_31_31_31_31_31_31_31_31_31_31_31_31_31_31_31:
	Show All 271 Lines
	; AVX2-LABEL: shuffle_v32i8_00_33_02_35_04_37_06_39_08_41_10_43_12_45_14_47_16_49_18_51_20_53_22_55_24_57_26_59_28_61_30_63:			; AVX2-LABEL: shuffle_v32i8_00_33_02_35_04_37_06_39_08_41_10_43_12_45_14_47_16_49_18_51_20_53_22_55_24_57_26_59_28_61_30_63:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_00_33_02_35_04_37_06_39_08_41_10_43_12_45_14_47_16_49_18_51_20_53_22_55_24_57_26_59_28_61_30_63:			; AVX512VL-LABEL: shuffle_v32i8_00_33_02_35_04_37_06_39_08_41_10_43_12_45_14_47_16_49_18_51_20_53_22_55_24_57_26_59_28_61_30_63:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 33, i32 2, i32 35, i32 4, i32 37, i32 6, i32 39, i32 8, i32 41, i32 10, i32 43, i32 12, i32 45, i32 14, i32 47, i32 16, i32 49, i32 18, i32 51, i32 20, i32 53, i32 22, i32 55, i32 24, i32 57, i32 26, i32 59, i32 28, i32 61, i32 30, i32 63>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 33, i32 2, i32 35, i32 4, i32 37, i32 6, i32 39, i32 8, i32 41, i32 10, i32 43, i32 12, i32 45, i32 14, i32 47, i32 16, i32 49, i32 18, i32 51, i32 20, i32 53, i32 22, i32 55, i32 24, i32 57, i32 26, i32 59, i32 28, i32 61, i32 30, i32 63>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_32_01_34_03_36_05_38_07_40_09_42_11_44_13_46_15_48_17_50_19_52_21_54_23_56_25_58_27_60_29_62_31(<32 x i8> %a, <32 x i8> %b) {			define <32 x i8> @shuffle_v32i8_32_01_34_03_36_05_38_07_40_09_42_11_44_13_46_15_48_17_50_19_52_21_54_23_56_25_58_27_60_29_62_31(<32 x i8> %a, <32 x i8> %b) {
	; AVX1-LABEL: shuffle_v32i8_32_01_34_03_36_05_38_07_40_09_42_11_44_13_46_15_48_17_50_19_52_21_54_23_56_25_58_27_60_29_62_31:			; AVX1-LABEL: shuffle_v32i8_32_01_34_03_36_05_38_07_40_09_42_11_44_13_46_15_48_17_50_19_52_21_54_23_56_25_58_27_60_29_62_31:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vmovaps {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX1-NEXT: vmovaps {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX1-NEXT: vandnps %ymm0, %ymm2, %ymm0			; AVX1-NEXT: vandnps %ymm0, %ymm2, %ymm0
	; AVX1-NEXT: vandps %ymm2, %ymm1, %ymm1			; AVX1-NEXT: vandps %ymm2, %ymm1, %ymm1
	; AVX1-NEXT: vorps %ymm0, %ymm1, %ymm0			; AVX1-NEXT: vorps %ymm0, %ymm1, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v32i8_32_01_34_03_36_05_38_07_40_09_42_11_44_13_46_15_48_17_50_19_52_21_54_23_56_25_58_27_60_29_62_31:			; AVX2-LABEL: shuffle_v32i8_32_01_34_03_36_05_38_07_40_09_42_11_44_13_46_15_48_17_50_19_52_21_54_23_56_25_58_27_60_29_62_31:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_32_01_34_03_36_05_38_07_40_09_42_11_44_13_46_15_48_17_50_19_52_21_54_23_56_25_58_27_60_29_62_31:			; AVX512VL-LABEL: shuffle_v32i8_32_01_34_03_36_05_38_07_40_09_42_11_44_13_46_15_48_17_50_19_52_21_54_23_56_25_58_27_60_29_62_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 32, i32 1, i32 34, i32 3, i32 36, i32 5, i32 38, i32 7, i32 40, i32 9, i32 42, i32 11, i32 44, i32 13, i32 46, i32 15, i32 48, i32 17, i32 50, i32 19, i32 52, i32 21, i32 54, i32 23, i32 56, i32 25, i32 58, i32 27, i32 60, i32 29, i32 62, i32 31>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 32, i32 1, i32 34, i32 3, i32 36, i32 5, i32 38, i32 7, i32 40, i32 9, i32 42, i32 11, i32 44, i32 13, i32 46, i32 15, i32 48, i32 17, i32 50, i32 19, i32 52, i32 21, i32 54, i32 23, i32 56, i32 25, i32 58, i32 27, i32 60, i32 29, i32 62, i32 31>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_zz_01_zz_03_zz_05_zz_07_zz_09_zz_11_zz_13_zz_15_zz_17_zz_19_zz_21_zz_23_zz_25_zz_27_zz_29_zz_31(<32 x i8> %a) {			define <32 x i8> @shuffle_v32i8_zz_01_zz_03_zz_05_zz_07_zz_09_zz_11_zz_13_zz_15_zz_17_zz_19_zz_21_zz_23_zz_25_zz_27_zz_29_zz_31(<32 x i8> %a) {
	; AVX1OR2-LABEL: shuffle_v32i8_zz_01_zz_03_zz_05_zz_07_zz_09_zz_11_zz_13_zz_15_zz_17_zz_19_zz_21_zz_23_zz_25_zz_27_zz_29_zz_31:			; AVX1OR2-LABEL: shuffle_v32i8_zz_01_zz_03_zz_05_zz_07_zz_09_zz_11_zz_13_zz_15_zz_17_zz_19_zz_21_zz_23_zz_25_zz_27_zz_29_zz_31:
	; AVX1OR2: # BB#0:			; AVX1OR2: # BB#0:
	; AVX1OR2-NEXT: vandps {{.*}}(%rip), %ymm0, %ymm0			; AVX1OR2-NEXT: vandps {{.*}}(%rip), %ymm0, %ymm0
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_zz_01_zz_03_zz_05_zz_07_zz_09_zz_11_zz_13_zz_15_zz_17_zz_19_zz_21_zz_23_zz_25_zz_27_zz_29_zz_31:			; AVX512VL-LABEL: shuffle_v32i8_zz_01_zz_03_zz_05_zz_07_zz_09_zz_11_zz_13_zz_15_zz_17_zz_19_zz_21_zz_23_zz_25_zz_27_zz_29_zz_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpandq {{.*}}(%rip), %ymm0, %ymm0			; AVX512VL-NEXT: vpand {{.*}}(%rip), %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> zeroinitializer, <32 x i32> <i32 32, i32 1, i32 34, i32 3, i32 36, i32 5, i32 38, i32 7, i32 40, i32 9, i32 42, i32 11, i32 44, i32 13, i32 46, i32 15, i32 48, i32 17, i32 50, i32 19, i32 52, i32 21, i32 54, i32 23, i32 56, i32 25, i32 58, i32 27, i32 60, i32 29, i32 62, i32 31>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> zeroinitializer, <32 x i32> <i32 32, i32 1, i32 34, i32 3, i32 36, i32 5, i32 38, i32 7, i32 40, i32 9, i32 42, i32 11, i32 44, i32 13, i32 46, i32 15, i32 48, i32 17, i32 50, i32 19, i32 52, i32 21, i32 54, i32 23, i32 56, i32 25, i32 58, i32 27, i32 60, i32 29, i32 62, i32 31>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_01_zz_02_zz_04_uu_06_07_08_09_10_11_12_13_14_15_u6_17_18_19_20_21_22_23_24_25_26_27_28_29_30_31(<32 x i8> %a) {			define <32 x i8> @shuffle_v32i8_01_zz_02_zz_04_uu_06_07_08_09_10_11_12_13_14_15_u6_17_18_19_20_21_22_23_24_25_26_27_28_29_30_31(<32 x i8> %a) {
	; AVX1-LABEL: shuffle_v32i8_01_zz_02_zz_04_uu_06_07_08_09_10_11_12_13_14_15_u6_17_18_19_20_21_22_23_24_25_26_27_28_29_30_31:			; AVX1-LABEL: shuffle_v32i8_01_zz_02_zz_04_uu_06_07_08_09_10_11_12_13_14_15_u6_17_18_19_20_21_22_23_24_25_26_27_28_29_30_31:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	Show All 49 Lines
	; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]			; AVX2-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]
	; AVX2-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,0,1,1,4,4,5,5]			; AVX2-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,0,1,1,4,4,5,5]
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_00_32_00_32_00_32_00_32_00_32_00_32_00_32_00_32_16_48_16_48_16_48_16_48_16_48_16_48_16_48_16_48:			; AVX512VL-LABEL: shuffle_v32i8_00_32_00_32_00_32_00_32_00_32_00_32_00_32_00_32_16_48_16_48_16_48_16_48_16_48_16_48_16_48_16_48:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %ymm2, %ymm2, %ymm2			; AVX512VL-NEXT: vpxor %ymm2, %ymm2, %ymm2
	; AVX512VL-NEXT: vpshufb %ymm2, %ymm1, %ymm1			; AVX512VL-NEXT: vpshufb %ymm2, %ymm1, %ymm1
	; AVX512VL-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]			; AVX512VL-NEXT: vpshuflw {{.*#+}} ymm0 = ymm0[0,0,0,0,4,5,6,7,8,8,8,8,12,13,14,15]
	; AVX512VL-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,0,1,1,4,4,5,5]			; AVX512VL-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,0,1,1,4,4,5,5]
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 32, i32 0, i32 32, i32 0, i32 32, i32 0, i32 32, i32 0, i32 32, i32 0, i32 32, i32 0, i32 32, i32 0, i32 32, i32 16, i32 48, i32 16, i32 48, i32 16, i32 48, i32 16, i32 48, i32 16, i32 48, i32 16, i32 48, i32 16, i32 48, i32 16, i32 48>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 32, i32 0, i32 32, i32 0, i32 32, i32 0, i32 32, i32 0, i32 32, i32 0, i32 32, i32 0, i32 32, i32 0, i32 32, i32 16, i32 48, i32 16, i32 48, i32 16, i32 48, i32 16, i32 48, i32 16, i32 48, i32 16, i32 48, i32 16, i32 48, i32 16, i32 48>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_32_32_32_32_32_32_32_32_08_09_10_11_12_13_14_15_48_48_48_48_48_48_48_48_24_25_26_27_28_29_30_31(<32 x i8> %a, <32 x i8> %b) {			define <32 x i8> @shuffle_v32i8_32_32_32_32_32_32_32_32_08_09_10_11_12_13_14_15_48_48_48_48_48_48_48_48_24_25_26_27_28_29_30_31(<32 x i8> %a, <32 x i8> %b) {
	; AVX1-LABEL: shuffle_v32i8_32_32_32_32_32_32_32_32_08_09_10_11_12_13_14_15_48_48_48_48_48_48_48_48_24_25_26_27_28_29_30_31:			; AVX1-LABEL: shuffle_v32i8_32_32_32_32_32_32_32_32_08_09_10_11_12_13_14_15_48_48_48_48_48_48_48_48_24_25_26_27_28_29_30_31:
	(13 lines not shown)
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpxor %ymm2, %ymm2, %ymm2			; AVX2-NEXT: vpxor %ymm2, %ymm2, %ymm2
	; AVX2-NEXT: vpshufb %ymm2, %ymm1, %ymm1			; AVX2-NEXT: vpshufb %ymm2, %ymm1, %ymm1
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm1[0,1],ymm0[2,3],ymm1[4,5],ymm0[6,7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm1[0,1],ymm0[2,3],ymm1[4,5],ymm0[6,7]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_32_32_32_32_32_32_32_32_08_09_10_11_12_13_14_15_48_48_48_48_48_48_48_48_24_25_26_27_28_29_30_31:			; AVX512VL-LABEL: shuffle_v32i8_32_32_32_32_32_32_32_32_08_09_10_11_12_13_14_15_48_48_48_48_48_48_48_48_24_25_26_27_28_29_30_31:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpxord %ymm2, %ymm2, %ymm2			; AVX512VL-NEXT: vpxor %ymm2, %ymm2, %ymm2
	; AVX512VL-NEXT: vpshufb %ymm2, %ymm1, %ymm1			; AVX512VL-NEXT: vpshufb %ymm2, %ymm1, %ymm1
	; AVX512VL-NEXT: vpblendd {{.*#+}} ymm0 = ymm1[0,1],ymm0[2,3],ymm1[4,5],ymm0[6,7]			; AVX512VL-NEXT: vpblendd {{.*#+}} ymm0 = ymm1[0,1],ymm0[2,3],ymm1[4,5],ymm0[6,7]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 48, i32 48, i32 48, i32 48, i32 48, i32 48, i32 48, i32 48, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 48, i32 48, i32 48, i32 48, i32 48, i32 48, i32 48, i32 48, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_39_38_37_36_35_34_33_32_15_14_13_12_11_10_09_08_55_54_53_52_51_50_49_48_31_30_29_28_27_26_25_24(<32 x i8> %a, <32 x i8> %b) {			define <32 x i8> @shuffle_v32i8_39_38_37_36_35_34_33_32_15_14_13_12_11_10_09_08_55_54_53_52_51_50_49_48_31_30_29_28_27_26_25_24(<32 x i8> %a, <32 x i8> %b) {
	(206 lines not shown)
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_00_32_01_33_02_34_03_35_04_36_05_37_06_38_07_39_24_56_25_57_26_58_27_59_28_60_29_61_30_62_31_63:			; AVX512VL-LABEL: shuffle_v32i8_00_32_01_33_02_34_03_35_04_36_05_37_06_38_07_39_24_56_25_57_26_58_27_59_28_60_29_61_30_62_31_63:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,u,1,u,2,u,3,u,4,u,5,u,6,u,7,u,24,u,25,u,26,u,27,u,28,u,29,u,30,u,31,u]			; AVX512VL-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[0,u,1,u,2,u,3,u,4,u,5,u,6,u,7,u,24,u,25,u,26,u,27,u,28,u,29,u,30,u,31,u]
	; AVX512VL-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[u,0,u,1,u,2,u,3,u,4,u,5,u,6,u,7,u,24,u,25,u,26,u,27,u,28,u,29,u,30,u,31]			; AVX512VL-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[u,0,u,1,u,2,u,3,u,4,u,5,u,6,u,7,u,24,u,25,u,26,u,27,u,28,u,29,u,30,u,31]
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 32, i32 1, i32 33, i32 2, i32 34, i32 3, i32 35, i32 4, i32 36, i32 5, i32 37, i32 6, i32 38, i32 7, i32 39, i32 24, i32 56, i32 25, i32 57, i32 26, i32 58, i32 27, i32 59, i32 28, i32 60, i32 29, i32 61, i32 30, i32 62, i32 31, i32 63>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 0, i32 32, i32 1, i32 33, i32 2, i32 34, i32 3, i32 35, i32 4, i32 36, i32 5, i32 37, i32 6, i32 38, i32 7, i32 39, i32 24, i32 56, i32 25, i32 57, i32 26, i32 58, i32 27, i32 59, i32 28, i32 60, i32 29, i32 61, i32 30, i32 62, i32 31, i32 63>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_08_40_09_41_10_42_11_43_12_44_13_45_14_46_15_47_16_48_17_49_18_50_19_51_20_52_21_53_22_54_23_55(<32 x i8> %a, <32 x i8> %b) {			define <32 x i8> @shuffle_v32i8_08_40_09_41_10_42_11_43_12_44_13_45_14_46_15_47_16_48_17_49_18_50_19_51_20_52_21_53_22_54_23_55(<32 x i8> %a, <32 x i8> %b) {
	; AVX1-LABEL: shuffle_v32i8_08_40_09_41_10_42_11_43_12_44_13_45_14_46_15_47_16_48_17_49_18_50_19_51_20_52_21_53_22_54_23_55:			; AVX1-LABEL: shuffle_v32i8_08_40_09_41_10_42_11_43_12_44_13_45_14_46_15_47_16_48_17_49_18_50_19_51_20_52_21_53_22_54_23_55:
	(12 lines not shown)
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_08_40_09_41_10_42_11_43_12_44_13_45_14_46_15_47_16_48_17_49_18_50_19_51_20_52_21_53_22_54_23_55:			; AVX512VL-LABEL: shuffle_v32i8_08_40_09_41_10_42_11_43_12_44_13_45_14_46_15_47_16_48_17_49_18_50_19_51_20_52_21_53_22_54_23_55:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[8,u,9,u,10,u,11,u,12,u,13,u,14,u,15,u,16,u,17,u,18,u,19,u,20,u,21,u,22,u,23,u]			; AVX512VL-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[8,u,9,u,10,u,11,u,12,u,13,u,14,u,15,u,16,u,17,u,18,u,19,u,20,u,21,u,22,u,23,u]
	; AVX512VL-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[u,8,u,9,u,10,u,11,u,12,u,13,u,14,u,15,u,16,u,17,u,18,u,19,u,20,u,21,u,22,u,23]			; AVX512VL-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[u,8,u,9,u,10,u,11,u,12,u,13,u,14,u,15,u,16,u,17,u,18,u,19,u,20,u,21,u,22,u,23]
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0]
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 8, i32 40, i32 9, i32 41, i32 10, i32 42, i32 11, i32 43, i32 12, i32 44, i32 13, i32 45, i32 14, i32 46, i32 15, i32 47, i32 16, i32 48, i32 17, i32 49, i32 18, i32 50, i32 19, i32 51, i32 20, i32 52, i32 21, i32 53, i32 22, i32 54, i32 23, i32 55>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 8, i32 40, i32 9, i32 41, i32 10, i32 42, i32 11, i32 43, i32 12, i32 44, i32 13, i32 45, i32 14, i32 46, i32 15, i32 47, i32 16, i32 48, i32 17, i32 49, i32 18, i32 50, i32 19, i32 51, i32 20, i32 52, i32 21, i32 53, i32 22, i32 54, i32 23, i32 55>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_01_00_16_17_16_16_16_16_16_16_16_16_16_16_16_16_16_16(<32 x i8> %a, <32 x i8> %b) {			define <32 x i8> @shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_01_00_16_17_16_16_16_16_16_16_16_16_16_16_16_16_16_16(<32 x i8> %a, <32 x i8> %b) {
	; AVX1-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_01_00_16_17_16_16_16_16_16_16_16_16_16_16_16_16_16_16:			; AVX1-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_00_00_00_00_00_00_01_00_16_17_16_16_16_16_16_16_16_16_16_16_16_16_16_16:
	(239 lines not shown)
	; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v32i8_42_45_12_13_35_35_60_40_17_22_29_44_33_12_48_51_20_19_52_19_49_54_37_32_48_42_59_07_36_34_36_39:			; AVX512VL-LABEL: shuffle_v32i8_42_45_12_13_35_35_60_40_17_22_29_44_33_12_48_51_20_19_52_19_49_54_37_32_48_42_59_07_36_34_36_39:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vperm2i128 {{.*#+}} ymm2 = ymm1[2,3,0,1]			; AVX512VL-NEXT: vperm2i128 {{.*#+}} ymm2 = ymm1[2,3,0,1]
	; AVX512VL-NEXT: vpshufb {{.*#+}} ymm2 = ymm2[u,u,u,u,u,u,12,u,u,u,u,u,u,u,0,3,u,u,u,u,u,u,21,16,u,26,u,u,20,18,20,23]			; AVX512VL-NEXT: vpshufb {{.*#+}} ymm2 = ymm2[u,u,u,u,u,u,12,u,u,u,u,u,u,u,0,3,u,u,u,u,u,u,21,16,u,26,u,u,20,18,20,23]
	; AVX512VL-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[10,13,u,u,3,3,u,8,u,u,u,12,1,u,u,u,u,u,20,u,17,22,u,u,16,u,27,u,u,u,u,u]			; AVX512VL-NEXT: vpshufb {{.*#+}} ymm1 = ymm1[10,13,u,u,3,3,u,8,u,u,u,12,1,u,u,u,u,u,20,u,17,22,u,u,16,u,27,u,u,u,u,u]
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm3 = <255,255,u,u,255,255,0,255,u,u,u,255,255,u,0,0,u,u,255,u,255,255,0,0,255,0,255,u,0,0,0,0>			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm3 = <255,255,u,u,255,255,0,255,u,u,u,255,255,u,0,0,u,u,255,u,255,255,0,0,255,0,255,u,0,0,0,0>
	; AVX512VL-NEXT: vpblendvb %ymm3, %ymm1, %ymm2, %ymm1			; AVX512VL-NEXT: vpblendvb %ymm3, %ymm1, %ymm2, %ymm1
	; AVX512VL-NEXT: vperm2i128 {{.*#+}} ymm2 = ymm0[2,3,0,1]			; AVX512VL-NEXT: vperm2i128 {{.*#+}} ymm2 = ymm0[2,3,0,1]
	; AVX512VL-NEXT: vpshufb {{.*#+}} ymm2 = ymm2[u,u,u,u,u,u,u,u,1,6,13,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u,23,u,u,u,u]			; AVX512VL-NEXT: vpshufb {{.*#+}} ymm2 = ymm2[u,u,u,u,u,u,u,u,1,6,13,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u,23,u,u,u,u]
	; AVX512VL-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[u,u,12,13,u,u,u,u,u,u,u,u,u,12,u,u,20,19,u,19,u,u,u,u,u,u,u,u,u,u,u,u]			; AVX512VL-NEXT: vpshufb {{.*#+}} ymm0 = ymm0[u,u,12,13,u,u,u,u,u,u,u,u,u,12,u,u,20,19,u,19,u,u,u,u,u,u,u,u,u,u,u,u]
	; AVX512VL-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0,1],ymm2[2],ymm0[3,4,5],ymm2[6],ymm0[7]			; AVX512VL-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0,1],ymm2[2],ymm0[3,4,5],ymm2[6],ymm0[7]
	; AVX512VL-NEXT: vmovdqu8 {{.*#+}} ymm2 = [255,255,0,0,255,255,255,255,0,0,0,255,255,0,255,255,0,0,255,0,255,255,255,255,255,255,255,0,255,255,255,255]			; AVX512VL-NEXT: vmovdqu {{.*#+}} ymm2 = [255,255,0,0,255,255,255,255,0,0,0,255,255,0,255,255,0,0,255,0,255,255,255,255,255,255,255,0,255,255,255,255]
	; AVX512VL-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0			; AVX512VL-NEXT: vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 42, i32 45, i32 12, i32 13, i32 35, i32 35, i32 60, i32 40, i32 17, i32 22, i32 29, i32 44, i32 33, i32 12, i32 48, i32 51, i32 20, i32 19, i32 52, i32 19, i32 49, i32 54, i32 37, i32 32, i32 48, i32 42, i32 59, i32 7, i32 36, i32 34, i32 36, i32 39>			%shuffle = shufflevector <32 x i8> %a, <32 x i8> %b, <32 x i32> <i32 42, i32 45, i32 12, i32 13, i32 35, i32 35, i32 60, i32 40, i32 17, i32 22, i32 29, i32 44, i32 33, i32 12, i32 48, i32 51, i32 20, i32 19, i32 52, i32 19, i32 49, i32 54, i32 37, i32 32, i32 48, i32 42, i32 59, i32 7, i32 36, i32 34, i32 36, i32 39>
	ret <32 x i8> %shuffle			ret <32 x i8> %shuffle
	}			}

	define <32 x i8> @shuffle_v32i8_00_00_00_00_00_00_00_00_08_08_08_08_08_08_08_08_32_32_32_32_32_32_32_32_40_40_40_40_40_40_40_40(<32 x i8> %a, <32 x i8> %b) {			define <32 x i8> @shuffle_v32i8_00_00_00_00_00_00_00_00_08_08_08_08_08_08_08_08_32_32_32_32_32_32_32_32_40_40_40_40_40_40_40_40(<32 x i8> %a, <32 x i8> %b) {
	; AVX1-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_08_08_08_08_08_08_08_08_32_32_32_32_32_32_32_32_40_40_40_40_40_40_40_40:			; AVX1-LABEL: shuffle_v32i8_00_00_00_00_00_00_00_00_08_08_08_08_08_08_08_08_32_32_32_32_32_32_32_32_40_40_40_40_40_40_40_40:
	(727 lines not shown)

llvm/trunk/test/CodeGen/X86/vector-shuffle-256-v8.ll

	(1,043 lines not shown)
	; AVX2-LABEL: shuffle_v8i32_00040000:			; AVX2-LABEL: shuffle_v8i32_00040000:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,0,4,0,0,0,0]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,0,4,0,0,0,0]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_00040000:			; AVX512VL-LABEL: shuffle_v8i32_00040000:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [0,0,0,4,0,0,0,0]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,0,4,0,0,0,0]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 0, i32 4, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 0, i32 4, i32 0, i32 0, i32 0, i32 0>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_00500000(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_00500000(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_00500000:			; AVX1-LABEL: shuffle_v8i32_00500000:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vperm2f128 {{.*#+}} ymm1 = ymm0[2,3,0,1]			; AVX1-NEXT: vperm2f128 {{.*#+}} ymm1 = ymm0[2,3,0,1]
	; AVX1-NEXT: vblendps {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]			; AVX1-NEXT: vblendps {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,0,1,0,4,4,4,4]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,0,1,0,4,4,4,4]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_00500000:			; AVX2-LABEL: shuffle_v8i32_00500000:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,5,0,0,0,0,0]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,5,0,0,0,0,0]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_00500000:			; AVX512VL-LABEL: shuffle_v8i32_00500000:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [0,0,5,0,0,0,0,0]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,5,0,0,0,0,0]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 5, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 5, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_06000000(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_06000000(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_06000000:			; AVX1-LABEL: shuffle_v8i32_06000000:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vperm2f128 {{.*#+}} ymm1 = ymm0[2,3,0,1]			; AVX1-NEXT: vperm2f128 {{.*#+}} ymm1 = ymm0[2,3,0,1]
	; AVX1-NEXT: vblendpd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3]			; AVX1-NEXT: vblendpd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3]
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,2,0,0,4,4,4,4]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,2,0,0,4,4,4,4]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_06000000:			; AVX2-LABEL: shuffle_v8i32_06000000:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,6,0,0,0,0,0,0]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,6,0,0,0,0,0,0]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_06000000:			; AVX512VL-LABEL: shuffle_v8i32_06000000:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [0,6,0,0,0,0,0,0]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [0,6,0,0,0,0,0,0]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 6, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 6, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_70000000(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_70000000(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_70000000:			; AVX1-LABEL: shuffle_v8i32_70000000:
	(38 lines not shown)
	; AVX2-LABEL: shuffle_v8i32_00112233:			; AVX2-LABEL: shuffle_v8i32_00112233:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,1,1,2,2,3,3]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,1,1,2,2,3,3]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_00112233:			; AVX512VL-LABEL: shuffle_v8i32_00112233:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [0,0,1,1,2,2,3,3]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,1,1,2,2,3,3]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 1, i32 1, i32 2, i32 2, i32 3, i32 3>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 1, i32 1, i32 2, i32 2, i32 3, i32 3>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_00001111(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_00001111(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_00001111:			; AVX1-LABEL: shuffle_v8i32_00001111:
	(129 lines not shown)
	; AVX2-NEXT: vpermd %ymm1, %ymm2, %ymm1			; AVX2-NEXT: vpermd %ymm1, %ymm2, %ymm1
	; AVX2-NEXT: vpmovzxdq {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero			; AVX2-NEXT: vpmovzxdq {{.*#+}} ymm0 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0],ymm1[1],ymm0[2],ymm1[3],ymm0[4],ymm1[5],ymm0[6],ymm1[7]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_08192a3b:			; AVX512VL-LABEL: shuffle_v8i32_08192a3b:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpmovzxdq {{.*#+}} ymm2 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero			; AVX512VL-NEXT: vpmovzxdq {{.*#+}} ymm2 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm0 = [0,8,2,9,4,10,6,11]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm0 = [0,8,2,9,4,10,6,11]
	; AVX512VL-NEXT: vpermi2d %ymm1, %ymm2, %ymm0			; AVX512VL-NEXT: vpermi2d %ymm1, %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_08991abb(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_08991abb(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_08991abb:			; AVX1-LABEL: shuffle_v8i32_08991abb:
	(12 lines not shown)
	; AVX2-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero			; AVX2-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero
	; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,1,1,3]			; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,1,1,3]
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3],ymm0[4],ymm1[5,6,7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3],ymm0[4],ymm1[5,6,7]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_08991abb:			; AVX512VL-LABEL: shuffle_v8i32_08991abb:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vpmovzxdq {{.*#+}} xmm2 = xmm0[0],zero,xmm0[1],zero			; AVX512VL-NEXT: vpmovzxdq {{.*#+}} xmm2 = xmm0[0],zero,xmm0[1],zero
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm0 = [8,0,1,1,10,2,3,3]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm0 = [8,0,1,1,10,2,3,3]
	; AVX512VL-NEXT: vpermi2d %ymm2, %ymm1, %ymm0			; AVX512VL-NEXT: vpermi2d %ymm2, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 8, i32 9, i32 9, i32 1, i32 10, i32 11, i32 11>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 8, i32 9, i32 9, i32 1, i32 10, i32 11, i32 11>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_091b2d3f(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_091b2d3f(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_091b2d3f:			; AVX1-LABEL: shuffle_v8i32_091b2d3f:
	(222 lines not shown)
	; AVX2-LABEL: shuffle_v8i32_00015444:			; AVX2-LABEL: shuffle_v8i32_00015444:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,0,1,5,4,4,4]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,0,1,5,4,4,4]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_00015444:			; AVX512VL-LABEL: shuffle_v8i32_00015444:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [0,0,0,1,5,4,4,4]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,0,1,5,4,4,4]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 0, i32 1, i32 5, i32 4, i32 4, i32 4>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 0, i32 1, i32 5, i32 4, i32 4, i32 4>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_00204644(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_00204644(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_00204644:			; AVX1-LABEL: shuffle_v8i32_00204644:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,0,2,0,4,6,4,4]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,0,2,0,4,6,4,4]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_00204644:			; AVX2-LABEL: shuffle_v8i32_00204644:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,2,0,4,6,4,4]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,2,0,4,6,4,4]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_00204644:			; AVX512VL-LABEL: shuffle_v8i32_00204644:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [0,0,2,0,4,6,4,4]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,2,0,4,6,4,4]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 2, i32 0, i32 4, i32 6, i32 4, i32 4>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 2, i32 0, i32 4, i32 6, i32 4, i32 4>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_03004474(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_03004474(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_03004474:			; AVX1-LABEL: shuffle_v8i32_03004474:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,3,0,0,4,4,7,4]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,3,0,0,4,4,7,4]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_03004474:			; AVX2-LABEL: shuffle_v8i32_03004474:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,3,0,0,4,4,7,4]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,3,0,0,4,4,7,4]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_03004474:			; AVX512VL-LABEL: shuffle_v8i32_03004474:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [0,3,0,0,4,4,7,4]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [0,3,0,0,4,4,7,4]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 3, i32 0, i32 0, i32 4, i32 4, i32 7, i32 4>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 3, i32 0, i32 0, i32 4, i32 4, i32 7, i32 4>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_10004444(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_10004444(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_10004444:			; AVX1-LABEL: shuffle_v8i32_10004444:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[1,0,0,0,4,4,4,4]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[1,0,0,0,4,4,4,4]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_10004444:			; AVX2-LABEL: shuffle_v8i32_10004444:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [1,0,0,0,4,4,4,4]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [1,0,0,0,4,4,4,4]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_10004444:			; AVX512VL-LABEL: shuffle_v8i32_10004444:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [1,0,0,0,4,4,4,4]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [1,0,0,0,4,4,4,4]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 0, i32 0, i32 0, i32 4, i32 4, i32 4, i32 4>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 0, i32 0, i32 0, i32 4, i32 4, i32 4, i32 4>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_22006446(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_22006446(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_22006446:			; AVX1-LABEL: shuffle_v8i32_22006446:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[2,2,0,0,6,4,4,6]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[2,2,0,0,6,4,4,6]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_22006446:			; AVX2-LABEL: shuffle_v8i32_22006446:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [2,2,0,0,6,4,4,6]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [2,2,0,0,6,4,4,6]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_22006446:			; AVX512VL-LABEL: shuffle_v8i32_22006446:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [2,2,0,0,6,4,4,6]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [2,2,0,0,6,4,4,6]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 2, i32 2, i32 0, i32 0, i32 6, i32 4, i32 4, i32 6>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 2, i32 2, i32 0, i32 0, i32 6, i32 4, i32 4, i32 6>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_33307474(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_33307474(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_33307474:			; AVX1-LABEL: shuffle_v8i32_33307474:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[3,3,3,0,7,4,7,4]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[3,3,3,0,7,4,7,4]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_33307474:			; AVX2-LABEL: shuffle_v8i32_33307474:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [3,3,3,0,7,4,7,4]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [3,3,3,0,7,4,7,4]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_33307474:			; AVX512VL-LABEL: shuffle_v8i32_33307474:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [3,3,3,0,7,4,7,4]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [3,3,3,0,7,4,7,4]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 3, i32 3, i32 3, i32 0, i32 7, i32 4, i32 7, i32 4>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 3, i32 3, i32 3, i32 0, i32 7, i32 4, i32 7, i32 4>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_32104567(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_32104567(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_32104567:			; AVX1-LABEL: shuffle_v8i32_32104567:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[3,2,1,0,4,5,6,7]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[3,2,1,0,4,5,6,7]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_32104567:			; AVX2-LABEL: shuffle_v8i32_32104567:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [3,2,1,0,4,5,6,7]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [3,2,1,0,4,5,6,7]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_32104567:			; AVX512VL-LABEL: shuffle_v8i32_32104567:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [3,2,1,0,4,5,6,7]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [3,2,1,0,4,5,6,7]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 3, i32 2, i32 1, i32 0, i32 4, i32 5, i32 6, i32 7>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 3, i32 2, i32 1, i32 0, i32 4, i32 5, i32 6, i32 7>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_00236744(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_00236744(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_00236744:			; AVX1-LABEL: shuffle_v8i32_00236744:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,0,2,3,6,7,4,4]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,0,2,3,6,7,4,4]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_00236744:			; AVX2-LABEL: shuffle_v8i32_00236744:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,2,3,6,7,4,4]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,2,3,6,7,4,4]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_00236744:			; AVX512VL-LABEL: shuffle_v8i32_00236744:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [0,0,2,3,6,7,4,4]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,2,3,6,7,4,4]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 2, i32 3, i32 6, i32 7, i32 4, i32 4>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 2, i32 3, i32 6, i32 7, i32 4, i32 4>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_00226644(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_00226644(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_00226644:			; AVX1-LABEL: shuffle_v8i32_00226644:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,0,2,2,6,6,4,4]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,0,2,2,6,6,4,4]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_00226644:			; AVX2-LABEL: shuffle_v8i32_00226644:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,2,2,6,6,4,4]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,2,2,6,6,4,4]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_00226644:			; AVX512VL-LABEL: shuffle_v8i32_00226644:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [0,0,2,2,6,6,4,4]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [0,0,2,2,6,6,4,4]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 2, i32 2, i32 6, i32 6, i32 4, i32 4>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 2, i32 2, i32 6, i32 6, i32 4, i32 4>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_10324567(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_10324567(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_10324567:			; AVX1-LABEL: shuffle_v8i32_10324567:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[1,0,3,2,4,5,6,7]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[1,0,3,2,4,5,6,7]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_10324567:			; AVX2-LABEL: shuffle_v8i32_10324567:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [1,0,3,2,4,5,6,7]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [1,0,3,2,4,5,6,7]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_10324567:			; AVX512VL-LABEL: shuffle_v8i32_10324567:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [1,0,3,2,4,5,6,7]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [1,0,3,2,4,5,6,7]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 0, i32 3, i32 2, i32 4, i32 5, i32 6, i32 7>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 0, i32 3, i32 2, i32 4, i32 5, i32 6, i32 7>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_11334567(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_11334567(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_11334567:			; AVX1-LABEL: shuffle_v8i32_11334567:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[1,1,3,3,4,5,6,7]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[1,1,3,3,4,5,6,7]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_11334567:			; AVX2-LABEL: shuffle_v8i32_11334567:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [1,1,3,3,4,5,6,7]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [1,1,3,3,4,5,6,7]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_11334567:			; AVX512VL-LABEL: shuffle_v8i32_11334567:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [1,1,3,3,4,5,6,7]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [1,1,3,3,4,5,6,7]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 1, i32 3, i32 3, i32 4, i32 5, i32 6, i32 7>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 1, i32 3, i32 3, i32 4, i32 5, i32 6, i32 7>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_01235467(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_01235467(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_01235467:			; AVX1-LABEL: shuffle_v8i32_01235467:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,1,2,3,5,4,6,7]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,1,2,3,5,4,6,7]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_01235467:			; AVX2-LABEL: shuffle_v8i32_01235467:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,1,2,3,5,4,6,7]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,1,2,3,5,4,6,7]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_01235467:			; AVX512VL-LABEL: shuffle_v8i32_01235467:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [0,1,2,3,5,4,6,7]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [0,1,2,3,5,4,6,7]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 5, i32 4, i32 6, i32 7>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 5, i32 4, i32 6, i32 7>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_01235466(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_01235466(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_01235466:			; AVX1-LABEL: shuffle_v8i32_01235466:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,1,2,3,5,4,6,6]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,1,2,3,5,4,6,6]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_01235466:			; AVX2-LABEL: shuffle_v8i32_01235466:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,1,2,3,5,4,6,6]			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = [0,1,2,3,5,4,6,6]
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_01235466:			; AVX512VL-LABEL: shuffle_v8i32_01235466:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = [0,1,2,3,5,4,6,6]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = [0,1,2,3,5,4,6,6]
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 5, i32 4, i32 6, i32 6>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 5, i32 4, i32 6, i32 6>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_002u6u44(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_002u6u44(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_002u6u44:			; AVX1-LABEL: shuffle_v8i32_002u6u44:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,0,2,u,6,u,4,4]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,0,2,u,6,u,4,4]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_002u6u44:			; AVX2-LABEL: shuffle_v8i32_002u6u44:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = <0,0,2,u,6,u,4,4>			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = <0,0,2,u,6,u,4,4>
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_002u6u44:			; AVX512VL-LABEL: shuffle_v8i32_002u6u44:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = <0,0,2,u,6,u,4,4>			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = <0,0,2,u,6,u,4,4>
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 2, i32 undef, i32 6, i32 undef, i32 4, i32 4>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 2, i32 undef, i32 6, i32 undef, i32 4, i32 4>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_00uu66uu(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_00uu66uu(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_00uu66uu:			; AVX1-LABEL: shuffle_v8i32_00uu66uu:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,0,u,u,6,6,u,u]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,0,u,u,6,6,u,u]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_00uu66uu:			; AVX2-LABEL: shuffle_v8i32_00uu66uu:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = <0,0,u,u,6,6,u,u>			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = <0,0,u,u,6,6,u,u>
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_00uu66uu:			; AVX512VL-LABEL: shuffle_v8i32_00uu66uu:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = <0,0,u,u,6,6,u,u>			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = <0,0,u,u,6,6,u,u>
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 undef, i32 undef, i32 6, i32 6, i32 undef, i32 undef>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 0, i32 undef, i32 undef, i32 6, i32 6, i32 undef, i32 undef>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_103245uu(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_103245uu(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_103245uu:			; AVX1-LABEL: shuffle_v8i32_103245uu:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[1,0,3,2,4,5,u,u]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[1,0,3,2,4,5,u,u]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_103245uu:			; AVX2-LABEL: shuffle_v8i32_103245uu:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = <1,0,3,2,4,5,u,u>			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = <1,0,3,2,4,5,u,u>
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_103245uu:			; AVX512VL-LABEL: shuffle_v8i32_103245uu:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = <1,0,3,2,4,5,u,u>			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = <1,0,3,2,4,5,u,u>
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 0, i32 3, i32 2, i32 4, i32 5, i32 undef, i32 undef>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 0, i32 3, i32 2, i32 4, i32 5, i32 undef, i32 undef>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_1133uu67(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_1133uu67(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_1133uu67:			; AVX1-LABEL: shuffle_v8i32_1133uu67:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[1,1,3,3,u,u,6,7]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[1,1,3,3,u,u,6,7]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_1133uu67:			; AVX2-LABEL: shuffle_v8i32_1133uu67:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = <1,1,3,3,u,u,6,7>			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = <1,1,3,3,u,u,6,7>
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_1133uu67:			; AVX512VL-LABEL: shuffle_v8i32_1133uu67:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = <1,1,3,3,u,u,6,7>			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = <1,1,3,3,u,u,6,7>
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 1, i32 3, i32 3, i32 undef, i32 undef, i32 6, i32 7>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 1, i32 3, i32 3, i32 undef, i32 undef, i32 6, i32 7>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_0uu354uu(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_0uu354uu(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_0uu354uu:			; AVX1-LABEL: shuffle_v8i32_0uu354uu:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,u,u,3,5,4,u,u]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[0,u,u,3,5,4,u,u]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_0uu354uu:			; AVX2-LABEL: shuffle_v8i32_0uu354uu:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = <0,u,u,3,5,4,u,u>			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = <0,u,u,3,5,4,u,u>
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_0uu354uu:			; AVX512VL-LABEL: shuffle_v8i32_0uu354uu:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = <0,u,u,3,5,4,u,u>			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = <0,u,u,3,5,4,u,u>
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 undef, i32 undef, i32 3, i32 5, i32 4, i32 undef, i32 undef>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 undef, i32 undef, i32 3, i32 5, i32 4, i32 undef, i32 undef>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_uuu3uu66(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_uuu3uu66(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_uuu3uu66:			; AVX1-LABEL: shuffle_v8i32_uuu3uu66:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[u,u,u,3,u,u,6,6]			; AVX1-NEXT: vpermilps {{.*#+}} ymm0 = ymm0[u,u,u,3,u,u,6,6]
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: shuffle_v8i32_uuu3uu66:			; AVX2-LABEL: shuffle_v8i32_uuu3uu66:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = <u,u,u,3,u,u,6,6>			; AVX2-NEXT: vmovdqa {{.*#+}} ymm1 = <u,u,u,3,u,u,6,6>
	; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_uuu3uu66:			; AVX512VL-LABEL: shuffle_v8i32_uuu3uu66:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm1 = <u,u,u,3,u,u,6,6>			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm1 = <u,u,u,3,u,u,6,6>
	; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0			; AVX512VL-NEXT: vpermd %ymm0, %ymm1, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 undef, i32 undef, i32 undef, i32 3, i32 undef, i32 undef, i32 6, i32 6>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 undef, i32 undef, i32 undef, i32 3, i32 undef, i32 undef, i32 6, i32 6>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_6caa87e5(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_6caa87e5(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_6caa87e5:			; AVX1-LABEL: shuffle_v8i32_6caa87e5:
	; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[3,1,3,2]			; AVX2-NEXT: vpermq {{.*#+}} ymm0 = ymm0[3,1,3,2]
	; AVX2-NEXT: vpshufd {{.*#+}} ymm1 = ymm1[0,0,2,2,4,4,6,6]			; AVX2-NEXT: vpshufd {{.*#+}} ymm1 = ymm1[0,0,2,2,4,4,6,6]
	; AVX2-NEXT: vpermq {{.*#+}} ymm1 = ymm1[2,1,0,3]			; AVX2-NEXT: vpermq {{.*#+}} ymm1 = ymm1[2,1,0,3]
	; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3,4],ymm0[5],ymm1[6],ymm0[7]			; AVX2-NEXT: vpblendd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3,4],ymm0[5],ymm1[6],ymm0[7]
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512VL-LABEL: shuffle_v8i32_6caa87e5:			; AVX512VL-LABEL: shuffle_v8i32_6caa87e5:
	; AVX512VL: # BB#0:			; AVX512VL: # BB#0:
	; AVX512VL-NEXT: vmovdqa32 {{.*#+}} ymm2 = [14,4,2,2,0,15,6,13]			; AVX512VL-NEXT: vmovdqa {{.*#+}} ymm2 = [14,4,2,2,0,15,6,13]
	; AVX512VL-NEXT: vpermi2d %ymm0, %ymm1, %ymm2			; AVX512VL-NEXT: vpermi2d %ymm0, %ymm1, %ymm2
	; AVX512VL-NEXT: vmovdqa64 %ymm2, %ymm0			; AVX512VL-NEXT: vmovdqa %ymm2, %ymm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 6, i32 12, i32 10, i32 10, i32 8, i32 7, i32 14, i32 5>			%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 6, i32 12, i32 10, i32 10, i32 8, i32 7, i32 14, i32 5>
	ret <8 x i32> %shuffle			ret <8 x i32> %shuffle
	}			}

	define <8 x i32> @shuffle_v8i32_32103210(<8 x i32> %a, <8 x i32> %b) {			define <8 x i32> @shuffle_v8i32_32103210(<8 x i32> %a, <8 x i32> %b) {
	; AVX1-LABEL: shuffle_v8i32_32103210:			; AVX1-LABEL: shuffle_v8i32_32103210:
	; AVX1: # BB#0:			; AVX1: # BB#0:

llvm/trunk/test/CodeGen/X86/vector-shuffle-combining-avx512bwvl.ll

; X64-NEXT: retq
%res0 = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> <i16 15, i16 14, i16 13, i16 12, i16 11, i16 10, i16 9, i16 8, i16 7, i16 6, i16 5, i16 4, i16 3, i16 2, i16 1, i16 0>, <16 x i16> %x0, <16 x i16> %x1, i16 -1)		%res0 = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> <i16 15, i16 14, i16 13, i16 12, i16 11, i16 10, i16 9, i16 8, i16 7, i16 6, i16 5, i16 4, i16 3, i16 2, i16 1, i16 0>, <16 x i16> %x0, <16 x i16> %x1, i16 -1)
%res1 = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> <i16 15, i16 30, i16 13, i16 28, i16 11, i16 26, i16 9, i16 24, i16 7, i16 22, i16 5, i16 20, i16 3, i16 18, i16 1, i16 16>, <16 x i16> %res0, <16 x i16> %res0, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> <i16 15, i16 30, i16 13, i16 28, i16 11, i16 26, i16 9, i16 24, i16 7, i16 22, i16 5, i16 20, i16 3, i16 18, i16 1, i16 16>, <16 x i16> %res0, <16 x i16> %res0, i16 -1)
ret <16 x i16> %res1		ret <16 x i16> %res1
}		}
define <16 x i16> @combine_vpermt2var_16i16_identity_mask(<16 x i16> %x0, <16 x i16> %x1, i16 %m) {		define <16 x i16> @combine_vpermt2var_16i16_identity_mask(<16 x i16> %x0, <16 x i16> %x1, i16 %m) {
; X32-LABEL: combine_vpermt2var_16i16_identity_mask:		; X32-LABEL: combine_vpermt2var_16i16_identity_mask:
; X32: # BB#0:		; X32: # BB#0:
; X32-NEXT: kmovw {{[0-9]+}}(%esp), %k1		; X32-NEXT: kmovw {{[0-9]+}}(%esp), %k1
; X32-NEXT: vmovdqu16 {{.*#+}} ymm2 = [15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0]		; X32-NEXT: vmovdqu {{.*#+}} ymm2 = [15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0]
; X32-NEXT: vpermi2w %ymm1, %ymm0, %ymm2 {%k1} {z}		; X32-NEXT: vpermi2w %ymm1, %ymm0, %ymm2 {%k1} {z}
; X32-NEXT: vmovdqu16 {{.*#+}} ymm0 = [15,30,13,28,11,26,9,24,7,22,5,20,3,18,1,16]		; X32-NEXT: vmovdqu {{.*#+}} ymm0 = [15,30,13,28,11,26,9,24,7,22,5,20,3,18,1,16]
; X32-NEXT: vpermi2w %ymm2, %ymm2, %ymm0 {%k1} {z}		; X32-NEXT: vpermi2w %ymm2, %ymm2, %ymm0 {%k1} {z}
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: combine_vpermt2var_16i16_identity_mask:		; X64-LABEL: combine_vpermt2var_16i16_identity_mask:
; X64: # BB#0:		; X64: # BB#0:
; X64-NEXT: kmovw %edi, %k1		; X64-NEXT: kmovw %edi, %k1
; X64-NEXT: vmovdqu16 {{.*#+}} ymm2 = [15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0]		; X64-NEXT: vmovdqu {{.*#+}} ymm2 = [15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0]
; X64-NEXT: vpermi2w %ymm1, %ymm0, %ymm2 {%k1} {z}		; X64-NEXT: vpermi2w %ymm1, %ymm0, %ymm2 {%k1} {z}
; X64-NEXT: vmovdqu16 {{.*#+}} ymm0 = [15,30,13,28,11,26,9,24,7,22,5,20,3,18,1,16]		; X64-NEXT: vmovdqu {{.*#+}} ymm0 = [15,30,13,28,11,26,9,24,7,22,5,20,3,18,1,16]
; X64-NEXT: vpermi2w %ymm2, %ymm2, %ymm0 {%k1} {z}		; X64-NEXT: vpermi2w %ymm2, %ymm2, %ymm0 {%k1} {z}
; X64-NEXT: retq		; X64-NEXT: retq
%res0 = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> <i16 15, i16 14, i16 13, i16 12, i16 11, i16 10, i16 9, i16 8, i16 7, i16 6, i16 5, i16 4, i16 3, i16 2, i16 1, i16 0>, <16 x i16> %x0, <16 x i16> %x1, i16 %m)		%res0 = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> <i16 15, i16 14, i16 13, i16 12, i16 11, i16 10, i16 9, i16 8, i16 7, i16 6, i16 5, i16 4, i16 3, i16 2, i16 1, i16 0>, <16 x i16> %x0, <16 x i16> %x1, i16 %m)
%res1 = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> <i16 15, i16 30, i16 13, i16 28, i16 11, i16 26, i16 9, i16 24, i16 7, i16 22, i16 5, i16 20, i16 3, i16 18, i16 1, i16 16>, <16 x i16> %res0, <16 x i16> %res0, i16 %m)		%res1 = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> <i16 15, i16 30, i16 13, i16 28, i16 11, i16 26, i16 9, i16 24, i16 7, i16 22, i16 5, i16 20, i16 3, i16 18, i16 1, i16 16>, <16 x i16> %res0, <16 x i16> %res0, i16 %m)
ret <16 x i16> %res1		ret <16 x i16> %res1
}		}

define <16 x i16> @combine_vpermi2var_16i16_as_permw(<16 x i16> %x0, <16 x i16> %x1) {		define <16 x i16> @combine_vpermi2var_16i16_as_permw(<16 x i16> %x0, <16 x i16> %x1) {
; X32-LABEL: combine_vpermi2var_16i16_as_permw:		; X32-LABEL: combine_vpermi2var_16i16_as_permw:
; X32: # BB#0:		; X32: # BB#0:
; X32-NEXT: vmovdqu16 {{.*#+}} ymm1 = [15,0,14,1,13,2,12,3,11,4,10,5,9,6,8,7]		; X32-NEXT: vmovdqu {{.*#+}} ymm1 = [15,0,14,1,13,2,12,3,11,4,10,5,9,6,8,7]
; X32-NEXT: vpermw %ymm0, %ymm1, %ymm0		; X32-NEXT: vpermw %ymm0, %ymm1, %ymm0
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: combine_vpermi2var_16i16_as_permw:		; X64-LABEL: combine_vpermi2var_16i16_as_permw:
; X64: # BB#0:		; X64: # BB#0:
; X64-NEXT: vmovdqu16 {{.*#+}} ymm1 = [15,0,14,1,13,2,12,3,11,4,10,5,9,6,8,7]		; X64-NEXT: vmovdqu {{.*#+}} ymm1 = [15,0,14,1,13,2,12,3,11,4,10,5,9,6,8,7]
; X64-NEXT: vpermw %ymm0, %ymm1, %ymm0		; X64-NEXT: vpermw %ymm0, %ymm1, %ymm0
; X64-NEXT: retq		; X64-NEXT: retq
%res0 = call <16 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.256(<16 x i16> %x0, <16 x i16> <i16 15, i16 14, i16 13, i16 12, i16 11, i16 10, i16 9, i16 8, i16 7, i16 6, i16 5, i16 4, i16 3, i16 2, i16 1, i16 0>, <16 x i16> %x1, i16 -1)		%res0 = call <16 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.256(<16 x i16> %x0, <16 x i16> <i16 15, i16 14, i16 13, i16 12, i16 11, i16 10, i16 9, i16 8, i16 7, i16 6, i16 5, i16 4, i16 3, i16 2, i16 1, i16 0>, <16 x i16> %x1, i16 -1)
%res1 = call <16 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.256(<16 x i16> %res0, <16 x i16> <i16 0, i16 15, i16 1, i16 14, i16 2, i16 13, i16 3, i16 12, i16 4, i16 11, i16 5, i16 10, i16 6, i16 9, i16 7, i16 8>, <16 x i16> %res0, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.256(<16 x i16> %res0, <16 x i16> <i16 0, i16 15, i16 1, i16 14, i16 2, i16 13, i16 3, i16 12, i16 4, i16 11, i16 5, i16 10, i16 6, i16 9, i16 7, i16 8>, <16 x i16> %res0, i16 -1)
ret <16 x i16> %res1		ret <16 x i16> %res1
}		}

define <16 x i16> @combine_vpermt2var_vpermi2var_16i16_as_vperm2(<16 x i16> %x0, <16 x i16> %x1) {		define <16 x i16> @combine_vpermt2var_vpermi2var_16i16_as_vperm2(<16 x i16> %x0, <16 x i16> %x1) {
; X32-LABEL: combine_vpermt2var_vpermi2var_16i16_as_vperm2:		; X32-LABEL: combine_vpermt2var_vpermi2var_16i16_as_vperm2:
; X32: # BB#0:		; X32: # BB#0:
; X32-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,31,2,2,4,29,6,27,8,25,10,23,12,21,14,19]		; X32-NEXT: vmovdqu {{.*#+}} ymm2 = [0,31,2,2,4,29,6,27,8,25,10,23,12,21,14,19]
; X32-NEXT: vpermt2w %ymm1, %ymm2, %ymm0		; X32-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: combine_vpermt2var_vpermi2var_16i16_as_vperm2:		; X64-LABEL: combine_vpermt2var_vpermi2var_16i16_as_vperm2:
; X64: # BB#0:		; X64: # BB#0:
; X64-NEXT: vmovdqu16 {{.*#+}} ymm2 = [0,31,2,2,4,29,6,27,8,25,10,23,12,21,14,19]		; X64-NEXT: vmovdqu {{.*#+}} ymm2 = [0,31,2,2,4,29,6,27,8,25,10,23,12,21,14,19]
; X64-NEXT: vpermt2w %ymm1, %ymm2, %ymm0		; X64-NEXT: vpermt2w %ymm1, %ymm2, %ymm0
; X64-NEXT: retq		; X64-NEXT: retq
%res0 = call <16 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.256(<16 x i16> %x0, <16 x i16> <i16 0, i16 31, i16 2, i16 29, i16 4, i16 27, i16 6, i16 25, i16 8, i16 23, i16 10, i16 21, i16 12, i16 19, i16 14, i16 17>, <16 x i16> %x1, i16 -1)		%res0 = call <16 x i16> @llvm.x86.avx512.mask.vpermi2var.hi.256(<16 x i16> %x0, <16 x i16> <i16 0, i16 31, i16 2, i16 29, i16 4, i16 27, i16 6, i16 25, i16 8, i16 23, i16 10, i16 21, i16 12, i16 19, i16 14, i16 17>, <16 x i16> %x1, i16 -1)
%res1 = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> <i16 0, i16 17, i16 2, i16 18, i16 4, i16 19, i16 6, i16 21, i16 8, i16 23, i16 10, i16 25, i16 12, i16 27, i16 14, i16 29>, <16 x i16> %res0, <16 x i16> %res0, i16 -1)		%res1 = call <16 x i16> @llvm.x86.avx512.maskz.vpermt2var.hi.256(<16 x i16> <i16 0, i16 17, i16 2, i16 18, i16 4, i16 19, i16 6, i16 21, i16 8, i16 23, i16 10, i16 25, i16 12, i16 27, i16 14, i16 29>, <16 x i16> %res0, <16 x i16> %res0, i16 -1)
ret <16 x i16> %res1		ret <16 x i16> %res1
}		}

define <16 x i16> @combine_vpermt2var_vpermi2var_16i16_as_unpckhwd(<16 x i16> %a0, <16 x i16> %a1) {		define <16 x i16> @combine_vpermt2var_vpermi2var_16i16_as_unpckhwd(<16 x i16> %a0, <16 x i16> %a1) {

llvm/trunk/test/CodeGen/X86/vector-shuffle-combining-avx512vbmi.ll

; X64-NEXT: retq
%res0 = call <16 x i8> @llvm.x86.avx512.maskz.vpermt2var.qi.128(<16 x i8> <i8 15, i8 14, i8 13, i8 12, i8 11, i8 10, i8 9, i8 8, i8 7, i8 6, i8 5, i8 4, i8 3, i8 2, i8 1, i8 0>, <16 x i8> %x0, <16 x i8> %x1, i16 -1)		%res0 = call <16 x i8> @llvm.x86.avx512.maskz.vpermt2var.qi.128(<16 x i8> <i8 15, i8 14, i8 13, i8 12, i8 11, i8 10, i8 9, i8 8, i8 7, i8 6, i8 5, i8 4, i8 3, i8 2, i8 1, i8 0>, <16 x i8> %x0, <16 x i8> %x1, i16 -1)
%res1 = call <16 x i8> @llvm.x86.avx512.maskz.vpermt2var.qi.128(<16 x i8> <i8 15, i8 30, i8 13, i8 28, i8 11, i8 26, i8 9, i8 24, i8 7, i8 22, i8 5, i8 20, i8 3, i8 18, i8 1, i8 16>, <16 x i8> %res0, <16 x i8> %res0, i16 -1)		%res1 = call <16 x i8> @llvm.x86.avx512.maskz.vpermt2var.qi.128(<16 x i8> <i8 15, i8 30, i8 13, i8 28, i8 11, i8 26, i8 9, i8 24, i8 7, i8 22, i8 5, i8 20, i8 3, i8 18, i8 1, i8 16>, <16 x i8> %res0, <16 x i8> %res0, i16 -1)
ret <16 x i8> %res1		ret <16 x i8> %res1
}		}
define <16 x i8> @combine_vpermt2var_16i8_identity_mask(<16 x i8> %x0, <16 x i8> %x1, i16 %m) {		define <16 x i8> @combine_vpermt2var_16i8_identity_mask(<16 x i8> %x0, <16 x i8> %x1, i16 %m) {
; X32-LABEL: combine_vpermt2var_16i8_identity_mask:		; X32-LABEL: combine_vpermt2var_16i8_identity_mask:
; X32: # BB#0:		; X32: # BB#0:
; X32-NEXT: kmovw {{[0-9]+}}(%esp), %k1		; X32-NEXT: kmovw {{[0-9]+}}(%esp), %k1
; X32-NEXT: vmovdqu8 {{.*#+}} xmm2 = [15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0]		; X32-NEXT: vmovdqu {{.*#+}} xmm2 = [15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0]
; X32-NEXT: vpermi2b %xmm1, %xmm0, %xmm2 {%k1} {z}		; X32-NEXT: vpermi2b %xmm1, %xmm0, %xmm2 {%k1} {z}
; X32-NEXT: vmovdqu8 {{.*#+}} xmm0 = [15,30,13,28,11,26,9,24,7,22,5,20,3,18,1,16]		; X32-NEXT: vmovdqu {{.*#+}} xmm0 = [15,30,13,28,11,26,9,24,7,22,5,20,3,18,1,16]
; X32-NEXT: vpermi2b %xmm2, %xmm2, %xmm0 {%k1} {z}		; X32-NEXT: vpermi2b %xmm2, %xmm2, %xmm0 {%k1} {z}
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: combine_vpermt2var_16i8_identity_mask:		; X64-LABEL: combine_vpermt2var_16i8_identity_mask:
; X64: # BB#0:		; X64: # BB#0:
; X64-NEXT: kmovw %edi, %k1		; X64-NEXT: kmovw %edi, %k1
; X64-NEXT: vmovdqu8 {{.*#+}} xmm2 = [15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0]		; X64-NEXT: vmovdqu {{.*#+}} xmm2 = [15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0]
; X64-NEXT: vpermi2b %xmm1, %xmm0, %xmm2 {%k1} {z}		; X64-NEXT: vpermi2b %xmm1, %xmm0, %xmm2 {%k1} {z}
; X64-NEXT: vmovdqu8 {{.*#+}} xmm0 = [15,30,13,28,11,26,9,24,7,22,5,20,3,18,1,16]		; X64-NEXT: vmovdqu {{.*#+}} xmm0 = [15,30,13,28,11,26,9,24,7,22,5,20,3,18,1,16]
; X64-NEXT: vpermi2b %xmm2, %xmm2, %xmm0 {%k1} {z}		; X64-NEXT: vpermi2b %xmm2, %xmm2, %xmm0 {%k1} {z}
; X64-NEXT: retq		; X64-NEXT: retq
%res0 = call <16 x i8> @llvm.x86.avx512.maskz.vpermt2var.qi.128(<16 x i8> <i8 15, i8 14, i8 13, i8 12, i8 11, i8 10, i8 9, i8 8, i8 7, i8 6, i8 5, i8 4, i8 3, i8 2, i8 1, i8 0>, <16 x i8> %x0, <16 x i8> %x1, i16 %m)		%res0 = call <16 x i8> @llvm.x86.avx512.maskz.vpermt2var.qi.128(<16 x i8> <i8 15, i8 14, i8 13, i8 12, i8 11, i8 10, i8 9, i8 8, i8 7, i8 6, i8 5, i8 4, i8 3, i8 2, i8 1, i8 0>, <16 x i8> %x0, <16 x i8> %x1, i16 %m)
%res1 = call <16 x i8> @llvm.x86.avx512.maskz.vpermt2var.qi.128(<16 x i8> <i8 15, i8 30, i8 13, i8 28, i8 11, i8 26, i8 9, i8 24, i8 7, i8 22, i8 5, i8 20, i8 3, i8 18, i8 1, i8 16>, <16 x i8> %res0, <16 x i8> %res0, i16 %m)		%res1 = call <16 x i8> @llvm.x86.avx512.maskz.vpermt2var.qi.128(<16 x i8> <i8 15, i8 30, i8 13, i8 28, i8 11, i8 26, i8 9, i8 24, i8 7, i8 22, i8 5, i8 20, i8 3, i8 18, i8 1, i8 16>, <16 x i8> %res0, <16 x i8> %res0, i16 %m)
ret <16 x i8> %res1		ret <16 x i8> %res1
}		}

define <16 x i8> @combine_vpermi2var_16i8_as_vpshufb(<16 x i8> %x0, <16 x i8> %x1) {		define <16 x i8> @combine_vpermi2var_16i8_as_vpshufb(<16 x i8> %x0, <16 x i8> %x1) {
; X32-LABEL: combine_vpermi2var_16i8_as_vpshufb:		; X32-LABEL: combine_vpermi2var_16i8_as_vpshufb:
; X32: # BB#0:		; X32: # BB#0:
; X32-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[15,0,14,1,13,2,12,3,11,4,10,5,9,6,8,7]		; X32-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[15,0,14,1,13,2,12,3,11,4,10,5,9,6,8,7]
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: combine_vpermi2var_16i8_as_vpshufb:		; X64-LABEL: combine_vpermi2var_16i8_as_vpshufb:
; X64: # BB#0:		; X64: # BB#0:
; X64-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[15,0,14,1,13,2,12,3,11,4,10,5,9,6,8,7]		; X64-NEXT: vpshufb {{.*#+}} xmm0 = xmm0[15,0,14,1,13,2,12,3,11,4,10,5,9,6,8,7]
; X64-NEXT: retq		; X64-NEXT: retq
%res0 = call <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8> %x0, <16 x i8> <i8 15, i8 14, i8 13, i8 12, i8 11, i8 10, i8 9, i8 8, i8 7, i8 6, i8 5, i8 4, i8 3, i8 2, i8 1, i8 0>, <16 x i8> %x1, i16 -1)		%res0 = call <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8> %x0, <16 x i8> <i8 15, i8 14, i8 13, i8 12, i8 11, i8 10, i8 9, i8 8, i8 7, i8 6, i8 5, i8 4, i8 3, i8 2, i8 1, i8 0>, <16 x i8> %x1, i16 -1)
%res1 = call <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8> %res0, <16 x i8> <i8 0, i8 15, i8 1, i8 14, i8 2, i8 13, i8 3, i8 12, i8 4, i8 11, i8 5, i8 10, i8 6, i8 9, i8 7, i8 8>, <16 x i8> %res0, i16 -1)		%res1 = call <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8> %res0, <16 x i8> <i8 0, i8 15, i8 1, i8 14, i8 2, i8 13, i8 3, i8 12, i8 4, i8 11, i8 5, i8 10, i8 6, i8 9, i8 7, i8 8>, <16 x i8> %res0, i16 -1)
ret <16 x i8> %res1		ret <16 x i8> %res1
}		}
define <32 x i8> @combine_vpermi2var_32i8_as_vpermb(<32 x i8> %x0, <32 x i8> %x1) {		define <32 x i8> @combine_vpermi2var_32i8_as_vpermb(<32 x i8> %x0, <32 x i8> %x1) {
; X32-LABEL: combine_vpermi2var_32i8_as_vpermb:		; X32-LABEL: combine_vpermi2var_32i8_as_vpermb:
; X32: # BB#0:		; X32: # BB#0:
; X32-NEXT: vmovdqu8 {{.*#+}} ymm1 = [0,0,1,23,2,22,3,21,4,22,5,21,6,20,7,19,0,0,1,23,2,22,3,21,4,22,5,21,6,20,7,19]		; X32-NEXT: vmovdqu {{.*#+}} ymm1 = [0,0,1,23,2,22,3,21,4,22,5,21,6,20,7,19,0,0,1,23,2,22,3,21,4,22,5,21,6,20,7,19]
; X32-NEXT: vpermb %ymm0, %ymm1, %ymm0		; X32-NEXT: vpermb %ymm0, %ymm1, %ymm0
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: combine_vpermi2var_32i8_as_vpermb:		; X64-LABEL: combine_vpermi2var_32i8_as_vpermb:
; X64: # BB#0:		; X64: # BB#0:
; X64-NEXT: vmovdqu8 {{.*#+}} ymm1 = [0,0,1,23,2,22,3,21,4,22,5,21,6,20,7,19,0,0,1,23,2,22,3,21,4,22,5,21,6,20,7,19]		; X64-NEXT: vmovdqu {{.*#+}} ymm1 = [0,0,1,23,2,22,3,21,4,22,5,21,6,20,7,19,0,0,1,23,2,22,3,21,4,22,5,21,6,20,7,19]
; X64-NEXT: vpermb %ymm0, %ymm1, %ymm0		; X64-NEXT: vpermb %ymm0, %ymm1, %ymm0
; X64-NEXT: retq		; X64-NEXT: retq
%res0 = shufflevector <32 x i8> %x0, <32 x i8> %x1, <32 x i32> <i32 0, i32 32, i32 1, i32 33, i32 2, i32 34, i32 3, i32 35, i32 4, i32 36, i32 5, i32 37, i32 6, i32 38, i32 7, i32 39, i32 16, i32 48, i32 17, i32 49, i32 18, i32 50, i32 19, i32 51, i32 20, i32 52, i32 21, i32 53, i32 22, i32 54, i32 23, i32 55>		%res0 = shufflevector <32 x i8> %x0, <32 x i8> %x1, <32 x i32> <i32 0, i32 32, i32 1, i32 33, i32 2, i32 34, i32 3, i32 35, i32 4, i32 36, i32 5, i32 37, i32 6, i32 38, i32 7, i32 39, i32 16, i32 48, i32 17, i32 49, i32 18, i32 50, i32 19, i32 51, i32 20, i32 52, i32 21, i32 53, i32 22, i32 54, i32 23, i32 55>
%res1 = call <32 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.256(<32 x i8> %res0, <32 x i8> <i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22, i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22>, <32 x i8> %res0, i32 -1)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.256(<32 x i8> %res0, <32 x i8> <i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22, i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22>, <32 x i8> %res0, i32 -1)
ret <32 x i8> %res1		ret <32 x i8> %res1
}		}
define <64 x i8> @combine_vpermi2var_64i8_as_vpermb(<64 x i8> %x0, <64 x i8> %x1) {		define <64 x i8> @combine_vpermi2var_64i8_as_vpermb(<64 x i8> %x0, <64 x i8> %x1) {
; X32-LABEL: combine_vpermi2var_64i8_as_vpermb:		; X32-LABEL: combine_vpermi2var_64i8_as_vpermb:
[10 unchanged lines not shown]
%res0 = shufflevector <64 x i8> %x0, <64 x i8> %x1, <64 x i32> <i32 0, i32 64, i32 1, i32 65, i32 2, i32 66, i32 3, i32 67, i32 4, i32 68, i32 5, i32 69, i32 6, i32 70, i32 7, i32 71, i32 16, i32 80, i32 17, i32 81, i32 18, i32 82, i32 19, i32 83, i32 20, i32 84, i32 21, i32 85, i32 22, i32 86, i32 23, i32 87, i32 32, i32 96, i32 33, i32 97, i32 34, i32 98, i32 35, i32 99, i32 36, i32 100, i32 37, i32 101, i32 38, i32 102, i32 39, i32 103, i32 48, i32 112, i32 49, i32 113, i32 50, i32 114, i32 51, i32 115, i32 52, i32 116, i32 53, i32 117, i32 54, i32 118, i32 55, i32 119>		%res0 = shufflevector <64 x i8> %x0, <64 x i8> %x1, <64 x i32> <i32 0, i32 64, i32 1, i32 65, i32 2, i32 66, i32 3, i32 67, i32 4, i32 68, i32 5, i32 69, i32 6, i32 70, i32 7, i32 71, i32 16, i32 80, i32 17, i32 81, i32 18, i32 82, i32 19, i32 83, i32 20, i32 84, i32 21, i32 85, i32 22, i32 86, i32 23, i32 87, i32 32, i32 96, i32 33, i32 97, i32 34, i32 98, i32 35, i32 99, i32 36, i32 100, i32 37, i32 101, i32 38, i32 102, i32 39, i32 103, i32 48, i32 112, i32 49, i32 113, i32 50, i32 114, i32 51, i32 115, i32 52, i32 116, i32 53, i32 117, i32 54, i32 118, i32 55, i32 119>
%res1 = call <64 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.512(<64 x i8> %res0, <64 x i8> <i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22, i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22, i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22, i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22>, <64 x i8> %res0, i64 -1)		%res1 = call <64 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.512(<64 x i8> %res0, <64 x i8> <i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22, i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22, i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22, i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22>, <64 x i8> %res0, i64 -1)
ret <64 x i8> %res1		ret <64 x i8> %res1
}		}

define <16 x i8> @combine_vpermt2var_vpermi2var_16i8_as_vperm2(<16 x i8> %x0, <16 x i8> %x1) {		define <16 x i8> @combine_vpermt2var_vpermi2var_16i8_as_vperm2(<16 x i8> %x0, <16 x i8> %x1) {
; X32-LABEL: combine_vpermt2var_vpermi2var_16i8_as_vperm2:		; X32-LABEL: combine_vpermt2var_vpermi2var_16i8_as_vperm2:
; X32: # BB#0:		; X32: # BB#0:
; X32-NEXT: vmovdqu8 {{.*#+}} xmm2 = [0,31,2,29,4,27,6,25,8,23,10,21,12,19,14,17]		; X32-NEXT: vmovdqu {{.*#+}} xmm2 = [0,31,2,29,4,27,6,25,8,23,10,21,12,19,14,17]
; X32-NEXT: vpermi2b %xmm1, %xmm0, %xmm2		; X32-NEXT: vpermi2b %xmm1, %xmm0, %xmm2
; X32-NEXT: vmovdqu8 {{.*#+}} xmm0 = [0,17,2,18,4,19,6,21,8,23,10,25,12,27,14,29]		; X32-NEXT: vmovdqu {{.*#+}} xmm0 = [0,17,2,18,4,19,6,21,8,23,10,25,12,27,14,29]
; X32-NEXT: vpermi2b %xmm2, %xmm2, %xmm0		; X32-NEXT: vpermi2b %xmm2, %xmm2, %xmm0
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: combine_vpermt2var_vpermi2var_16i8_as_vperm2:		; X64-LABEL: combine_vpermt2var_vpermi2var_16i8_as_vperm2:
; X64: # BB#0:		; X64: # BB#0:
; X64-NEXT: vmovdqu8 {{.*#+}} xmm2 = [0,31,2,29,4,27,6,25,8,23,10,21,12,19,14,17]		; X64-NEXT: vmovdqu {{.*#+}} xmm2 = [0,31,2,29,4,27,6,25,8,23,10,21,12,19,14,17]
; X64-NEXT: vpermi2b %xmm1, %xmm0, %xmm2		; X64-NEXT: vpermi2b %xmm1, %xmm0, %xmm2
; X64-NEXT: vmovdqu8 {{.*#+}} xmm0 = [0,17,2,18,4,19,6,21,8,23,10,25,12,27,14,29]		; X64-NEXT: vmovdqu {{.*#+}} xmm0 = [0,17,2,18,4,19,6,21,8,23,10,25,12,27,14,29]
; X64-NEXT: vpermi2b %xmm2, %xmm2, %xmm0		; X64-NEXT: vpermi2b %xmm2, %xmm2, %xmm0
; X64-NEXT: retq		; X64-NEXT: retq
%res0 = call <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8> %x0, <16 x i8> <i8 0, i8 31, i8 2, i8 29, i8 4, i8 27, i8 6, i8 25, i8 8, i8 23, i8 10, i8 21, i8 12, i8 19, i8 14, i8 17>, <16 x i8> %x1, i16 -1)		%res0 = call <16 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.128(<16 x i8> %x0, <16 x i8> <i8 0, i8 31, i8 2, i8 29, i8 4, i8 27, i8 6, i8 25, i8 8, i8 23, i8 10, i8 21, i8 12, i8 19, i8 14, i8 17>, <16 x i8> %x1, i16 -1)
%res1 = call <16 x i8> @llvm.x86.avx512.maskz.vpermt2var.qi.128(<16 x i8> <i8 0, i8 17, i8 2, i8 18, i8 4, i8 19, i8 6, i8 21, i8 8, i8 23, i8 10, i8 25, i8 12, i8 27, i8 14, i8 29>, <16 x i8> %res0, <16 x i8> %res0, i16 -1)		%res1 = call <16 x i8> @llvm.x86.avx512.maskz.vpermt2var.qi.128(<16 x i8> <i8 0, i8 17, i8 2, i8 18, i8 4, i8 19, i8 6, i8 21, i8 8, i8 23, i8 10, i8 25, i8 12, i8 27, i8 14, i8 29>, <16 x i8> %res0, <16 x i8> %res0, i16 -1)
ret <16 x i8> %res1		ret <16 x i8> %res1
}		}
define <32 x i8> @combine_vpermi2var_32i8_as_vperm2(<32 x i8> %x0, <32 x i8> %x1) {		define <32 x i8> @combine_vpermi2var_32i8_as_vperm2(<32 x i8> %x0, <32 x i8> %x1) {
; X32-LABEL: combine_vpermi2var_32i8_as_vperm2:		; X32-LABEL: combine_vpermi2var_32i8_as_vperm2:
; X32: # BB#0:		; X32: # BB#0:
; X32-NEXT: vmovdqu8 {{.*#+}} ymm2 = [0,32,1,23,2,22,3,21,4,22,5,21,6,20,7,19,0,32,1,23,2,22,3,21,4,22,5,21,6,20,7,19]		; X32-NEXT: vmovdqu {{.*#+}} ymm2 = [0,32,1,23,2,22,3,21,4,22,5,21,6,20,7,19,0,32,1,23,2,22,3,21,4,22,5,21,6,20,7,19]
; X32-NEXT: vpermt2b %ymm1, %ymm2, %ymm0		; X32-NEXT: vpermt2b %ymm1, %ymm2, %ymm0
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: combine_vpermi2var_32i8_as_vperm2:		; X64-LABEL: combine_vpermi2var_32i8_as_vperm2:
; X64: # BB#0:		; X64: # BB#0:
; X64-NEXT: vmovdqu8 {{.*#+}} ymm2 = [0,32,1,23,2,22,3,21,4,22,5,21,6,20,7,19,0,32,1,23,2,22,3,21,4,22,5,21,6,20,7,19]		; X64-NEXT: vmovdqu {{.*#+}} ymm2 = [0,32,1,23,2,22,3,21,4,22,5,21,6,20,7,19,0,32,1,23,2,22,3,21,4,22,5,21,6,20,7,19]
; X64-NEXT: vpermt2b %ymm1, %ymm2, %ymm0		; X64-NEXT: vpermt2b %ymm1, %ymm2, %ymm0
; X64-NEXT: retq		; X64-NEXT: retq
%res0 = shufflevector <32 x i8> %x0, <32 x i8> %x1, <32 x i32> <i32 0, i32 32, i32 1, i32 33, i32 2, i32 34, i32 3, i32 35, i32 4, i32 36, i32 5, i32 37, i32 6, i32 38, i32 7, i32 39, i32 16, i32 48, i32 17, i32 49, i32 18, i32 50, i32 19, i32 51, i32 20, i32 52, i32 21, i32 53, i32 22, i32 54, i32 23, i32 55>		%res0 = shufflevector <32 x i8> %x0, <32 x i8> %x1, <32 x i32> <i32 0, i32 32, i32 1, i32 33, i32 2, i32 34, i32 3, i32 35, i32 4, i32 36, i32 5, i32 37, i32 6, i32 38, i32 7, i32 39, i32 16, i32 48, i32 17, i32 49, i32 18, i32 50, i32 19, i32 51, i32 20, i32 52, i32 21, i32 53, i32 22, i32 54, i32 23, i32 55>
%res1 = call <32 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.256(<32 x i8> %res0, <32 x i8> <i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22, i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22>, <32 x i8> %x1, i32 -1)		%res1 = call <32 x i8> @llvm.x86.avx512.mask.vpermi2var.qi.256(<32 x i8> %res0, <32 x i8> <i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22, i8 0, i8 32, i8 2, i8 30, i8 4, i8 28, i8 6, i8 26, i8 8, i8 28, i8 10, i8 26, i8 12, i8 24, i8 14, i8 22>, <32 x i8> %x1, i32 -1)
ret <32 x i8> %res1		ret <32 x i8> %res1
}		}
define <64 x i8> @combine_vpermi2var_64i8_as_vperm2(<64 x i8> %x0, <64 x i8> %x1) {		define <64 x i8> @combine_vpermi2var_64i8_as_vperm2(<64 x i8> %x0, <64 x i8> %x1) {
; X32-LABEL: combine_vpermi2var_64i8_as_vperm2:		; X32-LABEL: combine_vpermi2var_64i8_as_vperm2:
[14 unchanged lines not shown]

llvm/trunk/test/CodeGen/X86/vector-shuffle-masked.ll

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+avx512vl,+avx512dq | FileCheck %s --check-prefix=CHECK		; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+avx512vl,+avx512dq | FileCheck %s --check-prefix=CHECK

define <4 x i32> @mask_shuffle_v4i32_1234(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passthru, i8 %mask) {		define <4 x i32> @mask_shuffle_v4i32_1234(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passthru, i8 %mask) {
; CHECK-LABEL: mask_shuffle_v4i32_1234:		; CHECK-LABEL: mask_shuffle_v4i32_1234:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: kmovb %edi, %k1		; CHECK-NEXT: kmovb %edi, %k1
; CHECK-NEXT: valignd {{.*#+}} xmm2 {%k1} = xmm0[1,2,3],xmm1[0]		; CHECK-NEXT: valignd {{.*#+}} xmm2 {%k1} = xmm0[1,2,3],xmm1[0]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0		; CHECK-NEXT: vmovdqa %xmm2, %xmm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 1, i32 2, i32 3, i32 4>		%shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 1, i32 2, i32 3, i32 4>
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>		%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%res = select <4 x i1> %mask.extract, <4 x i32> %shuffle, <4 x i32> %passthru		%res = select <4 x i1> %mask.extract, <4 x i32> %shuffle, <4 x i32> %passthru
ret <4 x i32> %res		ret <4 x i32> %res
}		}

[10 unchanged lines not shown]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @mask_shuffle_v4i32_2345(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passthru, i8 %mask) {		define <4 x i32> @mask_shuffle_v4i32_2345(<4 x i32> %a, <4 x i32> %b, <4 x i32> %passthru, i8 %mask) {
; CHECK-LABEL: mask_shuffle_v4i32_2345:		; CHECK-LABEL: mask_shuffle_v4i32_2345:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: kmovb %edi, %k1		; CHECK-NEXT: kmovb %edi, %k1
; CHECK-NEXT: valignd {{.*#+}} xmm2 {%k1} = xmm0[2,3],xmm1[0,1]		; CHECK-NEXT: valignd {{.*#+}} xmm2 {%k1} = xmm0[2,3],xmm1[0,1]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0		; CHECK-NEXT: vmovdqa %xmm2, %xmm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 2, i32 3, i32 4, i32 5>		%shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 2, i32 3, i32 4, i32 5>
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>		%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%res = select <4 x i1> %mask.extract, <4 x i32> %shuffle, <4 x i32> %passthru		%res = select <4 x i1> %mask.extract, <4 x i32> %shuffle, <4 x i32> %passthru
ret <4 x i32> %res		ret <4 x i32> %res
}		}

[10 unchanged lines not shown]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <2 x i64> @mask_shuffle_v2i64_12(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passthru, i8 %mask) {		define <2 x i64> @mask_shuffle_v2i64_12(<2 x i64> %a, <2 x i64> %b, <2 x i64> %passthru, i8 %mask) {
; CHECK-LABEL: mask_shuffle_v2i64_12:		; CHECK-LABEL: mask_shuffle_v2i64_12:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: kmovb %edi, %k1		; CHECK-NEXT: kmovb %edi, %k1
; CHECK-NEXT: valignq {{.*#+}} xmm2 {%k1} = xmm0[1],xmm1[0]		; CHECK-NEXT: valignq {{.*#+}} xmm2 {%k1} = xmm0[1],xmm1[0]
; CHECK-NEXT: vmovdqa64 %xmm2, %xmm0		; CHECK-NEXT: vmovdqa %xmm2, %xmm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%shuffle = shufflevector <2 x i64> %a, <2 x i64> %b, <2 x i32> <i32 1, i32 2>		%shuffle = shufflevector <2 x i64> %a, <2 x i64> %b, <2 x i32> <i32 1, i32 2>
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <2 x i32> <i32 0, i32 1>		%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <2 x i32> <i32 0, i32 1>
%res = select <2 x i1> %mask.extract, <2 x i64> %shuffle, <2 x i64> %passthru		%res = select <2 x i1> %mask.extract, <2 x i64> %shuffle, <2 x i64> %passthru
ret <2 x i64> %res		ret <2 x i64> %res
}		}

[10 unchanged lines not shown]
ret <2 x i64> %res		ret <2 x i64> %res
}		}

define <4 x i64> @mask_shuffle_v4i64_1234(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passthru, i8 %mask) {		define <4 x i64> @mask_shuffle_v4i64_1234(<4 x i64> %a, <4 x i64> %b, <4 x i64> %passthru, i8 %mask) {
; CHECK-LABEL: mask_shuffle_v4i64_1234:		; CHECK-LABEL: mask_shuffle_v4i64_1234:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: kmovb %edi, %k1		; CHECK-NEXT: kmovb %edi, %k1
; CHECK-NEXT: valignq {{.*#+}} ymm2 {%k1} = ymm0[1,2,3],ymm1[0]		; CHECK-NEXT: valignq {{.*#+}} ymm2 {%k1} = ymm0[1,2,3],ymm1[0]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0		; CHECK-NEXT: vmovdqa %ymm2, %ymm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%shuffle = shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 1, i32 2, i32 3, i32 4>		%shuffle = shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 1, i32 2, i32 3, i32 4>
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>		%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%res = select <4 x i1> %mask.extract, <4 x i64> %shuffle, <4 x i64> %passthru		%res = select <4 x i1> %mask.extract, <4 x i64> %shuffle, <4 x i64> %passthru
ret <4 x i64> %res		ret <4 x i64> %res
}		}

[10 unchanged lines not shown]
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <4 x i64> @mask_shuffle_v4i64_1230(<4 x i64> %a, <4 x i64> %passthru, i8 %mask) {		define <4 x i64> @mask_shuffle_v4i64_1230(<4 x i64> %a, <4 x i64> %passthru, i8 %mask) {
; CHECK-LABEL: mask_shuffle_v4i64_1230:		; CHECK-LABEL: mask_shuffle_v4i64_1230:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: kmovb %edi, %k1		; CHECK-NEXT: kmovb %edi, %k1
; CHECK-NEXT: vpermq {{.*#+}} ymm1 {%k1} = ymm0[1,2,3,0]		; CHECK-NEXT: vpermq {{.*#+}} ymm1 {%k1} = ymm0[1,2,3,0]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0		; CHECK-NEXT: vmovdqa %ymm1, %ymm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%shuffle = shufflevector <4 x i64> %a, <4 x i64> undef, <4 x i32> <i32 1, i32 2, i32 3, i32 0>		%shuffle = shufflevector <4 x i64> %a, <4 x i64> undef, <4 x i32> <i32 1, i32 2, i32 3, i32 0>
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>		%mask.extract = shufflevector <8 x i1> %mask.cast, <8 x i1> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%res = select <4 x i1> %mask.extract, <4 x i64> %shuffle, <4 x i64> %passthru		%res = select <4 x i1> %mask.extract, <4 x i64> %shuffle, <4 x i64> %passthru
ret <4 x i64> %res		ret <4 x i64> %res
}		}

[10 unchanged lines not shown]
ret <4 x i64> %res		ret <4 x i64> %res
}		}

define <8 x i32> @mask_shuffle_v8i32_12345678(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passthru, i8 %mask) {		define <8 x i32> @mask_shuffle_v8i32_12345678(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passthru, i8 %mask) {
; CHECK-LABEL: mask_shuffle_v8i32_12345678:		; CHECK-LABEL: mask_shuffle_v8i32_12345678:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: kmovb %edi, %k1		; CHECK-NEXT: kmovb %edi, %k1
; CHECK-NEXT: valignd {{.*#+}} ymm2 {%k1} = ymm0[1,2,3,4,5,6,7],ymm1[0]		; CHECK-NEXT: valignd {{.*#+}} ymm2 {%k1} = ymm0[1,2,3,4,5,6,7],ymm1[0]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0		; CHECK-NEXT: vmovdqa %ymm2, %ymm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8>		%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8>
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%res = select <8 x i1> %mask.cast, <8 x i32> %shuffle, <8 x i32> %passthru		%res = select <8 x i1> %mask.cast, <8 x i32> %shuffle, <8 x i32> %passthru
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @maskz_shuffle_v8i32_12345678(<8 x i32> %a, <8 x i32> %b, i8 %mask) {		define <8 x i32> @maskz_shuffle_v8i32_12345678(<8 x i32> %a, <8 x i32> %b, i8 %mask) {
; CHECK-LABEL: maskz_shuffle_v8i32_12345678:		; CHECK-LABEL: maskz_shuffle_v8i32_12345678:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: kmovb %edi, %k1		; CHECK-NEXT: kmovb %edi, %k1
; CHECK-NEXT: valignd {{.*#+}} ymm0 {%k1} {z} = ymm0[1,2,3,4,5,6,7],ymm1[0]		; CHECK-NEXT: valignd {{.*#+}} ymm0 {%k1} {z} = ymm0[1,2,3,4,5,6,7],ymm1[0]
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8>		%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8>
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%res = select <8 x i1> %mask.cast, <8 x i32> %shuffle, <8 x i32> zeroinitializer		%res = select <8 x i1> %mask.cast, <8 x i32> %shuffle, <8 x i32> zeroinitializer
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @mask_shuffle_v8i32_23456789(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passthru, i8 %mask) {		define <8 x i32> @mask_shuffle_v8i32_23456789(<8 x i32> %a, <8 x i32> %b, <8 x i32> %passthru, i8 %mask) {
; CHECK-LABEL: mask_shuffle_v8i32_23456789:		; CHECK-LABEL: mask_shuffle_v8i32_23456789:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: kmovb %edi, %k1		; CHECK-NEXT: kmovb %edi, %k1
; CHECK-NEXT: valignd {{.*#+}} ymm2 {%k1} = ymm0[2,3,4,5,6,7],ymm1[0,1]		; CHECK-NEXT: valignd {{.*#+}} ymm2 {%k1} = ymm0[2,3,4,5,6,7],ymm1[0,1]
; CHECK-NEXT: vmovdqa64 %ymm2, %ymm0		; CHECK-NEXT: vmovdqa %ymm2, %ymm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9>		%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9>
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%res = select <8 x i1> %mask.cast, <8 x i32> %shuffle, <8 x i32> %passthru		%res = select <8 x i1> %mask.cast, <8 x i32> %shuffle, <8 x i32> %passthru
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @maskz_shuffle_v8i32_23456789(<8 x i32> %a, <8 x i32> %b, i8 %mask) {		define <8 x i32> @maskz_shuffle_v8i32_23456789(<8 x i32> %a, <8 x i32> %b, i8 %mask) {
; CHECK-LABEL: maskz_shuffle_v8i32_23456789:		; CHECK-LABEL: maskz_shuffle_v8i32_23456789:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: kmovb %edi, %k1		; CHECK-NEXT: kmovb %edi, %k1
; CHECK-NEXT: valignd {{.*#+}} ymm0 {%k1} {z} = ymm0[2,3,4,5,6,7],ymm1[0,1]		; CHECK-NEXT: valignd {{.*#+}} ymm0 {%k1} {z} = ymm0[2,3,4,5,6,7],ymm1[0,1]
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9>		%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9>
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%res = select <8 x i1> %mask.cast, <8 x i32> %shuffle, <8 x i32> zeroinitializer		%res = select <8 x i1> %mask.cast, <8 x i32> %shuffle, <8 x i32> zeroinitializer
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @mask_shuffle_v8i32_12345670(<8 x i32> %a, <8 x i32> %passthru, i8 %mask) {		define <8 x i32> @mask_shuffle_v8i32_12345670(<8 x i32> %a, <8 x i32> %passthru, i8 %mask) {
; CHECK-LABEL: mask_shuffle_v8i32_12345670:		; CHECK-LABEL: mask_shuffle_v8i32_12345670:
; CHECK: # BB#0:		; CHECK: # BB#0:
; CHECK-NEXT: kmovb %edi, %k1		; CHECK-NEXT: kmovb %edi, %k1
; CHECK-NEXT: valignd {{.*#+}} ymm1 {%k1} = ymm0[1,2,3,4,5,6,7,0]		; CHECK-NEXT: valignd {{.*#+}} ymm1 {%k1} = ymm0[1,2,3,4,5,6,7,0]
; CHECK-NEXT: vmovdqa64 %ymm1, %ymm0		; CHECK-NEXT: vmovdqa %ymm1, %ymm0
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%shuffle = shufflevector <8 x i32> %a, <8 x i32> undef, <8 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0>		%shuffle = shufflevector <8 x i32> %a, <8 x i32> undef, <8 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 0>
%mask.cast = bitcast i8 %mask to <8 x i1>		%mask.cast = bitcast i8 %mask to <8 x i1>
%res = select <8 x i1> %mask.cast, <8 x i32> %shuffle, <8 x i32> %passthru		%res = select <8 x i1> %mask.cast, <8 x i32> %shuffle, <8 x i32> %passthru
ret <8 x i32> %res		ret <8 x i32> %res
}		}

define <8 x i32> @maskz_shuffle_v8i32_12345670(<8 x i32> %a, i8 %mask) {		define <8 x i32> @maskz_shuffle_v8i32_12345670(<8 x i32> %a, i8 %mask) {
[36 unchanged lines not shown]

llvm/trunk/test/CodeGen/X86/vector-trunc.ll

	[529 unchanged lines not shown]
	; AVX512F-NEXT: vpmovdb %zmm0, %xmm0			; AVX512F-NEXT: vpmovdb %zmm0, %xmm0
	; AVX512F-NEXT: vmovdqu %xmm0, (%rax)			; AVX512F-NEXT: vmovdqu %xmm0, (%rax)
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; AVX512VL-LABEL: trunc16i16_16i8:			; AVX512VL-LABEL: trunc16i16_16i8:
	; AVX512VL: # BB#0: # %entry			; AVX512VL: # BB#0: # %entry
	; AVX512VL-NEXT: vpmovsxwd %ymm0, %zmm0			; AVX512VL-NEXT: vpmovsxwd %ymm0, %zmm0
	; AVX512VL-NEXT: vpmovdb %zmm0, %xmm0			; AVX512VL-NEXT: vpmovdb %zmm0, %xmm0
	; AVX512VL-NEXT: vmovdqu32 %xmm0, (%rax)			; AVX512VL-NEXT: vmovdqu %xmm0, (%rax)
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512BW-LABEL: trunc16i16_16i8:			; AVX512BW-LABEL: trunc16i16_16i8:
	; AVX512BW: # BB#0: # %entry			; AVX512BW: # BB#0: # %entry
	; AVX512BW-NEXT: # kill: %YMM0<def> %YMM0<kill> %ZMM0<def>			; AVX512BW-NEXT: # kill: %YMM0<def> %YMM0<kill> %ZMM0<def>
	; AVX512BW-NEXT: vpmovwb %zmm0, %ymm0			; AVX512BW-NEXT: vpmovwb %zmm0, %ymm0
	; AVX512BW-NEXT: vmovdqu %xmm0, (%rax)			; AVX512BW-NEXT: vmovdqu %xmm0, (%rax)
	; AVX512BW-NEXT: retq			; AVX512BW-NEXT: retq
	[92 unchanged lines not shown]
	;			;
	; AVX512VL-LABEL: trunc32i16_32i8:			; AVX512VL-LABEL: trunc32i16_32i8:
	; AVX512VL: # BB#0: # %entry			; AVX512VL: # BB#0: # %entry
	; AVX512VL-NEXT: vpmovsxwd %ymm0, %zmm0			; AVX512VL-NEXT: vpmovsxwd %ymm0, %zmm0
	; AVX512VL-NEXT: vpmovdb %zmm0, %xmm0			; AVX512VL-NEXT: vpmovdb %zmm0, %xmm0
	; AVX512VL-NEXT: vpmovsxwd %ymm1, %zmm1			; AVX512VL-NEXT: vpmovsxwd %ymm1, %zmm1
	; AVX512VL-NEXT: vpmovdb %zmm1, %xmm1			; AVX512VL-NEXT: vpmovdb %zmm1, %xmm1
	; AVX512VL-NEXT: vinserti32x4 $1, %xmm1, %ymm0, %ymm0			; AVX512VL-NEXT: vinserti32x4 $1, %xmm1, %ymm0, %ymm0
	; AVX512VL-NEXT: vmovdqu32 %ymm0, (%rax)			; AVX512VL-NEXT: vmovdqu %ymm0, (%rax)
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512BW-LABEL: trunc32i16_32i8:			; AVX512BW-LABEL: trunc32i16_32i8:
	; AVX512BW: # BB#0: # %entry			; AVX512BW: # BB#0: # %entry
	; AVX512BW-NEXT: vpmovwb %zmm0, (%rax)			; AVX512BW-NEXT: vpmovwb %zmm0, (%rax)
	; AVX512BW-NEXT: retq			; AVX512BW-NEXT: retq
	;			;
	; AVX512BWVL-LABEL: trunc32i16_32i8:			; AVX512BWVL-LABEL: trunc32i16_32i8:
	[439 unchanged lines not shown]
	; AVX512F-NEXT: vmovdqa {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>			; AVX512F-NEXT: vmovdqa {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>
	; AVX512F-NEXT: vpshufb %xmm2, %xmm1, %xmm1			; AVX512F-NEXT: vpshufb %xmm2, %xmm1, %xmm1
	; AVX512F-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX512F-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX512F-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]			; AVX512F-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; AVX512VL-LABEL: trunc2x8i16_16i8:			; AVX512VL-LABEL: trunc2x8i16_16i8:
	; AVX512VL: # BB#0: # %entry			; AVX512VL: # BB#0: # %entry
	; AVX512VL-NEXT: vmovdqa64 {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>			; AVX512VL-NEXT: vmovdqa {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm1			; AVX512VL-NEXT: vpshufb %xmm2, %xmm1, %xmm1
	; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX512VL-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX512VL-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]			; AVX512VL-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512BW-LABEL: trunc2x8i16_16i8:			; AVX512BW-LABEL: trunc2x8i16_16i8:
	; AVX512BW: # BB#0: # %entry			; AVX512BW: # BB#0: # %entry
	; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>			; AVX512BW-NEXT: vmovdqa {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>
	; AVX512BW-NEXT: vpshufb %xmm2, %xmm1, %xmm1			; AVX512BW-NEXT: vpshufb %xmm2, %xmm1, %xmm1
	; AVX512BW-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX512BW-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX512BW-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]			; AVX512BW-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
	; AVX512BW-NEXT: retq			; AVX512BW-NEXT: retq
	;			;
	; AVX512BWVL-LABEL: trunc2x8i16_16i8:			; AVX512BWVL-LABEL: trunc2x8i16_16i8:
	; AVX512BWVL: # BB#0: # %entry			; AVX512BWVL: # BB#0: # %entry
	; AVX512BWVL-NEXT: vmovdqu8 {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>			; AVX512BWVL-NEXT: vmovdqu {{.*#+}} xmm2 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>
	; AVX512BWVL-NEXT: vpshufb %xmm2, %xmm1, %xmm1			; AVX512BWVL-NEXT: vpshufb %xmm2, %xmm1, %xmm1
	; AVX512BWVL-NEXT: vpshufb %xmm2, %xmm0, %xmm0			; AVX512BWVL-NEXT: vpshufb %xmm2, %xmm0, %xmm0
	; AVX512BWVL-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]			; AVX512BWVL-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
	; AVX512BWVL-NEXT: retq			; AVX512BWVL-NEXT: retq
	entry:			entry:
	%0 = trunc <8 x i16> %a to <8 x i8>			%0 = trunc <8 x i16> %a to <8 x i8>
	%1 = trunc <8 x i16> %b to <8 x i8>			%1 = trunc <8 x i16> %b to <8 x i8>
	%2 = shufflevector <8 x i8> %0, <8 x i8> %1, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>			%2 = shufflevector <8 x i8> %0, <8 x i8> %1, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
	[... 69 lines not shown ...]
	;			;
	; AVX512F-LABEL: trunc16i64_16i8_const:			; AVX512F-LABEL: trunc16i64_16i8_const:
	; AVX512F: # BB#0: # %entry			; AVX512F: # BB#0: # %entry
	; AVX512F-NEXT: vxorps %xmm0, %xmm0, %xmm0			; AVX512F-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; AVX512F-NEXT: retq			; AVX512F-NEXT: retq
	;			;
	; AVX512VL-LABEL: trunc16i64_16i8_const:			; AVX512VL-LABEL: trunc16i64_16i8_const:
	; AVX512VL: # BB#0: # %entry			; AVX512VL: # BB#0: # %entry
	; AVX512VL-NEXT: vpxord %xmm0, %xmm0, %xmm0			; AVX512VL-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX512VL-NEXT: retq			; AVX512VL-NEXT: retq
	;			;
	; AVX512BW-LABEL: trunc16i64_16i8_const:			; AVX512BW-LABEL: trunc16i64_16i8_const:
	; AVX512BW: # BB#0: # %entry			; AVX512BW: # BB#0: # %entry
	; AVX512BW-NEXT: vxorps %xmm0, %xmm0, %xmm0			; AVX512BW-NEXT: vxorps %xmm0, %xmm0, %xmm0
	; AVX512BW-NEXT: retq			; AVX512BW-NEXT: retq
	;			;
	; AVX512BWVL-LABEL: trunc16i64_16i8_const:			; AVX512BWVL-LABEL: trunc16i64_16i8_const:
	; AVX512BWVL: # BB#0: # %entry			; AVX512BWVL: # BB#0: # %entry
	; AVX512BWVL-NEXT: vpxord %xmm0, %xmm0, %xmm0			; AVX512BWVL-NEXT: vpxor %xmm0, %xmm0, %xmm0
	; AVX512BWVL-NEXT: retq			; AVX512BWVL-NEXT: retq

	entry:			entry:
	%0 = trunc <16 x i64> zeroinitializer to <16 x i8>			%0 = trunc <16 x i64> zeroinitializer to <16 x i8>
	%1 = shufflevector <16 x i8> %0, <16 x i8> %0, <16 x i32> <i32 28, i32 30, i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 undef, i32 14, i32 16, i32 18, i32 20, i32 22, i32 24, i32 26>			%1 = shufflevector <16 x i8> %0, <16 x i8> %0, <16 x i32> <i32 28, i32 30, i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 undef, i32 14, i32 16, i32 18, i32 20, i32 22, i32 24, i32 26>
	ret <16 x i8> %1			ret <16 x i8> %1
	}			}

llvm/trunk/test/CodeGen/X86/vector-tzcnt-128.ll

	[... 130 lines not shown ...]
	; AVX2-NEXT: vpand %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpand %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: vpshufb %xmm0, %xmm4, %xmm0			; AVX2-NEXT: vpshufb %xmm0, %xmm4, %xmm0
	; AVX2-NEXT: vpaddb %xmm3, %xmm0, %xmm0			; AVX2-NEXT: vpaddb %xmm3, %xmm0, %xmm0
	; AVX2-NEXT: vpsadbw %xmm1, %xmm0, %xmm0			; AVX2-NEXT: vpsadbw %xmm1, %xmm0, %xmm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv2i64:			; AVX512CDVL-LABEL: testv2i64:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpsubq %xmm0, %xmm1, %xmm2			; AVX512CDVL-NEXT: vpsubq %xmm0, %xmm1, %xmm2
	; AVX512CDVL-NEXT: vpandq %xmm2, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm2, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpsubq {{.*}}(%rip), %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsubq {{.*}}(%rip), %xmm0, %xmm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm2 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm2 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]
	; AVX512CDVL-NEXT: vpandq %xmm2, %xmm0, %xmm3			; AVX512CDVL-NEXT: vpand %xmm2, %xmm0, %xmm3
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm4 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm4 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
	; AVX512CDVL-NEXT: vpshufb %xmm3, %xmm4, %xmm3			; AVX512CDVL-NEXT: vpshufb %xmm3, %xmm4, %xmm3
	; AVX512CDVL-NEXT: vpsrlw $4, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsrlw $4, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpandq %xmm2, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm2, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpshufb %xmm0, %xmm4, %xmm0			; AVX512CDVL-NEXT: vpshufb %xmm0, %xmm4, %xmm0
	; AVX512CDVL-NEXT: vpaddb %xmm3, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpaddb %xmm3, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpsadbw %xmm1, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsadbw %xmm1, %xmm0, %xmm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv2i64:			; AVX512CD-LABEL: testv2i64:
	; AVX512CD: # BB#0:			; AVX512CD: # BB#0:
	; AVX512CD-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX512CD-NEXT: vpxor %xmm1, %xmm1, %xmm1
	[... 154 lines not shown ...]
	; AVX2-NEXT: vpand %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpand %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: vpshufb %xmm0, %xmm4, %xmm0			; AVX2-NEXT: vpshufb %xmm0, %xmm4, %xmm0
	; AVX2-NEXT: vpaddb %xmm3, %xmm0, %xmm0			; AVX2-NEXT: vpaddb %xmm3, %xmm0, %xmm0
	; AVX2-NEXT: vpsadbw %xmm1, %xmm0, %xmm0			; AVX2-NEXT: vpsadbw %xmm1, %xmm0, %xmm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv2i64u:			; AVX512CDVL-LABEL: testv2i64u:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpsubq %xmm0, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpsubq %xmm0, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vplzcntq %xmm0, %xmm0			; AVX512CDVL-NEXT: vplzcntq %xmm0, %xmm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm1 = [63,63]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm1 = [63,63]
	; AVX512CDVL-NEXT: vpsubq %xmm0, %xmm1, %xmm0			; AVX512CDVL-NEXT: vpsubq %xmm0, %xmm1, %xmm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv2i64u:			; AVX512CD-LABEL: testv2i64u:
	; AVX512CD: # BB#0:			; AVX512CD: # BB#0:
	; AVX512CD-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX512CD-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512CD-NEXT: vpsubq %xmm0, %xmm1, %xmm1			; AVX512CD-NEXT: vpsubq %xmm0, %xmm1, %xmm1
	; AVX512CD-NEXT: vpand %xmm1, %xmm0, %xmm0			; AVX512CD-NEXT: vpand %xmm1, %xmm0, %xmm0
	[... 173 lines not shown ...]
	; AVX2-NEXT: vpsadbw %xmm1, %xmm2, %xmm2			; AVX2-NEXT: vpsadbw %xmm1, %xmm2, %xmm2
	; AVX2-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero			; AVX2-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero
	; AVX2-NEXT: vpsadbw %xmm1, %xmm0, %xmm0			; AVX2-NEXT: vpsadbw %xmm1, %xmm0, %xmm0
	; AVX2-NEXT: vpackuswb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpackuswb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv4i32:			; AVX512CDVL-LABEL: testv4i32:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpsubd %xmm0, %xmm1, %xmm2			; AVX512CDVL-NEXT: vpsubd %xmm0, %xmm1, %xmm2
	; AVX512CDVL-NEXT: vpandd %xmm2, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm2, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpsubd {{.*}}(%rip){1to4}, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsubd {{.*}}(%rip){1to4}, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm2 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm2 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]
	; AVX512CDVL-NEXT: vpandq %xmm2, %xmm0, %xmm3			; AVX512CDVL-NEXT: vpand %xmm2, %xmm0, %xmm3
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm4 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm4 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
	; AVX512CDVL-NEXT: vpshufb %xmm3, %xmm4, %xmm3			; AVX512CDVL-NEXT: vpshufb %xmm3, %xmm4, %xmm3
	; AVX512CDVL-NEXT: vpsrlw $4, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsrlw $4, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpandq %xmm2, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm2, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpshufb %xmm0, %xmm4, %xmm0			; AVX512CDVL-NEXT: vpshufb %xmm0, %xmm4, %xmm0
	; AVX512CDVL-NEXT: vpaddb %xmm3, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpaddb %xmm3, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpunpckhdq {{.*#+}} xmm2 = xmm0[2],xmm1[2],xmm0[3],xmm1[3]			; AVX512CDVL-NEXT: vpunpckhdq {{.*#+}} xmm2 = xmm0[2],xmm1[2],xmm0[3],xmm1[3]
	; AVX512CDVL-NEXT: vpsadbw %xmm1, %xmm2, %xmm2			; AVX512CDVL-NEXT: vpsadbw %xmm1, %xmm2, %xmm2
	; AVX512CDVL-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]			; AVX512CDVL-NEXT: vpunpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	; AVX512CDVL-NEXT: vpsadbw %xmm1, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsadbw %xmm1, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpackuswb %xmm2, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpackuswb %xmm2, %xmm0, %xmm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	[... 195 lines not shown ...]
	; AVX2-NEXT: vpsadbw %xmm1, %xmm2, %xmm2			; AVX2-NEXT: vpsadbw %xmm1, %xmm2, %xmm2
	; AVX2-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero			; AVX2-NEXT: vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero
	; AVX2-NEXT: vpsadbw %xmm1, %xmm0, %xmm0			; AVX2-NEXT: vpsadbw %xmm1, %xmm0, %xmm0
	; AVX2-NEXT: vpackuswb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpackuswb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv4i32u:			; AVX512CDVL-LABEL: testv4i32u:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpsubd %xmm0, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpsubd %xmm0, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpandd %xmm1, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vplzcntd %xmm0, %xmm0			; AVX512CDVL-NEXT: vplzcntd %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpbroadcastd {{.*}}(%rip), %xmm1			; AVX512CDVL-NEXT: vpbroadcastd {{.*}}(%rip), %xmm1
	; AVX512CDVL-NEXT: vpsubd %xmm0, %xmm1, %xmm0			; AVX512CDVL-NEXT: vpsubd %xmm0, %xmm1, %xmm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv4i32u:			; AVX512CD-LABEL: testv4i32u:
	; AVX512CD: # BB#0:			; AVX512CD: # BB#0:
	; AVX512CD-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX512CD-NEXT: vpxor %xmm1, %xmm1, %xmm1
	[... 163 lines not shown ...]
	; AVX2-NEXT: vpaddb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpaddb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: vpsllw $8, %xmm0, %xmm1			; AVX2-NEXT: vpsllw $8, %xmm0, %xmm1
	; AVX2-NEXT: vpaddb %xmm0, %xmm1, %xmm0			; AVX2-NEXT: vpaddb %xmm0, %xmm1, %xmm0
	; AVX2-NEXT: vpsrlw $8, %xmm0, %xmm0			; AVX2-NEXT: vpsrlw $8, %xmm0, %xmm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv8i16:			; AVX512CDVL-LABEL: testv8i16:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpsubw %xmm0, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpsubw %xmm0, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpsubw {{.*}}(%rip), %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsubw {{.*}}(%rip), %xmm0, %xmm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm2			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm2
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
	; AVX512CDVL-NEXT: vpshufb %xmm2, %xmm3, %xmm2			; AVX512CDVL-NEXT: vpshufb %xmm2, %xmm3, %xmm2
	; AVX512CDVL-NEXT: vpsrlw $4, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsrlw $4, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpshufb %xmm0, %xmm3, %xmm0			; AVX512CDVL-NEXT: vpshufb %xmm0, %xmm3, %xmm0
	; AVX512CDVL-NEXT: vpaddb %xmm2, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpaddb %xmm2, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpsllw $8, %xmm0, %xmm1			; AVX512CDVL-NEXT: vpsllw $8, %xmm0, %xmm1
	; AVX512CDVL-NEXT: vpaddb %xmm0, %xmm1, %xmm0			; AVX512CDVL-NEXT: vpaddb %xmm0, %xmm1, %xmm0
	; AVX512CDVL-NEXT: vpsrlw $8, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsrlw $8, %xmm0, %xmm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv8i16:			; AVX512CD-LABEL: testv8i16:
	[... 172 lines not shown ...]
	; AVX2-NEXT: vpaddb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpaddb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: vpsllw $8, %xmm0, %xmm1			; AVX2-NEXT: vpsllw $8, %xmm0, %xmm1
	; AVX2-NEXT: vpaddb %xmm0, %xmm1, %xmm0			; AVX2-NEXT: vpaddb %xmm0, %xmm1, %xmm0
	; AVX2-NEXT: vpsrlw $8, %xmm0, %xmm0			; AVX2-NEXT: vpsrlw $8, %xmm0, %xmm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv8i16u:			; AVX512CDVL-LABEL: testv8i16u:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpsubw %xmm0, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpsubw %xmm0, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpsubw {{.*}}(%rip), %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsubw {{.*}}(%rip), %xmm0, %xmm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm2			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm2
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
	; AVX512CDVL-NEXT: vpshufb %xmm2, %xmm3, %xmm2			; AVX512CDVL-NEXT: vpshufb %xmm2, %xmm3, %xmm2
	; AVX512CDVL-NEXT: vpsrlw $4, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsrlw $4, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpshufb %xmm0, %xmm3, %xmm0			; AVX512CDVL-NEXT: vpshufb %xmm0, %xmm3, %xmm0
	; AVX512CDVL-NEXT: vpaddb %xmm2, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpaddb %xmm2, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpsllw $8, %xmm0, %xmm1			; AVX512CDVL-NEXT: vpsllw $8, %xmm0, %xmm1
	; AVX512CDVL-NEXT: vpaddb %xmm0, %xmm1, %xmm0			; AVX512CDVL-NEXT: vpaddb %xmm0, %xmm1, %xmm0
	; AVX512CDVL-NEXT: vpsrlw $8, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsrlw $8, %xmm0, %xmm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv8i16u:			; AVX512CD-LABEL: testv8i16u:
	[... 150 lines not shown ...]
	; AVX2-NEXT: vpsrlw $4, %xmm0, %xmm0			; AVX2-NEXT: vpsrlw $4, %xmm0, %xmm0
	; AVX2-NEXT: vpand %xmm1, %xmm0, %xmm0			; AVX2-NEXT: vpand %xmm1, %xmm0, %xmm0
	; AVX2-NEXT: vpshufb %xmm0, %xmm3, %xmm0			; AVX2-NEXT: vpshufb %xmm0, %xmm3, %xmm0
	; AVX2-NEXT: vpaddb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpaddb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv16i8:			; AVX512CDVL-LABEL: testv16i8:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpsubb %xmm0, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpsubb %xmm0, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpsubb {{.*}}(%rip), %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsubb {{.*}}(%rip), %xmm0, %xmm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm2			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm2
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
	; AVX512CDVL-NEXT: vpshufb %xmm2, %xmm3, %xmm2			; AVX512CDVL-NEXT: vpshufb %xmm2, %xmm3, %xmm2
	; AVX512CDVL-NEXT: vpsrlw $4, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsrlw $4, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpshufb %xmm0, %xmm3, %xmm0			; AVX512CDVL-NEXT: vpshufb %xmm0, %xmm3, %xmm0
	; AVX512CDVL-NEXT: vpaddb %xmm2, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpaddb %xmm2, %xmm0, %xmm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv16i8:			; AVX512CD-LABEL: testv16i8:
	; AVX512CD: # BB#0:			; AVX512CD: # BB#0:
	; AVX512CD-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX512CD-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512CD-NEXT: vpsubb %xmm0, %xmm1, %xmm1			; AVX512CD-NEXT: vpsubb %xmm0, %xmm1, %xmm1
	[... 140 lines not shown ...]
	; AVX2-NEXT: vpsrlw $4, %xmm0, %xmm0			; AVX2-NEXT: vpsrlw $4, %xmm0, %xmm0
	; AVX2-NEXT: vpand %xmm1, %xmm0, %xmm0			; AVX2-NEXT: vpand %xmm1, %xmm0, %xmm0
	; AVX2-NEXT: vpshufb %xmm0, %xmm3, %xmm0			; AVX2-NEXT: vpshufb %xmm0, %xmm3, %xmm0
	; AVX2-NEXT: vpaddb %xmm2, %xmm0, %xmm0			; AVX2-NEXT: vpaddb %xmm2, %xmm0, %xmm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv16i8u:			; AVX512CDVL-LABEL: testv16i8u:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %xmm1, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpsubb %xmm0, %xmm1, %xmm1			; AVX512CDVL-NEXT: vpsubb %xmm0, %xmm1, %xmm1
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpsubb {{.*}}(%rip), %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsubb {{.*}}(%rip), %xmm0, %xmm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm2			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm2
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} xmm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} xmm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
	; AVX512CDVL-NEXT: vpshufb %xmm2, %xmm3, %xmm2			; AVX512CDVL-NEXT: vpshufb %xmm2, %xmm3, %xmm2
	; AVX512CDVL-NEXT: vpsrlw $4, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpsrlw $4, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpandq %xmm1, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpand %xmm1, %xmm0, %xmm0
	; AVX512CDVL-NEXT: vpshufb %xmm0, %xmm3, %xmm0			; AVX512CDVL-NEXT: vpshufb %xmm0, %xmm3, %xmm0
	; AVX512CDVL-NEXT: vpaddb %xmm2, %xmm0, %xmm0			; AVX512CDVL-NEXT: vpaddb %xmm2, %xmm0, %xmm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv16i8u:			; AVX512CD-LABEL: testv16i8u:
	; AVX512CD: # BB#0:			; AVX512CD: # BB#0:
	; AVX512CD-NEXT: vpxor %xmm1, %xmm1, %xmm1			; AVX512CD-NEXT: vpxor %xmm1, %xmm1, %xmm1
	; AVX512CD-NEXT: vpsubb %xmm0, %xmm1, %xmm1			; AVX512CD-NEXT: vpsubb %xmm0, %xmm1, %xmm1
	[... 225 lines not shown ...]

llvm/trunk/test/CodeGen/X86/vector-tzcnt-256.ll

	[... 53 lines not shown ...]
	; AVX2-NEXT: vpand %ymm2, %ymm0, %ymm0			; AVX2-NEXT: vpand %ymm2, %ymm0, %ymm0
	; AVX2-NEXT: vpshufb %ymm0, %ymm4, %ymm0			; AVX2-NEXT: vpshufb %ymm0, %ymm4, %ymm0
	; AVX2-NEXT: vpaddb %ymm3, %ymm0, %ymm0			; AVX2-NEXT: vpaddb %ymm3, %ymm0, %ymm0
	; AVX2-NEXT: vpsadbw %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpsadbw %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv4i64:			; AVX512CDVL-LABEL: testv4i64:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %ymm1, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpsubq %ymm0, %ymm1, %ymm2			; AVX512CDVL-NEXT: vpsubq %ymm0, %ymm1, %ymm2
	; AVX512CDVL-NEXT: vpandq %ymm2, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm2, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpsubq {{.*}}(%rip){1to4}, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsubq {{.*}}(%rip){1to4}, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} ymm2 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} ymm2 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]
	; AVX512CDVL-NEXT: vpandq %ymm2, %ymm0, %ymm3			; AVX512CDVL-NEXT: vpand %ymm2, %ymm0, %ymm3
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} ymm4 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} ymm4 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
	; AVX512CDVL-NEXT: vpshufb %ymm3, %ymm4, %ymm3			; AVX512CDVL-NEXT: vpshufb %ymm3, %ymm4, %ymm3
	; AVX512CDVL-NEXT: vpsrlw $4, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsrlw $4, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpandq %ymm2, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm2, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpshufb %ymm0, %ymm4, %ymm0			; AVX512CDVL-NEXT: vpshufb %ymm0, %ymm4, %ymm0
	; AVX512CDVL-NEXT: vpaddb %ymm3, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpaddb %ymm3, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpsadbw %ymm1, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsadbw %ymm1, %ymm0, %ymm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv4i64:			; AVX512CD-LABEL: testv4i64:
	; AVX512CD: # BB#0:			; AVX512CD: # BB#0:
	; AVX512CD-NEXT: vpxor %ymm1, %ymm1, %ymm1			; AVX512CD-NEXT: vpxor %ymm1, %ymm1, %ymm1
	[... 79 lines not shown ...]
	; AVX2-NEXT: vpand %ymm2, %ymm0, %ymm0			; AVX2-NEXT: vpand %ymm2, %ymm0, %ymm0
	; AVX2-NEXT: vpshufb %ymm0, %ymm4, %ymm0			; AVX2-NEXT: vpshufb %ymm0, %ymm4, %ymm0
	; AVX2-NEXT: vpaddb %ymm3, %ymm0, %ymm0			; AVX2-NEXT: vpaddb %ymm3, %ymm0, %ymm0
	; AVX2-NEXT: vpsadbw %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpsadbw %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv4i64u:			; AVX512CDVL-LABEL: testv4i64u:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %ymm1, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpsubq %ymm0, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpsubq %ymm0, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vplzcntq %ymm0, %ymm0			; AVX512CDVL-NEXT: vplzcntq %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpbroadcastq {{.*}}(%rip), %ymm1			; AVX512CDVL-NEXT: vpbroadcastq {{.*}}(%rip), %ymm1
	; AVX512CDVL-NEXT: vpsubq %ymm0, %ymm1, %ymm0			; AVX512CDVL-NEXT: vpsubq %ymm0, %ymm1, %ymm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv4i64u:			; AVX512CD-LABEL: testv4i64u:
	; AVX512CD: # BB#0:			; AVX512CD: # BB#0:
	; AVX512CD-NEXT: vpxor %ymm1, %ymm1, %ymm1			; AVX512CD-NEXT: vpxor %ymm1, %ymm1, %ymm1
	[... 83 lines not shown ...]
	; AVX2-NEXT: vpsadbw %ymm1, %ymm2, %ymm2			; AVX2-NEXT: vpsadbw %ymm1, %ymm2, %ymm2
	; AVX2-NEXT: vpunpckldq {{.*#+}} ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]			; AVX2-NEXT: vpunpckldq {{.*#+}} ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
	; AVX2-NEXT: vpsadbw %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpsadbw %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: vpackuswb %ymm2, %ymm0, %ymm0			; AVX2-NEXT: vpackuswb %ymm2, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv8i32:			; AVX512CDVL-LABEL: testv8i32:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %ymm1, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpsubd %ymm0, %ymm1, %ymm2			; AVX512CDVL-NEXT: vpsubd %ymm0, %ymm1, %ymm2
	; AVX512CDVL-NEXT: vpandd %ymm2, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm2, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpsubd {{.*}}(%rip){1to8}, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsubd {{.*}}(%rip){1to8}, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} ymm2 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} ymm2 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]
	; AVX512CDVL-NEXT: vpandq %ymm2, %ymm0, %ymm3			; AVX512CDVL-NEXT: vpand %ymm2, %ymm0, %ymm3
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} ymm4 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} ymm4 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
	; AVX512CDVL-NEXT: vpshufb %ymm3, %ymm4, %ymm3			; AVX512CDVL-NEXT: vpshufb %ymm3, %ymm4, %ymm3
	; AVX512CDVL-NEXT: vpsrlw $4, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsrlw $4, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpandq %ymm2, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm2, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpshufb %ymm0, %ymm4, %ymm0			; AVX512CDVL-NEXT: vpshufb %ymm0, %ymm4, %ymm0
	; AVX512CDVL-NEXT: vpaddb %ymm3, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpaddb %ymm3, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpunpckhdq {{.*#+}} ymm2 = ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[6],ymm1[6],ymm0[7],ymm1[7]			; AVX512CDVL-NEXT: vpunpckhdq {{.*#+}} ymm2 = ymm0[2],ymm1[2],ymm0[3],ymm1[3],ymm0[6],ymm1[6],ymm0[7],ymm1[7]
	; AVX512CDVL-NEXT: vpsadbw %ymm1, %ymm2, %ymm2			; AVX512CDVL-NEXT: vpsadbw %ymm1, %ymm2, %ymm2
	; AVX512CDVL-NEXT: vpunpckldq {{.*#+}} ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]			; AVX512CDVL-NEXT: vpunpckldq {{.*#+}} ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
	; AVX512CDVL-NEXT: vpsadbw %ymm1, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsadbw %ymm1, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpackuswb %ymm2, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpackuswb %ymm2, %ymm0, %ymm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	[... 104 lines not shown ...]
	; AVX2-NEXT: vpsadbw %ymm1, %ymm2, %ymm2			; AVX2-NEXT: vpsadbw %ymm1, %ymm2, %ymm2
	; AVX2-NEXT: vpunpckldq {{.*#+}} ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]			; AVX2-NEXT: vpunpckldq {{.*#+}} ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
	; AVX2-NEXT: vpsadbw %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpsadbw %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: vpackuswb %ymm2, %ymm0, %ymm0			; AVX2-NEXT: vpackuswb %ymm2, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv8i32u:			; AVX512CDVL-LABEL: testv8i32u:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %ymm1, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpsubd %ymm0, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpsubd %ymm0, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpandd %ymm1, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vplzcntd %ymm0, %ymm0			; AVX512CDVL-NEXT: vplzcntd %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpbroadcastd {{.*}}(%rip), %ymm1			; AVX512CDVL-NEXT: vpbroadcastd {{.*}}(%rip), %ymm1
	; AVX512CDVL-NEXT: vpsubd %ymm0, %ymm1, %ymm0			; AVX512CDVL-NEXT: vpsubd %ymm0, %ymm1, %ymm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv8i32u:			; AVX512CD-LABEL: testv8i32u:
	; AVX512CD: # BB#0:			; AVX512CD: # BB#0:
	; AVX512CD-NEXT: vpxor %ymm1, %ymm1, %ymm1			; AVX512CD-NEXT: vpxor %ymm1, %ymm1, %ymm1
	[... 81 lines not shown ...]
	; AVX2-NEXT: vpaddb %ymm2, %ymm0, %ymm0			; AVX2-NEXT: vpaddb %ymm2, %ymm0, %ymm0
	; AVX2-NEXT: vpsllw $8, %ymm0, %ymm1			; AVX2-NEXT: vpsllw $8, %ymm0, %ymm1
	; AVX2-NEXT: vpaddb %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpaddb %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: vpsrlw $8, %ymm0, %ymm0			; AVX2-NEXT: vpsrlw $8, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv16i16:			; AVX512CDVL-LABEL: testv16i16:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %ymm1, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpsubw %ymm0, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpsubw %ymm0, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpsubw {{.*}}(%rip), %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsubw {{.*}}(%rip), %ymm0, %ymm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} ymm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} ymm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm2			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm2
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} ymm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} ymm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
	; AVX512CDVL-NEXT: vpshufb %ymm2, %ymm3, %ymm2			; AVX512CDVL-NEXT: vpshufb %ymm2, %ymm3, %ymm2
	; AVX512CDVL-NEXT: vpsrlw $4, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsrlw $4, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpshufb %ymm0, %ymm3, %ymm0			; AVX512CDVL-NEXT: vpshufb %ymm0, %ymm3, %ymm0
	; AVX512CDVL-NEXT: vpaddb %ymm2, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpaddb %ymm2, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpsllw $8, %ymm0, %ymm1			; AVX512CDVL-NEXT: vpsllw $8, %ymm0, %ymm1
	; AVX512CDVL-NEXT: vpaddb %ymm0, %ymm1, %ymm0			; AVX512CDVL-NEXT: vpaddb %ymm0, %ymm1, %ymm0
	; AVX512CDVL-NEXT: vpsrlw $8, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsrlw $8, %ymm0, %ymm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv16i16:			; AVX512CD-LABEL: testv16i16:
	[... 89 lines not shown ...]
	; AVX2-NEXT: vpaddb %ymm2, %ymm0, %ymm0			; AVX2-NEXT: vpaddb %ymm2, %ymm0, %ymm0
	; AVX2-NEXT: vpsllw $8, %ymm0, %ymm1			; AVX2-NEXT: vpsllw $8, %ymm0, %ymm1
	; AVX2-NEXT: vpaddb %ymm0, %ymm1, %ymm0			; AVX2-NEXT: vpaddb %ymm0, %ymm1, %ymm0
	; AVX2-NEXT: vpsrlw $8, %ymm0, %ymm0			; AVX2-NEXT: vpsrlw $8, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv16i16u:			; AVX512CDVL-LABEL: testv16i16u:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %ymm1, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpsubw %ymm0, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpsubw %ymm0, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpsubw {{.*}}(%rip), %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsubw {{.*}}(%rip), %ymm0, %ymm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} ymm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} ymm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm2			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm2
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} ymm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} ymm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
	; AVX512CDVL-NEXT: vpshufb %ymm2, %ymm3, %ymm2			; AVX512CDVL-NEXT: vpshufb %ymm2, %ymm3, %ymm2
	; AVX512CDVL-NEXT: vpsrlw $4, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsrlw $4, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpshufb %ymm0, %ymm3, %ymm0			; AVX512CDVL-NEXT: vpshufb %ymm0, %ymm3, %ymm0
	; AVX512CDVL-NEXT: vpaddb %ymm2, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpaddb %ymm2, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpsllw $8, %ymm0, %ymm1			; AVX512CDVL-NEXT: vpsllw $8, %ymm0, %ymm1
	; AVX512CDVL-NEXT: vpaddb %ymm0, %ymm1, %ymm0			; AVX512CDVL-NEXT: vpaddb %ymm0, %ymm1, %ymm0
	; AVX512CDVL-NEXT: vpsrlw $8, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsrlw $8, %ymm0, %ymm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv16i16u:			; AVX512CD-LABEL: testv16i16u:
	[... 80 lines not shown ...]
	; AVX2-NEXT: vpsrlw $4, %ymm0, %ymm0			; AVX2-NEXT: vpsrlw $4, %ymm0, %ymm0
	; AVX2-NEXT: vpand %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpand %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: vpshufb %ymm0, %ymm3, %ymm0			; AVX2-NEXT: vpshufb %ymm0, %ymm3, %ymm0
	; AVX2-NEXT: vpaddb %ymm2, %ymm0, %ymm0			; AVX2-NEXT: vpaddb %ymm2, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv32i8:			; AVX512CDVL-LABEL: testv32i8:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %ymm1, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpsubb %ymm0, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpsubb %ymm0, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpsubb {{.*}}(%rip), %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsubb {{.*}}(%rip), %ymm0, %ymm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} ymm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} ymm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm2			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm2
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} ymm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} ymm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
	; AVX512CDVL-NEXT: vpshufb %ymm2, %ymm3, %ymm2			; AVX512CDVL-NEXT: vpshufb %ymm2, %ymm3, %ymm2
	; AVX512CDVL-NEXT: vpsrlw $4, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsrlw $4, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpshufb %ymm0, %ymm3, %ymm0			; AVX512CDVL-NEXT: vpshufb %ymm0, %ymm3, %ymm0
	; AVX512CDVL-NEXT: vpaddb %ymm2, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpaddb %ymm2, %ymm0, %ymm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv32i8:			; AVX512CD-LABEL: testv32i8:
	; AVX512CD: # BB#0:			; AVX512CD: # BB#0:
	; AVX512CD-NEXT: vpxor %ymm1, %ymm1, %ymm1			; AVX512CD-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; AVX512CD-NEXT: vpsubb %ymm0, %ymm1, %ymm1			; AVX512CD-NEXT: vpsubb %ymm0, %ymm1, %ymm1
	; AVX2-NEXT: vpsrlw $4, %ymm0, %ymm0			; AVX2-NEXT: vpsrlw $4, %ymm0, %ymm0
	; AVX2-NEXT: vpand %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpand %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: vpshufb %ymm0, %ymm3, %ymm0			; AVX2-NEXT: vpshufb %ymm0, %ymm3, %ymm0
	; AVX2-NEXT: vpaddb %ymm2, %ymm0, %ymm0			; AVX2-NEXT: vpaddb %ymm2, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512CDVL-LABEL: testv32i8u:			; AVX512CDVL-LABEL: testv32i8u:
	; AVX512CDVL: # BB#0:			; AVX512CDVL: # BB#0:
	; AVX512CDVL-NEXT: vpxord %ymm1, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpsubb %ymm0, %ymm1, %ymm1			; AVX512CDVL-NEXT: vpsubb %ymm0, %ymm1, %ymm1
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpsubb {{.*}}(%rip), %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsubb {{.*}}(%rip), %ymm0, %ymm0
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} ymm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} ymm1 = [15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15]
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm2			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm2
	; AVX512CDVL-NEXT: vmovdqa64 {{.*#+}} ymm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]			; AVX512CDVL-NEXT: vmovdqa {{.*#+}} ymm3 = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
	; AVX512CDVL-NEXT: vpshufb %ymm2, %ymm3, %ymm2			; AVX512CDVL-NEXT: vpshufb %ymm2, %ymm3, %ymm2
	; AVX512CDVL-NEXT: vpsrlw $4, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpsrlw $4, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpandq %ymm1, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpand %ymm1, %ymm0, %ymm0
	; AVX512CDVL-NEXT: vpshufb %ymm0, %ymm3, %ymm0			; AVX512CDVL-NEXT: vpshufb %ymm0, %ymm3, %ymm0
	; AVX512CDVL-NEXT: vpaddb %ymm2, %ymm0, %ymm0			; AVX512CDVL-NEXT: vpaddb %ymm2, %ymm0, %ymm0
	; AVX512CDVL-NEXT: retq			; AVX512CDVL-NEXT: retq
	;			;
	; AVX512CD-LABEL: testv32i8u:			; AVX512CD-LABEL: testv32i8u:
	; AVX512CD: # BB#0:			; AVX512CD: # BB#0:
	; AVX512CD-NEXT: vpxor %ymm1, %ymm1, %ymm1			; AVX512CD-NEXT: vpxor %ymm1, %ymm1, %ymm1
	; AVX512CD-NEXT: vpsubb %ymm0, %ymm1, %ymm1			; AVX512CD-NEXT: vpsubb %ymm0, %ymm1, %ymm1

llvm/trunk/test/CodeGen/X86/viabs.ll

	; AVX2-NEXT: vpaddq %xmm1, %xmm0, %xmm0			; AVX2-NEXT: vpaddq %xmm1, %xmm0, %xmm0
	; AVX2-NEXT: vpxor %xmm1, %xmm0, %xmm0			; AVX2-NEXT: vpxor %xmm1, %xmm0, %xmm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512-LABEL: test_abs_ge_v2i64:			; AVX512-LABEL: test_abs_ge_v2i64:
	; AVX512: # BB#0:			; AVX512: # BB#0:
	; AVX512-NEXT: vpsraq $63, %xmm0, %xmm1			; AVX512-NEXT: vpsraq $63, %xmm0, %xmm1
	; AVX512-NEXT: vpaddq %xmm1, %xmm0, %xmm0			; AVX512-NEXT: vpaddq %xmm1, %xmm0, %xmm0
	; AVX512-NEXT: vpxorq %xmm1, %xmm0, %xmm0			; AVX512-NEXT: vpxor %xmm1, %xmm0, %xmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%tmp1neg = sub <2 x i64> zeroinitializer, %a			%tmp1neg = sub <2 x i64> zeroinitializer, %a
	%b = icmp sge <2 x i64> %a, zeroinitializer			%b = icmp sge <2 x i64> %a, zeroinitializer
	%abs = select <2 x i1> %b, <2 x i64> %a, <2 x i64> %tmp1neg			%abs = select <2 x i1> %b, <2 x i64> %a, <2 x i64> %tmp1neg
	ret <2 x i64> %abs			ret <2 x i64> %abs
	}			}

	define <4 x i64> @test_abs_gt_v4i64(<4 x i64> %a) nounwind {			define <4 x i64> @test_abs_gt_v4i64(<4 x i64> %a) nounwind {
	; AVX2-NEXT: vpaddq %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpaddq %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: vpxor %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpxor %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512-LABEL: test_abs_gt_v4i64:			; AVX512-LABEL: test_abs_gt_v4i64:
	; AVX512: # BB#0:			; AVX512: # BB#0:
	; AVX512-NEXT: vpsraq $63, %ymm0, %ymm1			; AVX512-NEXT: vpsraq $63, %ymm0, %ymm1
	; AVX512-NEXT: vpaddq %ymm1, %ymm0, %ymm0			; AVX512-NEXT: vpaddq %ymm1, %ymm0, %ymm0
	; AVX512-NEXT: vpxorq %ymm1, %ymm0, %ymm0			; AVX512-NEXT: vpxor %ymm1, %ymm0, %ymm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%tmp1neg = sub <4 x i64> zeroinitializer, %a			%tmp1neg = sub <4 x i64> zeroinitializer, %a
	%b = icmp sgt <4 x i64> %a, <i64 -1, i64 -1, i64 -1, i64 -1>			%b = icmp sgt <4 x i64> %a, <i64 -1, i64 -1, i64 -1, i64 -1>
	%abs = select <4 x i1> %b, <4 x i64> %a, <4 x i64> %tmp1neg			%abs = select <4 x i1> %b, <4 x i64> %a, <4 x i64> %tmp1neg
	ret <4 x i64> %abs			ret <4 x i64> %abs
	}			}

	define <8 x i64> @test_abs_le_v8i64(<8 x i64> %a) nounwind {			define <8 x i64> @test_abs_le_v8i64(<8 x i64> %a) nounwind {

Revision Contents

Diff 82586

llvm/trunk/include/llvm/CodeGen/MachineInstr.h

llvm/trunk/include/llvm/MC/MCStreamer.h

llvm/trunk/lib/MC/MCAsmStreamer.cpp

llvm/trunk/lib/Target/X86/CMakeLists.txt

llvm/trunk/lib/Target/X86/InstPrinter/X86InstComments.h

llvm/trunk/lib/Target/X86/X86.h

llvm/trunk/lib/Target/X86/X86EvexToVex.cpp

llvm/trunk/lib/Target/X86/X86InstrTablesInfo.h

llvm/trunk/lib/Target/X86/X86MCInstLower.cpp

llvm/trunk/lib/Target/X86/X86TargetMachine.cpp

llvm/trunk/test/CodeGen/X86/avx-intrinsics-x86.ll

llvm/trunk/test/CodeGen/X86/avx2-intrinsics-x86.ll

llvm/trunk/test/CodeGen/X86/avx2-vbroadcast.ll

llvm/trunk/test/CodeGen/X86/avx512-arith.ll

llvm/trunk/test/CodeGen/X86/avx512-cvt.ll

llvm/trunk/test/CodeGen/X86/avx512-ext.ll

llvm/trunk/test/CodeGen/X86/avx512-gather-scatter-intrin.ll

llvm/trunk/test/CodeGen/X86/avx512-mask-op.ll

llvm/trunk/test/CodeGen/X86/avx512-masked_memop-16-8.ll

llvm/trunk/test/CodeGen/X86/avx512-mov.ll

llvm/trunk/test/CodeGen/X86/avx512-scalar.ll

llvm/trunk/test/CodeGen/X86/avx512-vbroadcasti128.ll

llvm/trunk/test/CodeGen/X86/avx512-vbroadcasti256.ll

llvm/trunk/test/CodeGen/X86/avx512-vec-cmp.ll

llvm/trunk/test/CodeGen/X86/avx512bwvl-intrinsics-upgrade.ll

llvm/trunk/test/CodeGen/X86/avx512bwvl-intrinsics.ll

llvm/trunk/test/CodeGen/X86/avx512bwvl-mov.ll

llvm/trunk/test/CodeGen/X86/avx512dqvl-intrinsics-upgrade.ll

llvm/trunk/test/CodeGen/X86/avx512dqvl-intrinsics.ll

llvm/trunk/test/CodeGen/X86/avx512ifmavl-intrinsics.ll

llvm/trunk/test/CodeGen/X86/avx512vbmivl-intrinsics.ll

llvm/trunk/test/CodeGen/X86/avx512vl-intrinsics-upgrade.ll

llvm/trunk/test/CodeGen/X86/avx512vl-intrinsics.ll

llvm/trunk/test/CodeGen/X86/avx512vl-logic.ll

llvm/trunk/test/CodeGen/X86/avx512vl-mov.ll

llvm/trunk/test/CodeGen/X86/avx512vl-nontemporal.ll

llvm/trunk/test/CodeGen/X86/avx512vl-vbroadcast.ll

llvm/trunk/test/CodeGen/X86/compress_expand.ll

llvm/trunk/test/CodeGen/X86/evex-to-vex-compress.mir

llvm/trunk/test/CodeGen/X86/fast-isel-store.ll

llvm/trunk/test/CodeGen/X86/fp-logic-replace.ll

llvm/trunk/test/CodeGen/X86/masked_gather_scatter.ll

llvm/trunk/test/CodeGen/X86/masked_memop.ll

llvm/trunk/test/CodeGen/X86/nontemporal-2.ll

llvm/trunk/test/CodeGen/X86/sse-intrinsics-x86.ll

llvm/trunk/test/CodeGen/X86/sse2-intrinsics-x86.ll

llvm/trunk/test/CodeGen/X86/sse41-intrinsics-x86.ll

llvm/trunk/test/CodeGen/X86/sse42-intrinsics-x86.ll

llvm/trunk/test/CodeGen/X86/ssse3-intrinsics-x86.ll

llvm/trunk/test/CodeGen/X86/subvector-broadcast.ll

llvm/trunk/test/CodeGen/X86/vec-copysign-avx512.ll

llvm/trunk/test/CodeGen/X86/vec_fabs.ll

llvm/trunk/test/CodeGen/X86/vec_fp_to_int.ll

llvm/trunk/test/CodeGen/X86/vec_fpext.ll

llvm/trunk/test/CodeGen/X86/vec_int_to_fp.ll

llvm/trunk/test/CodeGen/X86/vector-half-conversions.ll

llvm/trunk/test/CodeGen/X86/vector-lzcnt-256.ll

llvm/trunk/test/CodeGen/X86/vector-shuffle-128-v16.ll

llvm/trunk/test/CodeGen/X86/vector-shuffle-128-v2.ll

llvm/trunk/test/CodeGen/X86/vector-shuffle-128-v4.ll

llvm/trunk/test/CodeGen/X86/vector-shuffle-128-v8.ll

llvm/trunk/test/CodeGen/X86/vector-shuffle-256-v16.ll

llvm/trunk/test/CodeGen/X86/vector-shuffle-256-v32.ll

llvm/trunk/test/CodeGen/X86/vector-shuffle-256-v8.ll

llvm/trunk/test/CodeGen/X86/vector-shuffle-combining-avx512bwvl.ll

llvm/trunk/test/CodeGen/X86/vector-shuffle-combining-avx512vbmi.ll

llvm/trunk/test/CodeGen/X86/vector-shuffle-masked.ll

llvm/trunk/test/CodeGen/X86/vector-trunc.ll

llvm/trunk/test/CodeGen/X86/vector-tzcnt-128.ll

llvm/trunk/test/CodeGen/X86/vector-tzcnt-256.ll

llvm/trunk/test/CodeGen/X86/viabs.ll
