This is an archive of the discontinued LLVM Phabricator instance.


Differential D135930

[X86] Add AVX-NE-CONVERT instructions.
ClosedPublic

Authored by FreddyYe on Oct 13 2022, 7:10 PM.

Details

Reviewers

pengfei
RKSimon
LuoYuanke
skan

Commits

rGaee2a35ac4ab: [X86] Add AVX-NE-CONVERT instructions.

Summary

For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
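
As orientation for the review below, here is a minimal scalar sketch in C of what the bf16-to-fp32 conversions in this extension compute. It assumes, from the intrinsic names `cvtneebf16` and `cvtneobf16` and from the ISE reference, that the "even" form widens elements 0, 2, 4, 6 of the source and the "odd" form widens elements 1, 3, 5, 7. All function names here are hypothetical and not part of the patch; real code would use the intrinsics from `<immintrin.h>`.

```c
#include <stdint.h>
#include <string.h>

/* Widen one bf16 value, given as its raw 16-bit pattern, to float.
 * bf16 is the upper half of an IEEE-754 binary32, so this is an exact
 * 16-bit left shift of the bit pattern. */
static float bf16_bits_to_float(uint16_t bits) {
  uint32_t wide = (uint32_t)bits << 16;
  float out;
  memcpy(&out, &wide, sizeof out);
  return out;
}

/* Hypothetical scalar model of the 128-bit "even" conversion:
 * reads 8 bf16 values and converts elements 0, 2, 4, 6. */
static void cvtnee_bf16_to_ps_model(const uint16_t src[8], float dst[4]) {
  for (int i = 0; i < 4; ++i)
    dst[i] = bf16_bits_to_float(src[2 * i]);
}

/* Hypothetical scalar model of the 128-bit "odd" conversion:
 * reads 8 bf16 values and converts elements 1, 3, 5, 7. */
static void cvtneo_bf16_to_ps_model(const uint16_t src[8], float dst[4]) {
  for (int i = 0; i < 4; ++i)
    dst[i] = bf16_bits_to_float(src[2 * i + 1]);
}
```

For example, the bit pattern `0x3F80` widens to `1.0f` and `0x4000` widens to `2.0f`, since each is just the top 16 bits of the corresponding binary32 value.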

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

FreddyYe created this revision. Oct 13 2022, 7:10 PM

Herald added a project: Restricted Project. Oct 13 2022, 7:10 PM

Herald added subscribers: pengfei, hiraditya.

FreddyYe requested review of this revision. Oct 13 2022, 7:10 PM

Herald added projects: Restricted Project, Restricted Project. Oct 13 2022, 7:10 PM

Herald added subscribers: llvm-commits, cfe-commits.

FreddyYe retitled this revision from "Add AVX-NE-CONVERT instructions." to "[X86] Add AVX-NE-CONVERT instructions.". Oct 13 2022, 8:17 PM

Harbormaster completed remote builds in B192102: Diff 467662. Oct 13 2022, 8:19 PM

FreddyYe added reviewers: pengfei, RKSimon, LuoYuanke, skan. Oct 13 2022, 8:29 PM

LuoYuanke added inline comments. Oct 13 2022, 8:41 PM

clang/lib/Basic/Targets/X86.cpp
795	Do we need it here?

craig.topper added a subscriber: craig.topper. Oct 13 2022, 9:37 PM

craig.topper added inline comments.

clang/lib/Headers/immintrin.h
262	Is this FIXME still relevant? Don't we support _Float16 with SSE2 now?
llvm/include/llvm/Support/X86TargetParser.def
206	Extra space before "avxneconvert"

pengfei added inline comments. Oct 13 2022, 10:18 PM

clang/lib/Basic/Targets/X86.cpp
795	We don't need it.
clang/lib/Headers/immintrin.h
262	_Float16 is supported with SSE2, but maybe we need to move `__m128h`, `__m256h` out of avx512fp16intrin.h

pengfei added inline comments. Oct 13 2022, 10:22 PM

clang/lib/Headers/avxneconvertintrin.h
48	I think the bf16 vector type may have the same problem as FP16. We need to move them out of avx512vlbf16intrin.h. Another issue is that we want to switch them to `__bf16` vectors. Hope D132329 can be landed first.

RKSimon added inline comments. Oct 14 2022, 1:28 AM

clang/test/CodeGen/X86/avxneconvert-builtins.c
3	32-bit test coverage?

Address part of comments.

THX for reviews!

clang/lib/Headers/immintrin.h
262	Yes. This is a redundant FIXME.

FreddyYe marked 2 inline comments as done. Oct 17 2022, 4:24 AM

Harbormaster completed remote builds in B192466: Diff 468158. Oct 17 2022, 5:18 AM

RKSimon added inline comments. Oct 17 2022, 6:20 AM

llvm/test/MC/X86/avx-ne-convert-att.s
1 ↗	(On Diff #468158)	merge the att + intel test files and use --check-prefixes to test both

merge att/intel test coverage files and rename the 32/64 bit files so that they are close together in the file lists

Matt added a subscriber: Matt. Oct 19 2022, 5:03 PM

pengfei added inline comments. Oct 19 2022, 11:19 PM

clang/lib/Headers/immintrin.h
262	I have moved the FP16/BF16 vector types out of the original header files (rGe0fb01e9). There should be no dependency on the FP16 and BF16 features now.

pengfei added inline comments. Oct 19 2022, 11:22 PM

clang/test/CodeGen/X86/avxneconvert-builtins.c
3	This should be removed now.
llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll
3	Do we have real dependency to FP16?
4	ditto.

pengfei added inline comments. Oct 25 2022, 7:58 PM

clang/include/clang/Basic/BuiltinsX86.def
2131–2132	These should be shared with AVX512-BF16.
clang/lib/Headers/avxneconvertintrin.h
87–95	Add unified intrinsics like AVXVNNI.

Possibly rename the x86-64-* test files to *-64 (and *-32 equivalent) so that the 32/64 bit files are closer together for tracking (and to help avoid bitrot).

clang/lib/Headers/immintrin.h
262	Update to this? `#if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) || (defined(__AVXNECONVERT__) && defined(__AVX512FP16__))`
llvm/test/MC/X86/x86-64-avx-ne-convert-att.s
1 ↗	(On Diff #468158)	x86-64-avx-ne-convert-intel.s ?

Rebase.

Harbormaster completed remote builds in B194573: Diff 471044. Oct 27 2022, 2:37 AM

Address comments. THX for review!

Rebase.

FreddyYe added inline comments. Oct 28 2022, 12:15 AM

llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll
5	Need to add `+avx512bf16,+avx512vl` tests for shared builtin intrinsic. I just found it crashed for lacking new patterns for avx512bf16. I'll update ASAP.

Harbormaster completed remote builds in B194841: Diff 471417. Oct 28 2022, 1:12 AM

RKSimon added inline comments. Oct 28 2022, 8:43 AM

clang/lib/Headers/avx512vlbf16intrin.h
164	Is there no way for attribute to allow different attribute permutations? Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming convention?

pengfei added inline comments. Oct 28 2022, 9:12 AM

clang/lib/Headers/avx512vlbf16intrin.h
164	> Is there no way for attribute to allow different attribute permutations?
We have discussed this problem with GCC folks. There are two problems here:
1. Unlike builtins, function attributes are more generic. Supporting permutations may introduce a lot of checks between callers and callees. I looked into limiting it to `__always_inline__` functions only. However, Clang handles inlining in the middle-end, so we don't have such information in the front-end. Besides, we don't know how to merge different permutations if they are inlined into the same function.
2. We don't know how to put the permutations into IR's function attributes. We need to preserve all permutations for inlining reference, but the backend needs a definite feature list rather than a selective one.
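
The constraint discussed above is that the shared builtin is legal under either `avx512bf16,avx512vl` or `avxneconvert`, while a header function carries one fixed target attribute. A macro has no attribute of its own, so the builtin is checked against the features of whichever function expands it. A rough sketch of that "either feature set" condition, written with predefined macros, follows. `__AVXNECONVERT__` is the macro this patch adds; the enum and function names are hypothetical and exist only for illustration.

```c
/* Hypothetical probe, not part of the patch: reports which feature set,
 * if any, would make the shared VCVTNEPS2BF16 builtin usable in the
 * current translation unit. */
enum cvtneps_pbh_source {
  CVT_UNAVAILABLE = 0,       /* neither feature set is enabled */
  CVT_FROM_AVX512BF16_VL = 1,/* -mavx512bf16 -mavx512vl */
  CVT_FROM_AVXNECONVERT = 2  /* -mavxneconvert */
};

static enum cvtneps_pbh_source cvtneps_pbh_feature_source(void) {
#if defined(__AVX512BF16__) && defined(__AVX512VL__)
  return CVT_FROM_AVX512BF16_VL;
#elif defined(__AVXNECONVERT__)
  return CVT_FROM_AVXNECONVERT;
#else
  return CVT_UNAVAILABLE;
#endif
}
```

A single `__attribute__((target(...)))` string on a function cannot express the "or" between the first two cases, which is why the patch turns `_mm_cvtneps_pbh` and `_mm256_cvtneps_pbh` into macros.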

Fix crash to compile avx512vlbf16 intrinsics.

Harbormaster completed remote builds in B195057: Diff 471710. Oct 28 2022, 10:06 PM

pengfei added inline comments. Oct 29 2022, 7:02 AM

clang/include/clang/Driver/Options.td
4599–4600	Need to move it before `mavxvnniint8` .
clang/lib/Basic/Targets/X86.cpp
1034	Move it ahead.
clang/lib/Headers/avx512vlbf16intrin.h
164	It's better to use `__builtin_ia32_cvtneps2bf16_128`.
clang/lib/Headers/avxneconvertintrin.h
107	VBCSTNESH2PS
140	VBCSTNESH2PS
208	16
274	16
340	16
406	16
clang/test/Preprocessor/x86_target_features.c
593–599	Should we check `__AVX2__` like we did for AVXVNNI?
llvm/lib/Support/Host.cpp
1819	Move it ahead and remove the blank line.
llvm/lib/Target/X86/X86ISelLowering.cpp
2181–2198	How about merge it here?
llvm/lib/Target/X86/X86InstrSSE.td
8260–8261	This can be f16 mem now.
8264–8265	f128mem, f256mem
8268–8269	ditto.
llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
129–140 ↗	(On Diff #471710)	You don't need to add them here; just another RUN line in the file below should be enough, e.g.,
; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown --show-mc-encoding -mattr=+avx512bf16,+avx512vl | FileCheck %s --check-prefix=AVX512BF16
; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown --show-mc-encoding -mattr=+avx512bf16,+avx512vl | FileCheck %s --check-prefix=AVX512BF16
llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll
3	--check-prefixes=CHECK,X64
4	--check-prefixes=CHECK,X86
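
The `__AVX2__` question raised above for x86_target_features.c amounts to checking that enabling the new feature also enables its prerequisite. A small sketch of that check as ordinary C, assuming (as the comment suggests by analogy with AVXVNNI) that `__AVXNECONVERT__` should never be defined without `__AVX2__`; the function name is hypothetical:

```c
/* Hypothetical consistency check, not part of the patch.
 * Returns 1 when the predefined macros are consistent with the assumed
 * implication "__AVXNECONVERT__ implies __AVX2__", and 0 otherwise. */
static int avxneconvert_macros_consistent(void) {
#if defined(__AVXNECONVERT__) && !defined(__AVX2__)
  return 0; /* feature macro present without its assumed prerequisite */
#else
  return 1; /* either the feature is off, or AVX2 is on as well */
#endif
}
```

In the test suite itself this would be expressed as FileCheck lines on the preprocessor output rather than as a runtime function.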

Address comments. THX for review.

FreddyYe marked an inline comment as done. Oct 31 2022, 1:29 AM

FreddyYe added inline comments.

clang/lib/Headers/avx512vlbf16intrin.h
164	I think __builtin_ia32_vcvtneps2bf16128 is also a "right" name. See builtin_ia32_vfmaddsubph256, builtin_ia32_minph256... And I admit naming conventions of clang builtins as well as LLVM IR builtins are confusing right now.

pengfei added inline comments. Oct 31 2022, 2:12 AM

clang/lib/Headers/avx512vlbf16intrin.h
164	The problem here is that `16128` is a bit confusing; a `_` breaks it into 2 numbers. But I don't insist on it :)

Harbormaster completed remote builds in B195207: Diff 471920. Oct 31 2022, 3:16 AM

FreddyYe marked an inline comment as done. Oct 31 2022, 6:56 AM

FreddyYe added inline comments.

clang/lib/Headers/avx512vlbf16intrin.h
164	I tried, but found that __builtin_ia32_cvtneps2bf16_256 already exists for avx512bf16, and it's currently used for mask intrinsic lowering. What about not changing it this time? We can do a refinement patch later for the avx512bf16 builtins, since they also have some redundant FE/codegen logic for the 256/512 mask intrinsics.

LGTM.

clang/lib/Headers/avx512vlbf16intrin.h
164	No problem.
llvm/test/CodeGen/X86/avxneconvert-intrinsics-shared.ll
3	Remove `-O0`
llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll
3	ditto.

pengfei accepted this revision. Oct 31 2022, 7:46 AM

This revision is now accepted and ready to land. Oct 31 2022, 7:46 AM

This revision was landed with ongoing or failed builds. Oct 31 2022, 8:43 AM

Closed by commit rGaee2a35ac4ab: [X86] Add AVX-NE-CONVERT instructions. (authored by FreddyYe).

This revision was automatically updated to reflect the committed changes.

FreddyYe marked an inline comment as done.

FreddyYe added a commit: rGaee2a35ac4ab: [X86] Add AVX-NE-CONVERT instructions..

Revision Contents

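Path	Size
clang/docs/ReleaseNotes.rst	8 lines
clang/include/clang/Basic/BuiltinsX86.def	16 lines
clang/include/clang/Driver/Options.td	2 lines
clang/lib/Basic/Targets/X86.h	1 line
clang/lib/Basic/Targets/X86.cpp	7 lines
clang/lib/Headers/CMakeLists.txt	1 line
clang/lib/Headers/avx512vlbf16intrin.h	16 lines
clang/lib/Headers/avxneconvertintrin.h	484 lines
clang/lib/Headers/cpuid.h	1 line
clang/lib/Headers/immintrin.h	5 lines
clang/test/CodeGen/X86/avx512vlbf16-builtins.c	4 lines
clang/test/CodeGen/X86/avxneconvert-builtins.c	91 lines
clang/test/CodeGen/attr-target-x86.c	4 lines
clang/test/Driver/x86-target-features.c	5 lines
clang/test/Preprocessor/x86_target_features.c	14 lines
llvm/docs/ReleaseNotes.rst	3 lines
llvm/include/llvm/IR/IntrinsicsX86.td	28 lines
llvm/include/llvm/Support/X86TargetParser.def	1 line
llvm/lib/Support/Host.cpp	1 line
llvm/lib/Support/X86TargetParser.cpp	3 lines
llvm/lib/Target/X86/X86.td	3 lines
llvm/lib/Target/X86/X86ISelLowering.cpp	16 lines
llvm/lib/Target/X86/X86InstrAVX512.td	10 lines
llvm/lib/Target/X86/X86InstrInfo.td	1 line
llvm/lib/Target/X86/X86InstrSSE.td	58 lines
llvm/test/CodeGen/X86/avxneconvert-intrinsics-shared.ll	40 lines
llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll	219 lines
llvm/test/MC/Disassembler/X86/avx_ne_convert-32.txt	335 lines
llvm/test/MC/Disassembler/X86/avx_ne_convert-64.txt	335 lines
llvm/test/MC/X86/avx_ne_convert-32-att.s	334 lines
llvm/test/MC/X86/avx_ne_convert-32-intel.s	334 lines
llvm/test/MC/X86/avx_ne_convert-64-att.s	334 lines
llvm/test/MC/X86/avx_ne_convert-64-intel.s	334 lines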

Diff 472030

clang/docs/ReleaseNotes.rst

[...]	- Add support for ``RAO-INT`` instructions.
* Support intrinsic of ``_axor_i32/64``		* Support intrinsic of ``_axor_i32/64``
- Support ISA of ``AVX-IFMA``.		- Support ISA of ``AVX-IFMA``.
* Support intrinsic of ``_mm(256)_madd52hi_avx_epu64``.		* Support intrinsic of ``_mm(256)_madd52hi_avx_epu64``.
* Support intrinsic of ``_mm(256)_madd52lo_avx_epu64``.		* Support intrinsic of ``_mm(256)_madd52lo_avx_epu64``.
- Support ISA of ``AVX-VNNI-INT8``.		- Support ISA of ``AVX-VNNI-INT8``.
* Support intrinsic of ``_mm(256)_dpbssd(s)_epi32``.		* Support intrinsic of ``_mm(256)_dpbssd(s)_epi32``.
* Support intrinsic of ``_mm(256)_dpbsud(s)_epi32``.		* Support intrinsic of ``_mm(256)_dpbsud(s)_epi32``.
* Support intrinsic of ``_mm(256)_dpbuud(s)_epi32``.		* Support intrinsic of ``_mm(256)_dpbuud(s)_epi32``.
		- Support ISA of ``AVX-NE-CONVERT``.
		* Support intrinsic of ``_mm(256)_bcstnebf16_ps``.
		* Support intrinsic of ``_mm(256)_bcstnesh_ps``.
		* Support intrinsic of ``_mm(256)_cvtneebf16_ps``.
		* Support intrinsic of ``_mm(256)_cvtneeph_ps``.
		* Support intrinsic of ``_mm(256)_cvtneobf16_ps``.
		* Support intrinsic of ``_mm(256)_cvtneoph_ps``.
		* Support intrinsic of ``_mm(256)_cvtneps_avx_pbh``.

WebAssembly Support in Clang		WebAssembly Support in Clang
----------------------------		----------------------------

The -mcpu=generic configuration now enables sign-ext and mutable-globals. These		The -mcpu=generic configuration now enables sign-ext and mutable-globals. These
proposals are standardized and available in all major engines.		proposals are standardized and available in all major engines.

DWARF Support in Clang		DWARF Support in Clang
[...]

clang/include/clang/Basic/BuiltinsX86.def

	[...]
	TARGET_HEADER_BUILTIN(__readfsdword, "UNiUNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(__readfsdword, "UNiUNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
	TARGET_HEADER_BUILTIN(__readfsqword, "ULLiUNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(__readfsqword, "ULLiUNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")

	TARGET_HEADER_BUILTIN(__readgsbyte, "UcUNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(__readgsbyte, "UcUNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
	TARGET_HEADER_BUILTIN(__readgsword, "UsUNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(__readgsword, "UsUNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
	TARGET_HEADER_BUILTIN(__readgsdword, "UNiUNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(__readgsdword, "UNiUNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
	TARGET_HEADER_BUILTIN(__readgsqword, "ULLiUNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(__readgsqword, "ULLiUNi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")

				// AVX-NE-CONVERT
				TARGET_BUILTIN(__builtin_ia32_vbcstnebf162ps128, "V4fyC*", "nV:128:", "avxneconvert")
				TARGET_BUILTIN(__builtin_ia32_vbcstnebf162ps256, "V8fyC*", "nV:256:", "avxneconvert")
				TARGET_BUILTIN(__builtin_ia32_vbcstnesh2ps128, "V4fxC*", "nV:128:", "avxneconvert")
				TARGET_BUILTIN(__builtin_ia32_vbcstnesh2ps256, "V8fxC*", "nV:256:", "avxneconvert")
				TARGET_BUILTIN(__builtin_ia32_vcvtneebf162ps128, "V4fV8yC*", "nV:128:", "avxneconvert")
				TARGET_BUILTIN(__builtin_ia32_vcvtneebf162ps256, "V8fV16yC*", "nV:256:", "avxneconvert")
				TARGET_BUILTIN(__builtin_ia32_vcvtneeph2ps128, "V4fV8xC*", "nV:128:", "avxneconvert")
				TARGET_BUILTIN(__builtin_ia32_vcvtneeph2ps256, "V8fV16xC*", "nV:256:", "avxneconvert")
				TARGET_BUILTIN(__builtin_ia32_vcvtneobf162ps128, "V4fV8yC*", "nV:128:", "avxneconvert")
				TARGET_BUILTIN(__builtin_ia32_vcvtneobf162ps256, "V8fV16yC*", "nV:256:", "avxneconvert")
				TARGET_BUILTIN(__builtin_ia32_vcvtneoph2ps128, "V4fV8xC*", "nV:128:", "avxneconvert")
				TARGET_BUILTIN(__builtin_ia32_vcvtneoph2ps256, "V8fV16xC*", "nV:256:", "avxneconvert")
				TARGET_BUILTIN(__builtin_ia32_vcvtneps2bf16128, "V8yV4f", "nV:128:", "avx512bf16,avx512vl|avxneconvert")
				pengfei: These should be shared with AVX512-BF16.
				TARGET_BUILTIN(__builtin_ia32_vcvtneps2bf16256, "V8yV8f", "nV:256:", "avx512bf16,avx512vl|avxneconvert")

	TARGET_HEADER_BUILTIN(_InterlockedAnd64, "WiWiD*Wi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(_InterlockedAnd64, "WiWiD*Wi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
	TARGET_HEADER_BUILTIN(_InterlockedDecrement64, "WiWiD*", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(_InterlockedDecrement64, "WiWiD*", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
	TARGET_HEADER_BUILTIN(_InterlockedExchange64, "WiWiD*Wi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(_InterlockedExchange64, "WiWiD*Wi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
	TARGET_HEADER_BUILTIN(_InterlockedExchangeAdd64, "WiWiD*Wi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(_InterlockedExchangeAdd64, "WiWiD*Wi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
	TARGET_HEADER_BUILTIN(_InterlockedExchangeSub64, "WiWiD*Wi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(_InterlockedExchangeSub64, "WiWiD*Wi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
	TARGET_HEADER_BUILTIN(_InterlockedIncrement64, "WiWiD*", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(_InterlockedIncrement64, "WiWiD*", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
	TARGET_HEADER_BUILTIN(_InterlockedOr64, "WiWiD*Wi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(_InterlockedOr64, "WiWiD*Wi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")
	TARGET_HEADER_BUILTIN(_InterlockedXor64, "WiWiD*Wi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")			TARGET_HEADER_BUILTIN(_InterlockedXor64, "WiWiD*Wi", "nh", "intrin.h", ALL_MS_LANGUAGES, "")

	#undef BUILTIN			#undef BUILTIN
	#undef TARGET_BUILTIN			#undef TARGET_BUILTIN
	#undef TARGET_HEADER_BUILTIN			#undef TARGET_HEADER_BUILTIN

clang/include/clang/Driver/Options.td

	[...]
	def mavx512vnni : Flag<["-"], "mavx512vnni">, Group<m_x86_Features_Group>;			def mavx512vnni : Flag<["-"], "mavx512vnni">, Group<m_x86_Features_Group>;
	def mno_avx512vnni : Flag<["-"], "mno-avx512vnni">, Group<m_x86_Features_Group>;			def mno_avx512vnni : Flag<["-"], "mno-avx512vnni">, Group<m_x86_Features_Group>;
	def mavx512vpopcntdq : Flag<["-"], "mavx512vpopcntdq">, Group<m_x86_Features_Group>;			def mavx512vpopcntdq : Flag<["-"], "mavx512vpopcntdq">, Group<m_x86_Features_Group>;
	def mno_avx512vpopcntdq : Flag<["-"], "mno-avx512vpopcntdq">, Group<m_x86_Features_Group>;			def mno_avx512vpopcntdq : Flag<["-"], "mno-avx512vpopcntdq">, Group<m_x86_Features_Group>;
	def mavx512vp2intersect : Flag<["-"], "mavx512vp2intersect">, Group<m_x86_Features_Group>;			def mavx512vp2intersect : Flag<["-"], "mavx512vp2intersect">, Group<m_x86_Features_Group>;
	def mno_avx512vp2intersect : Flag<["-"], "mno-avx512vp2intersect">, Group<m_x86_Features_Group>;			def mno_avx512vp2intersect : Flag<["-"], "mno-avx512vp2intersect">, Group<m_x86_Features_Group>;
	def mavxifma : Flag<["-"], "mavxifma">, Group<m_x86_Features_Group>;			def mavxifma : Flag<["-"], "mavxifma">, Group<m_x86_Features_Group>;
	def mno_avxifma : Flag<["-"], "mno-avxifma">, Group<m_x86_Features_Group>;			def mno_avxifma : Flag<["-"], "mno-avxifma">, Group<m_x86_Features_Group>;
				def mavxneconvert : Flag<["-"], "mavxneconvert">, Group<m_x86_Features_Group>;
				def mno_avxneconvert : Flag<["-"], "mno-avxneconvert">, Group<m_x86_Features_Group>;
	def mavxvnniint8 : Flag<["-"], "mavxvnniint8">, Group<m_x86_Features_Group>;			def mavxvnniint8 : Flag<["-"], "mavxvnniint8">, Group<m_x86_Features_Group>;
	def mno_avxvnniint8 : Flag<["-"], "mno-avxvnniint8">, Group<m_x86_Features_Group>;			def mno_avxvnniint8 : Flag<["-"], "mno-avxvnniint8">, Group<m_x86_Features_Group>;
	def mavxvnni : Flag<["-"], "mavxvnni">, Group<m_x86_Features_Group>;			def mavxvnni : Flag<["-"], "mavxvnni">, Group<m_x86_Features_Group>;
	def mno_avxvnni : Flag<["-"], "mno-avxvnni">, Group<m_x86_Features_Group>;			def mno_avxvnni : Flag<["-"], "mno-avxvnni">, Group<m_x86_Features_Group>;
				pengfei: Need to move it before `mavxvnniint8`.
	def madx : Flag<["-"], "madx">, Group<m_x86_Features_Group>;			def madx : Flag<["-"], "madx">, Group<m_x86_Features_Group>;
	def mno_adx : Flag<["-"], "mno-adx">, Group<m_x86_Features_Group>;			def mno_adx : Flag<["-"], "mno-adx">, Group<m_x86_Features_Group>;
	def maes : Flag<["-"], "maes">, Group<m_x86_Features_Group>;			def maes : Flag<["-"], "maes">, Group<m_x86_Features_Group>;
	def mno_aes : Flag<["-"], "mno-aes">, Group<m_x86_Features_Group>;			def mno_aes : Flag<["-"], "mno-aes">, Group<m_x86_Features_Group>;
	def mbmi : Flag<["-"], "mbmi">, Group<m_x86_Features_Group>;			def mbmi : Flag<["-"], "mbmi">, Group<m_x86_Features_Group>;
	def mno_bmi : Flag<["-"], "mno-bmi">, Group<m_x86_Features_Group>;			def mno_bmi : Flag<["-"], "mno-bmi">, Group<m_x86_Features_Group>;
	def mbmi2 : Flag<["-"], "mbmi2">, Group<m_x86_Features_Group>;			def mbmi2 : Flag<["-"], "mbmi2">, Group<m_x86_Features_Group>;
	def mno_bmi2 : Flag<["-"], "mno-bmi2">, Group<m_x86_Features_Group>;			def mno_bmi2 : Flag<["-"], "mno-bmi2">, Group<m_x86_Features_Group>;
	[...]

clang/lib/Basic/Targets/X86.h

[...]	class LLVM_LIBRARY_VISIBILITY X86TargetInfo : public TargetInfo {
bool HasMOVDIR64B = false;		bool HasMOVDIR64B = false;
bool HasPTWRITE = false;		bool HasPTWRITE = false;
bool HasINVPCID = false;		bool HasINVPCID = false;
bool HasENQCMD = false;		bool HasENQCMD = false;
bool HasAMXFP16 = false;		bool HasAMXFP16 = false;
bool HasCMPCCXADD = false;		bool HasCMPCCXADD = false;
bool HasRAOINT = false;		bool HasRAOINT = false;
bool HasAVXVNNIINT8 = false;		bool HasAVXVNNIINT8 = false;
		bool HasAVXNECONVERT = false;
bool HasKL = false; // For key locker		bool HasKL = false; // For key locker
bool HasWIDEKL = false; // For wide key locker		bool HasWIDEKL = false; // For wide key locker
bool HasHRESET = false;		bool HasHRESET = false;
bool HasAVXVNNI = false;		bool HasAVXVNNI = false;
bool HasAMXTILE = false;		bool HasAMXTILE = false;
bool HasAMXINT8 = false;		bool HasAMXINT8 = false;
bool HasAMXBF16 = false;		bool HasAMXBF16 = false;
bool HasSERIALIZE = false;		bool HasSERIALIZE = false;
[...]

clang/lib/Basic/Targets/X86.cpp

[...]	for (const auto &Feature : Features) {
} else if (Feature == "+amx-tile") {		} else if (Feature == "+amx-tile") {
HasAMXTILE = true;		HasAMXTILE = true;
} else if (Feature == "+cmpccxadd") {		} else if (Feature == "+cmpccxadd") {
HasCMPCCXADD = true;		HasCMPCCXADD = true;
} else if (Feature == "+raoint") {		} else if (Feature == "+raoint") {
HasRAOINT = true;		HasRAOINT = true;
} else if (Feature == "+avxifma") {		} else if (Feature == "+avxifma") {
HasAVXIFMA = true;		HasAVXIFMA = true;
		} else if (Feature == "+avxneconvert") {
		HasAVXNECONVERT= true;
} else if (Feature == "+avxvnni") {		} else if (Feature == "+avxvnni") {
HasAVXVNNI = true;		HasAVXVNNI = true;
} else if (Feature == "+avxvnniint8") {		} else if (Feature == "+avxvnniint8") {
HasAVXVNNIINT8 = true;		HasAVXVNNIINT8 = true;
} else if (Feature == "+serialize") {		} else if (Feature == "+serialize") {
HasSERIALIZE = true;		HasSERIALIZE = true;
} else if (Feature == "+tsxldtrk") {		} else if (Feature == "+tsxldtrk") {
HasTSXLDTRK = true;		HasTSXLDTRK = true;
[...]	void X86TargetInfo::getTargetDefines(const LangOptions &Opts,
if (HasAMXTILE)		if (HasAMXTILE)
Builder.defineMacro("__AMXTILE__");		Builder.defineMacro("__AMXTILE__");
if (HasAMXINT8)		if (HasAMXINT8)
Builder.defineMacro("__AMXINT8__");		Builder.defineMacro("__AMXINT8__");
if (HasAMXBF16)		if (HasAMXBF16)
Builder.defineMacro("__AMXBF16__");		Builder.defineMacro("__AMXBF16__");
if (HasAMXFP16)		if (HasAMXFP16)
Builder.defineMacro("__AMXFP16__");		Builder.defineMacro("__AMXFP16__");
if (HasCMPCCXADD)		if (HasCMPCCXADD)
		LuoYuanke: Do we need it here?
		pengfei: We don't need it.
Builder.defineMacro("__CMPCCXADD__");		Builder.defineMacro("__CMPCCXADD__");
if (HasRAOINT)		if (HasRAOINT)
Builder.defineMacro("__RAOINT__");		Builder.defineMacro("__RAOINT__");
if (HasAVXIFMA)		if (HasAVXIFMA)
Builder.defineMacro("__AVXIFMA__");		Builder.defineMacro("__AVXIFMA__");
		if (HasAVXNECONVERT)
		Builder.defineMacro("__AVXNECONVERT__");
if (HasAVXVNNI)		if (HasAVXVNNI)
Builder.defineMacro("__AVXVNNI__");		Builder.defineMacro("__AVXVNNI__");
if (HasAVXVNNIINT8)		if (HasAVXVNNIINT8)
Builder.defineMacro("__AVXVNNIINT8__");		Builder.defineMacro("__AVXVNNIINT8__");
if (HasSERIALIZE)		if (HasSERIALIZE)
Builder.defineMacro("__SERIALIZE__");		Builder.defineMacro("__SERIALIZE__");
if (HasTSXLDTRK)		if (HasTSXLDTRK)
Builder.defineMacro("__TSXLDTRK__");		Builder.defineMacro("__TSXLDTRK__");
[...]	return llvm::StringSwitch<bool>(Name)
.Case("avx512bitalg", true)		.Case("avx512bitalg", true)
.Case("avx512bw", true)		.Case("avx512bw", true)
.Case("avx512vl", true)		.Case("avx512vl", true)
.Case("avx512vbmi", true)		.Case("avx512vbmi", true)
.Case("avx512vbmi2", true)		.Case("avx512vbmi2", true)
.Case("avx512ifma", true)		.Case("avx512ifma", true)
.Case("avx512vp2intersect", true)		.Case("avx512vp2intersect", true)
.Case("avxifma", true)		.Case("avxifma", true)
		.Case("avxneconvert", true)
.Case("avxvnni", true)		.Case("avxvnni", true)
.Case("avxvnniint8", true)		.Case("avxvnniint8", true)
.Case("bmi", true)		.Case("bmi", true)
.Case("bmi2", true)		.Case("bmi2", true)
.Case("cldemote", true)		.Case("cldemote", true)
.Case("clflushopt", true)		.Case("clflushopt", true)
.Case("clwb", true)		.Case("clwb", true)
.Case("clzero", true)		.Case("clzero", true)
[...]	return llvm::StringSwitch<bool>(Feature)
.Case("avx512bitalg", HasAVX512BITALG)		.Case("avx512bitalg", HasAVX512BITALG)
.Case("avx512bw", HasAVX512BW)		.Case("avx512bw", HasAVX512BW)
.Case("avx512vl", HasAVX512VL)		.Case("avx512vl", HasAVX512VL)
.Case("avx512vbmi", HasAVX512VBMI)		.Case("avx512vbmi", HasAVX512VBMI)
.Case("avx512vbmi2", HasAVX512VBMI2)		.Case("avx512vbmi2", HasAVX512VBMI2)
.Case("avx512ifma", HasAVX512IFMA)		.Case("avx512ifma", HasAVX512IFMA)
.Case("avx512vp2intersect", HasAVX512VP2INTERSECT)		.Case("avx512vp2intersect", HasAVX512VP2INTERSECT)
.Case("avxifma", HasAVXIFMA)		.Case("avxifma", HasAVXIFMA)
.Case("avxvnni", HasAVXVNNI)		.Case("avxneconvert", HasAVXNECONVERT)
.Case("avxvnni", HasAVXVNNI)		.Case("avxvnni", HasAVXVNNI)
.Case("avxvnniint8", HasAVXVNNIINT8)		.Case("avxvnniint8", HasAVXVNNIINT8)
.Case("bmi", HasBMI)		.Case("bmi", HasBMI)
		pengfei: Move it ahead.
.Case("bmi2", HasBMI2)		.Case("bmi2", HasBMI2)
.Case("cldemote", HasCLDEMOTE)		.Case("cldemote", HasCLDEMOTE)
.Case("clflushopt", HasCLFLUSHOPT)		.Case("clflushopt", HasCLFLUSHOPT)
.Case("clwb", HasCLWB)		.Case("clwb", HasCLWB)
.Case("clzero", HasCLZERO)		.Case("clzero", HasCLZERO)
.Case("cmpccxadd", HasCMPCCXADD)		.Case("cmpccxadd", HasCMPCCXADD)
.Case("crc32", HasCRC32)		.Case("crc32", HasCRC32)
.Case("cx8", HasCX8)		.Case("cx8", HasCX8)
[...]

clang/lib/Headers/CMakeLists.txt

[...]	# Intrinsics
avx512vlvnniintrin.h		avx512vlvnniintrin.h
avx512vlvp2intersectintrin.h		avx512vlvp2intersectintrin.h
avx512vnniintrin.h		avx512vnniintrin.h
avx512vp2intersectintrin.h		avx512vp2intersectintrin.h
avx512vpopcntdqintrin.h		avx512vpopcntdqintrin.h
avx512vpopcntdqvlintrin.h		avx512vpopcntdqvlintrin.h
avxifmaintrin.h		avxifmaintrin.h
avxintrin.h		avxintrin.h
		avxneconvertintrin.h
avxvnniint8intrin.h		avxvnniint8intrin.h
avxvnniintrin.h		avxvnniintrin.h
bmi2intrin.h		bmi2intrin.h
bmiintrin.h		bmiintrin.h
cetintrin.h		cetintrin.h
cldemoteintrin.h		cldemoteintrin.h
clflushoptintrin.h		clflushoptintrin.h
clwbintrin.h		clwbintrin.h
[...]

clang/lib/Headers/avx512vlbf16intrin.h

	[...]
	/// \headerfile <x86intrin.h>			/// \headerfile <x86intrin.h>
	///			///
	/// This intrinsic corresponds to the <c> VCVTNEPS2BF16 </c> instructions.			/// This intrinsic corresponds to the <c> VCVTNEPS2BF16 </c> instructions.
	///			///
	/// \param __A			/// \param __A
	/// A 128-bit vector of [4 x float].			/// A 128-bit vector of [4 x float].
	/// \returns A 128-bit vector of [8 x bfloat] whose lower 64 bits come from			/// \returns A 128-bit vector of [8 x bfloat] whose lower 64 bits come from
	/// conversion of __A, and higher 64 bits are 0.			/// conversion of __A, and higher 64 bits are 0.
	static __inline__ __m128bh __DEFAULT_FN_ATTRS128			#define _mm_cvtneps_pbh(A) \
	_mm_cvtneps_pbh(__m128 __A) {			((__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)(A)))
				RKSimon: Is there no way for __attribute__ to allow different attribute permutations? Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming convention?
				pengfei: > Is there no way for attribute to allow different attribute permutations? We have discussed this problem with GCC folks. There are two problems here: (1) Unlike builtins, function attributes are more generic. Supporting permutations may introduce a lot of checks between callers and callees. I looked into limiting it to `__always_inline__` functions only. However, Clang handles inlining in the middle-end, so we don't have such information in the front-end. Besides, we don't know how to merge different permutations if they are inlined into the same function. (2) We don't know how to put the permutations into IR's function attributes. We need to preserve all permutations for inlining reference, but the backend needs a definite feature list rather than a selective one.
				pengfei: It's better to use `__builtin_ia32_cvtneps2bf16_128`.
				FreddyYe: I think __builtin_ia32_vcvtneps2bf16128 is also a "right" name. See builtin_ia32_vfmaddsubph256, builtin_ia32_minph256... And I admit naming conventions of clang builtins as well as LLVM IR builtins are confusing right now.
				pengfei: The problem here is that `16128` is a bit confusing; a `_` breaks it into 2 numbers. But I don't insist on it :)
				FreddyYe: I tried, but found that __builtin_ia32_cvtneps2bf16_256 already exists for avx512bf16, and it's currently used for mask intrinsic lowering. What about not changing it this time? We can do a refinement patch later for the avx512bf16 builtins, since they also have some redundant FE/codegen logic for the 256/512 mask intrinsics.
				pengfei: No problem.
	return (__m128bh)__builtin_ia32_cvtneps2bf16_128_mask((__v4sf) __A,
	(__v8bf)_mm_undefined_si128(),
	(__mmask8)-1);
	}

	/// Convert Packed Single Data to Packed BF16 Data.			/// Convert Packed Single Data to Packed BF16 Data.
	///			///
	/// \headerfile <x86intrin.h>			/// \headerfile <x86intrin.h>
	///			///
	/// This intrinsic corresponds to the <c> VCVTNEPS2BF16 </c> instructions.			/// This intrinsic corresponds to the <c> VCVTNEPS2BF16 </c> instructions.
	///			///
	/// \param __A			/// \param __A
	[...]
	///			///
	/// \headerfile <x86intrin.h>			/// \headerfile <x86intrin.h>
	///			///
	/// This intrinsic corresponds to the <c> VCVTNEPS2BF16 </c> instructions.			/// This intrinsic corresponds to the <c> VCVTNEPS2BF16 </c> instructions.
	///			///
	/// \param __A			/// \param __A
	/// A 256-bit vector of [8 x float].			/// A 256-bit vector of [8 x float].
	/// \returns A 128-bit vector of [8 x bfloat] comes from conversion of __A.			/// \returns A 128-bit vector of [8 x bfloat] comes from conversion of __A.
	static __inline__ __m128bh __DEFAULT_FN_ATTRS256			#define _mm256_cvtneps_pbh(A) \
	_mm256_cvtneps_pbh(__m256 __A) {			((__m128bh)__builtin_ia32_vcvtneps2bf16256((__v8sf)(A)))
	return (__m128bh)__builtin_ia32_cvtneps2bf16_256_mask((__v8sf)__A,
	(__v8bf)_mm_undefined_si128(),
	(__mmask8)-1);
	}

	/// Convert Packed Single Data to Packed BF16 Data.			/// Convert Packed Single Data to Packed BF16 Data.
	///			///
	/// \headerfile <x86intrin.h>			/// \headerfile <x86intrin.h>
	///			///
	/// This intrinsic corresponds to the <c> VCVTNEPS2BF16 </c> instructions.			/// This intrinsic corresponds to the <c> VCVTNEPS2BF16 </c> instructions.
	///			///
	/// \param __A			/// \param __A
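For readers without the ISE document at hand, the FP32 to BF16 narrowing that `VCVTNEPS2BF16` performs can be approximated in portable C. This is only a sketch of round-to-nearest-even on the upper 16 bits; the helper names `f32_to_bf16_rne` and `cvt_f32_to_bf16_ref` are hypothetical and not part of this patch, and the hardware's exact treatment of denormals is not modeled here and should be checked against the ISE document.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): narrow one float to BF16
 * using round-to-nearest-even on the discarded low 16 bits. */
static uint16_t f32_to_bf16_rne(float f) {
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits); /* reinterpret without aliasing issues */

  if ((bits & 0x7fffffffu) > 0x7f800000u) {
    /* NaN: keep the upper bits and force the quiet bit so the result
     * stays a NaN after truncation. */
    return (uint16_t)((bits >> 16) | 0x0040u);
  }

  /* Adding 0x7fff plus the lowest kept bit rounds halfway cases to even. */
  uint32_t lsb = (bits >> 16) & 1u;
  bits += 0x7fffu + lsb;
  return (uint16_t)(bits >> 16);
}

/* Hypothetical helper: element-wise conversion of n floats. */
static void cvt_f32_to_bf16_ref(const float *src, uint16_t *dst, size_t n) {
  for (size_t j = 0; j < n; ++j)
    dst[j] = f32_to_bf16_rne(src[j]);
}
```

With this sketch, `1.0f` narrows to `0x3f80`, and the halfway value `1.00390625f` (bit pattern `0x3f808000`) also rounds to `0x3f80` because the kept mantissa is already even.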

clang/lib/Headers/avxneconvertintrin.h

This file was added.

				/*===-------------- avxneconvertintrin.h - AVXNECONVERT --------------------===
				*
				* Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				* See https://llvm.org/LICENSE.txt for license information.
				* SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				*
				*===-----------------------------------------------------------------------===
				*/

				#ifndef __IMMINTRIN_H
				#error \
				"Never use <avxneconvertintrin.h> directly; include <immintrin.h> instead."
				#endif // __IMMINTRIN_H

				#ifdef __SSE2__

				#ifndef __AVXNECONVERTINTRIN_H
				#define __AVXNECONVERTINTRIN_H

				/* Define the default attributes for the functions in this file. */
				#define __DEFAULT_FN_ATTRS128 \
				__attribute__((__always_inline__, __nodebug__, __target__("avxneconvert"), \
				__min_vector_width__(128)))
				#define __DEFAULT_FN_ATTRS256 \
				__attribute__((__always_inline__, __nodebug__, __target__("avxneconvert"), \
				__min_vector_width__(256)))

				/// Convert scalar BF16 (16-bit) floating-point element
				/// stored at memory locations starting at location \a __A to a
				/// single-precision (32-bit) floating-point, broadcast it to packed
				/// single-precision (32-bit) floating-point elements, and store the results in
				/// \a dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm_bcstnebf16_ps(const void *__A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VBCSTNEBF162PS instruction.
				///
				/// \param __A
				/// A pointer to a 16-bit memory location. The address of the memory
				/// location does not have to be aligned.
				/// \returns
				/// A 128-bit vector of [4 x float].
				///
				/// \code{.operation}
				pengfei: I think the bf16 vector types may have the same problem as FP16. We need to move them out of avx512vlbf16intrin.h. Another issue is that we want to switch them to `__bf16` vectors. Hope D132329 can land first.
				/// b := Convert_BF16_To_FP32(MEM[__A+15:__A])
				/// FOR j := 0 to 3
				/// m := j*32
				/// dst[m+31:m] := b
				/// ENDFOR
				/// dst[MAX:128] := 0
				/// \endcode
				static __inline__ __m128 __DEFAULT_FN_ATTRS128
				_mm_bcstnebf16_ps(const void *__A) {
				return (__m128)__builtin_ia32_vbcstnebf162ps128((const __bf16 *)__A);
				}

				/// Convert scalar BF16 (16-bit) floating-point element
				/// stored at memory locations starting at location \a __A to a
				/// single-precision (32-bit) floating-point, broadcast it to packed
				/// single-precision (32-bit) floating-point elements, and store the results in
				/// \a dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm256_bcstnebf16_ps(const void *__A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VBCSTNEBF162PS instruction.
				///
				/// \param __A
				/// A pointer to a 16-bit memory location. The address of the memory
				/// location does not have to be aligned.
				/// \returns
				/// A 256-bit vector of [8 x float].
				///
				/// \code{.operation}
				/// b := Convert_BF16_To_FP32(MEM[__A+15:__A])
				/// FOR j := 0 to 7
				/// m := j*32
				/// dst[m+31:m] := b
				/// ENDFOR
				/// dst[MAX:256] := 0
				/// \endcode
				static __inline__ __m256 __DEFAULT_FN_ATTRS256
				_mm256_bcstnebf16_ps(const void *__A) {
				return (__m256)__builtin_ia32_vbcstnebf162ps256((const __bf16 *)__A);
				}

				/// Convert scalar half-precision (16-bit) floating-point element
				/// stored at memory locations starting at location \a __A to a
				pengfei: Add unified intrinsics like AVXVNNI.
				/// single-precision (32-bit) floating-point, broadcast it to packed
				/// single-precision (32-bit) floating-point elements, and store the results in
				/// \a dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm_bcstnesh_ps(const void *__A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VBCSTNESH2PS instruction.
				///
				pengfei: VBCSTNESH2PS
				/// \param __A
				/// A pointer to a 16-bit memory location. The address of the memory
				/// location does not have to be aligned.
				/// \returns
				/// A 128-bit vector of [4 x float].
				///
				/// \code{.operation}
				/// b := Convert_FP16_To_FP32(MEM[__A+15:__A])
				/// FOR j := 0 to 3
				/// m := j*32
				/// dst[m+31:m] := b
				/// ENDFOR
				/// dst[MAX:128] := 0
				/// \endcode
				static __inline__ __m128 __DEFAULT_FN_ATTRS128
				_mm_bcstnesh_ps(const void *__A) {
				return (__m128)__builtin_ia32_vbcstnesh2ps128((const _Float16 *)__A);
				}

				/// Convert scalar half-precision (16-bit) floating-point element
				/// stored at memory locations starting at location \a __A to a
				/// single-precision (32-bit) floating-point, broadcast it to packed
				/// single-precision (32-bit) floating-point elements, and store the results in
				/// \a dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm256_bcstnesh_ps(const void *__A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VBCSTNESH2PS instruction.
				///
				pengfei: VBCSTNESH2PS
				/// \param __A
				/// A pointer to a 16-bit memory location. The address of the memory
				/// location does not have to be aligned.
				/// \returns
				/// A 256-bit vector of [8 x float].
				///
				/// \code{.operation}
				/// b := Convert_FP16_To_FP32(MEM[__A+15:__A])
				/// FOR j := 0 to 7
				/// m := j*32
				/// dst[m+31:m] := b
				/// ENDFOR
				/// dst[MAX:256] := 0
				/// \endcode
				static __inline__ __m256 __DEFAULT_FN_ATTRS256
				_mm256_bcstnesh_ps(const void *__A) {
				return (__m256)__builtin_ia32_vbcstnesh2ps256((const _Float16 *)__A);
				}

				/// Convert packed BF16 (16-bit) floating-point even-indexed elements
				/// stored at memory locations starting at location \a __A to packed
				/// single-precision (32-bit) floating-point elements, and store the results in
				/// \a dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm_cvtneebf16_ps(const __m128bh *__A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VCVTNEEBF162PS instruction.
				///
				/// \param __A
				/// A pointer to a 128-bit memory location containing 8 consecutive
				/// BF16 (16-bit) floating-point values.
				/// \returns
				/// A 128-bit vector of [4 x float].
				///
				/// \code{.operation}
				/// FOR j := 0 to 3
				/// k := j*2
				/// i := k*16
				/// m := j*32
				/// dst[m+31:m] := Convert_BF16_To_FP32(MEM[__A+i+15:__A+i])
				/// ENDFOR
				/// dst[MAX:128] := 0
				/// \endcode
				static __inline__ __m128 __DEFAULT_FN_ATTRS128
				_mm_cvtneebf16_ps(const __m128bh *__A) {
				return (__m128)__builtin_ia32_vcvtneebf162ps128((const __v8bf *)__A);
				}

				/// Convert packed BF16 (16-bit) floating-point even-indexed elements
				/// stored at memory locations starting at location \a __A to packed
				/// single-precision (32-bit) floating-point elements, and store the results in
				/// \a dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm256_cvtneebf16_ps(const __m256bh *__A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VCVTNEEBF162PS instruction.
				///
				/// \param __A
				/// A pointer to a 256-bit memory location containing 16 consecutive
				/// BF16 (16-bit) floating-point values.
				pengfei: 16
				/// \returns
				/// A 256-bit vector of [8 x float].
				///
				/// \code{.operation}
				/// FOR j := 0 to 7
				/// k := j*2
				/// i := k*16
				/// m := j*32
				/// dst[m+31:m] := Convert_BF16_To_FP32(MEM[__A+i+15:__A+i])
				/// ENDFOR
				/// dst[MAX:256] := 0
				/// \endcode
				static __inline__ __m256 __DEFAULT_FN_ATTRS256
				_mm256_cvtneebf16_ps(const __m256bh *__A) {
				return (__m256)__builtin_ia32_vcvtneebf162ps256((const __v16bf *)__A);
				}

				/// Convert packed half-precision (16-bit) floating-point even-indexed elements
				/// stored at memory locations starting at location \a __A to packed
				/// single-precision (32-bit) floating-point elements, and store the results in
				/// \a dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm_cvtneeph_ps(const __m128h *__A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VCVTNEEPH2PS instruction.
				///
				/// \param __A
				/// A pointer to a 128-bit memory location containing 8 consecutive
				/// half-precision (16-bit) floating-point values.
				/// \returns
				/// A 128-bit vector of [4 x float].
				///
				/// \code{.operation}
				/// FOR j := 0 to 3
				/// k := j*2
				/// i := k*16
				/// m := j*32
				/// dst[m+31:m] := Convert_FP16_To_FP32(MEM[__A+i+15:__A+i])
				/// ENDFOR
				/// dst[MAX:128] := 0
				/// \endcode
				static __inline__ __m128 __DEFAULT_FN_ATTRS128
				_mm_cvtneeph_ps(const __m128h *__A) {
				return (__m128)__builtin_ia32_vcvtneeph2ps128((const __v8hf *)__A);
				}

				/// Convert packed half-precision (16-bit) floating-point even-indexed elements
				/// stored at memory locations starting at location \a __A to packed
				/// single-precision (32-bit) floating-point elements, and store the results in
				/// \a dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm256_cvtneeph_ps(const __m256h *__A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VCVTNEEPH2PS instruction.
				///
				/// \param __A
				/// A pointer to a 256-bit memory location containing 16 consecutive
				/// half-precision (16-bit) floating-point values.
				pengfei: 16
				/// \returns
				/// A 256-bit vector of [8 x float].
				///
				/// \code{.operation}
				/// FOR j := 0 to 7
				/// k := j*2
				/// i := k*16
				/// m := j*32
				/// dst[m+31:m] := Convert_FP16_To_FP32(MEM[__A+i+15:__A+i])
				/// ENDFOR
				/// dst[MAX:256] := 0
				/// \endcode
				static __inline__ __m256 __DEFAULT_FN_ATTRS256
				_mm256_cvtneeph_ps(const __m256h *__A) {
				return (__m256)__builtin_ia32_vcvtneeph2ps256((const __v16hf *)__A);
				}

				/// Convert packed BF16 (16-bit) floating-point odd-indexed elements
				/// stored at memory locations starting at location \a __A to packed
				/// single-precision (32-bit) floating-point elements, and store the results in
				/// \a dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm_cvtneobf16_ps(const __m128bh *__A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VCVTNEOBF162PS instruction.
				///
				/// \param __A
				/// A pointer to a 128-bit memory location containing 8 consecutive
				/// BF16 (16-bit) floating-point values.
				/// \returns
				/// A 128-bit vector of [4 x float].
				///
				/// \code{.operation}
				/// FOR j := 0 to 3
				/// k := j*2+1
				/// i := k*16
				/// m := j*32
				/// dst[m+31:m] := Convert_BF16_To_FP32(MEM[__A+i+15:__A+i])
				/// ENDFOR
				/// dst[MAX:128] := 0
				/// \endcode
				static __inline__ __m128 __DEFAULT_FN_ATTRS128
				_mm_cvtneobf16_ps(const __m128bh *__A) {
				return (__m128)__builtin_ia32_vcvtneobf162ps128((const __v8bf *)__A);
				}

				/// Convert packed BF16 (16-bit) floating-point odd-indexed elements
				/// stored at memory locations starting at location \a __A to packed
				/// single-precision (32-bit) floating-point elements, and store the results in
				/// \a dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm256_cvtneobf16_ps(const __m256bh *__A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VCVTNEOBF162PS instruction.
				///
				/// \param __A
				/// A pointer to a 256-bit memory location containing 16 consecutive
				/// BF16 (16-bit) floating-point values.
				pengfei: 16
				/// \returns
				/// A 256-bit vector of [8 x float].
				///
				/// \code{.operation}
				/// FOR j := 0 to 7
				/// k := j*2+1
				/// i := k*16
				/// m := j*32
				/// dst[m+31:m] := Convert_BF16_To_FP32(MEM[__A+i+15:__A+i])
				/// ENDFOR
				/// dst[MAX:256] := 0
				/// \endcode
				static __inline__ __m256 __DEFAULT_FN_ATTRS256
				_mm256_cvtneobf16_ps(const __m256bh *__A) {
				return (__m256)__builtin_ia32_vcvtneobf162ps256((const __v16bf *)__A);
				}

				/// Convert packed half-precision (16-bit) floating-point odd-indexed elements
				/// stored at memory locations starting at location \a __A to packed
				/// single-precision (32-bit) floating-point elements, and store the results in
				/// \a dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm_cvtneoph_ps(const __m128h *__A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VCVTNEOPH2PS instruction.
				///
				/// \param __A
				/// A pointer to a 128-bit memory location containing 8 consecutive
				/// half-precision (16-bit) floating-point values.
				/// \returns
				/// A 128-bit vector of [4 x float].
				///
				/// \code{.operation}
				/// FOR j := 0 to 3
				/// k := j*2+1
				/// i := k*16
				/// m := j*32
				/// dst[m+31:m] := Convert_FP16_To_FP32(MEM[__A+i+15:__A+i])
				/// ENDFOR
				/// dst[MAX:128] := 0
				/// \endcode
				static __inline__ __m128 __DEFAULT_FN_ATTRS128
				_mm_cvtneoph_ps(const __m128h *__A) {
				return (__m128)__builtin_ia32_vcvtneoph2ps128((const __v8hf *)__A);
				}

				/// Convert packed half-precision (16-bit) floating-point odd-indexed elements
				/// stored at memory locations starting at location \a __A to packed
				/// single-precision (32-bit) floating-point elements, and store the results in
				/// \a dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm256_cvtneoph_ps(const __m256h *__A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VCVTNEOPH2PS instruction.
				///
				/// \param __A
				/// A pointer to a 256-bit memory location containing 16 consecutive
				/// half-precision (16-bit) floating-point values.
				pengfei: 16
				/// \returns
				/// A 256-bit vector of [8 x float].
				///
				/// \code{.operation}
				/// FOR j := 0 to 7
				/// k := j*2+1
				/// i := k*16
				/// m := j*32
				/// dst[m+31:m] := Convert_FP16_To_FP32(MEM[__A+i+15:__A+i])
				/// ENDFOR
				/// dst[MAX:256] := 0
				/// \endcode
				static __inline__ __m256 __DEFAULT_FN_ATTRS256
				_mm256_cvtneoph_ps(const __m256h *__A) {
				return (__m256)__builtin_ia32_vcvtneoph2ps256((const __v16hf *)__A);
				}

				/// Convert packed single-precision (32-bit) floating-point elements in \a __A
				/// to packed BF16 (16-bit) floating-point elements, and store the results in \a
				/// dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm_cvtneps_avx_pbh(__m128 __A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VCVTNEPS2BF16 instruction.
				///
				/// \param __A
				/// A 128-bit vector of [4 x float].
				/// \returns
				/// A 128-bit vector of [8 x bfloat].
				///
				/// \code{.operation}
				/// FOR j := 0 to 3
				/// dst.word[j] := Convert_FP32_To_BF16(__A.fp32[j])
				/// ENDFOR
				/// dst[MAX:128] := 0
				/// \endcode
				static __inline__ __m128bh __DEFAULT_FN_ATTRS128
				_mm_cvtneps_avx_pbh(__m128 __A) {
				return (__m128bh)__builtin_ia32_vcvtneps2bf16128((__v4sf)__A);
				}

				/// Convert packed single-precision (32-bit) floating-point elements in \a __A
				/// to packed BF16 (16-bit) floating-point elements, and store the results in \a
				/// dst.
				///
				/// \headerfile <x86intrin.h>
				///
				/// \code
				/// _mm256_cvtneps_avx_pbh(__m256 __A);
				/// \endcode
				///
				/// This intrinsic corresponds to the \c VCVTNEPS2BF16 instruction.
				///
				/// \param __A
				/// A 256-bit vector of [8 x float].
				/// \returns
				/// A 128-bit vector of [8 x bfloat].
				///
				/// \code{.operation}
				/// FOR j := 0 to 7
				/// dst.word[j] := Convert_FP32_To_BF16(__A.fp32[j])
				/// ENDFOR
				/// dst[MAX:128] := 0
				/// \endcode
				static __inline__ __m128bh __DEFAULT_FN_ATTRS256
				_mm256_cvtneps_avx_pbh(__m256 __A) {
				return (__m128bh)__builtin_ia32_vcvtneps2bf16256((__v8sf)__A);
				}

				#undef __DEFAULT_FN_ATTRS128
				#undef __DEFAULT_FN_ATTRS256

				#endif // __AVXNECONVERTINTRIN_H
				#endif // __SSE2__
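The `\code{.operation}` pseudo-code in the header above can be modeled in plain C, which may help when checking results on machines without AVX-NE-CONVERT. The sketch below assumes only that BF16 is the upper 16 bits of an IEEE-754 binary32 value; the helper names `bf16_to_f32`, `cvtne_bf16_ps_ref` and `bcstne_bf16_ps_ref` are hypothetical and are not part of the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen one BF16 value to float. BF16 holds the
 * upper 16 bits of a binary32, so widening is a 16-bit left shift. */
static float bf16_to_f32(uint16_t h) {
  uint32_t bits = (uint32_t)h << 16;
  float f;
  memcpy(&f, &bits, sizeof f);
  return f;
}

/* Hypothetical model of the VCVTNEEBF162PS / VCVTNEOBF162PS pseudo-code.
 * n is the number of output floats (4 for the 128-bit form, 8 for the
 * 256-bit form); odd == 0 selects even-indexed source elements (k = j*2),
 * odd == 1 selects odd-indexed ones (k = j*2+1). src must hold 2*n values. */
static void cvtne_bf16_ps_ref(const uint16_t *src, float *dst, int n, int odd) {
  for (int j = 0; j < n; ++j)
    dst[j] = bf16_to_f32(src[2 * j + odd]);
}

/* Hypothetical model of the VBCSTNEBF162PS pseudo-code: convert the single
 * BF16 value at src and replicate it into all n output lanes. */
static void bcstne_bf16_ps_ref(const uint16_t *src, float *dst, int n) {
  float b = bf16_to_f32(src[0]);
  for (int j = 0; j < n; ++j)
    dst[j] = b;
}
```

For a source block holding the BF16 encodings of 1.0 through 8.0, the even variant with `n = 4` yields 1, 3, 5, 7 and the odd variant yields 2, 4, 6, 8.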

clang/lib/Headers/cpuid.h

	#define bit_AVX512BF16 0x00000020			#define bit_AVX512BF16 0x00000020
	#define bit_CMPCCXADD 0x00000080			#define bit_CMPCCXADD 0x00000080
	#define bit_AMXFP16 0x00200000			#define bit_AMXFP16 0x00200000
	#define bit_HRESET 0x00400000			#define bit_HRESET 0x00400000
	#define bit_AVXIFMA 0x00800000			#define bit_AVXIFMA 0x00800000

	/* Features in %edx for leaf 7 sub-leaf 1 */			/* Features in %edx for leaf 7 sub-leaf 1 */
	#define bit_AVXVNNIINT8 0x00000010			#define bit_AVXVNNIINT8 0x00000010
				#define bit_AVXNECONVERT 0x00000020
	#define bit_PREFETCHI 0x00004000			#define bit_PREFETCHI 0x00004000

	/* Features in %eax for leaf 13 sub-leaf 1 */			/* Features in %eax for leaf 13 sub-leaf 1 */
	#define bit_XSAVEOPT 0x00000001			#define bit_XSAVEOPT 0x00000001
	#define bit_XSAVEC 0x00000002			#define bit_XSAVEC 0x00000002
	#define bit_XSAVES 0x00000008			#define bit_XSAVES 0x00000008

	/* Features in %eax for leaf 0x14 sub-leaf 0 */			/* Features in %eax for leaf 0x14 sub-leaf 0 */
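The new `bit_AVXNECONVERT` mask is meant to be tested against `%edx` of CPUID leaf 7, sub-leaf 1. A minimal sketch of such a check follows, assuming a GCC- or Clang-style `<cpuid.h>` that provides `__get_cpuid_count`; the helper names `edx_has_avxneconvert` and `cpu_reports_avxneconvert` are hypothetical. It only inspects the CPUID bit and does not verify that the OS has enabled AVX state, which real dispatch code would also need to check.

```c
#include <stdbool.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Headers that predate this patch do not define the mask; the fallback
 * value is the one the patch adds to cpuid.h. */
#ifndef bit_AVXNECONVERT
#define bit_AVXNECONVERT 0x00000020
#endif

/* Hypothetical helper: test the feature bit in an EDX value that was
 * read from CPUID leaf 7, sub-leaf 1. */
static bool edx_has_avxneconvert(unsigned int edx) {
  return (edx & bit_AVXNECONVERT) != 0;
}

/* Hypothetical helper: query the running CPU. Returns false on non-x86
 * targets or when leaf 7 is not available. */
static bool cpu_reports_avxneconvert(void) {
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  if (!__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx))
    return false;
  return edx_has_avxneconvert(edx);
#else
  return false;
#endif
}
```

Keeping the bit test separate from the CPUID query makes the mask logic checkable on any host, including ones that do not report the feature.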

clang/lib/Headers/immintrin.h

	#include <vaesintrin.h>			#include <vaesintrin.h>
	#endif			#endif

	#if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) || \			#if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) || \
	defined(__GFNI__)			defined(__GFNI__)
	#include <gfniintrin.h>			#include <gfniintrin.h>
	#endif			#endif

	#if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) || \			#if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) || \
				craig.topper: Is this FIXME still relevant? Don't we support _Float16 with SSE2 now?
				pengfei: _Float16 is supported with SSE2, but maybe we need to move `__m128h` and `__m256h` out of avx512fp16intrin.h.
				FreddyYe (author): Yes. This is a redundant FIXME.
				pengfei: I have moved the FP16/BF16 vector types out of their original header files (rGe0fb01e9). There should be no dependency on the FP16 and BF16 features now.
				RKSimon: Update to this? `#if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) || \ (defined(__AVXNECONVERT__) && defined(__AVX512FP16__))`
	defined(__AVXVNNIINT8__)			defined(__AVXVNNIINT8__)
	#include <avxvnniint8intrin.h>			#include <avxvnniint8intrin.h>
	#endif			#endif

	#if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) || \			#if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) || \
				defined(__AVXNECONVERT__)
				#include <avxneconvertintrin.h>
				#endif

				#if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) || \
	defined(__RDPID__)			defined(__RDPID__)
	/// Returns the value of the IA32_TSC_AUX MSR (0xc0000103).			/// Returns the value of the IA32_TSC_AUX MSR (0xc0000103).
	///			///
	/// \headerfile <immintrin.h>			/// \headerfile <immintrin.h>
	///			///
	/// This intrinsic corresponds to the <c> RDPID </c> instruction.			/// This intrinsic corresponds to the <c> RDPID </c> instruction.
	static __inline__ unsigned int __attribute__((__always_inline__, __nodebug__, __target__("rdpid")))			static __inline__ unsigned int __attribute__((__always_inline__, __nodebug__, __target__("rdpid")))
	_rdpid_u32(void) {			_rdpid_u32(void) {

clang/test/CodeGen/X86/avx512vlbf16-builtins.c

__m512bh test_mm512_mask_cvtne2ps2bf16(__m512bh C, __mmask32 U, __m512 A, __m512 B) {
// CHECK: @llvm.x86.avx512bf16.cvtne2ps2bf16.512		// CHECK: @llvm.x86.avx512bf16.cvtne2ps2bf16.512
// CHECK: select <32 x i1> %{{.*}}, <32 x bfloat> %{{.*}}, <32 x bfloat> %{{.*}}		// CHECK: select <32 x i1> %{{.*}}, <32 x bfloat> %{{.*}}, <32 x bfloat> %{{.*}}
// CHECK: ret <32 x bfloat> %{{.*}}		// CHECK: ret <32 x bfloat> %{{.*}}
return _mm512_mask_cvtne2ps_pbh(C, U, A, B);		return _mm512_mask_cvtne2ps_pbh(C, U, A, B);
}		}

__m128bh test_mm_cvtneps2bf16(__m128 A) {		__m128bh test_mm_cvtneps2bf16(__m128 A) {
// CHECK-LABEL: @test_mm_cvtneps2bf16		// CHECK-LABEL: @test_mm_cvtneps2bf16
// CHECK: @llvm.x86.avx512bf16.mask.cvtneps2bf16.128		// CHECK: @llvm.x86.vcvtneps2bf16128
// CHECK: ret <8 x bfloat> %{{.*}}		// CHECK: ret <8 x bfloat> %{{.*}}
return _mm_cvtneps_pbh(A);		return _mm_cvtneps_pbh(A);
}		}

__m128bh test_mm_mask_cvtneps2bf16(__m128bh C, __mmask8 U, __m128 A) {		__m128bh test_mm_mask_cvtneps2bf16(__m128bh C, __mmask8 U, __m128 A) {
// CHECK-LABEL: @test_mm_mask_cvtneps2bf16		// CHECK-LABEL: @test_mm_mask_cvtneps2bf16
// CHECK: @llvm.x86.avx512bf16.mask.cvtneps2bf16.		// CHECK: @llvm.x86.avx512bf16.mask.cvtneps2bf16.
// CHECK: ret <8 x bfloat> %{{.*}}		// CHECK: ret <8 x bfloat> %{{.*}}
return _mm_mask_cvtneps_pbh(C, U, A);		return _mm_mask_cvtneps_pbh(C, U, A);
}		}

__m128bh test_mm_maskz_cvtneps2bf16(__m128 A, __mmask8 U) {		__m128bh test_mm_maskz_cvtneps2bf16(__m128 A, __mmask8 U) {
// CHECK-LABEL: @test_mm_maskz_cvtneps2bf16		// CHECK-LABEL: @test_mm_maskz_cvtneps2bf16
// CHECK: @llvm.x86.avx512bf16.mask.cvtneps2bf16.128		// CHECK: @llvm.x86.avx512bf16.mask.cvtneps2bf16.128
// CHECK: ret <8 x bfloat> %{{.*}}		// CHECK: ret <8 x bfloat> %{{.*}}
return _mm_maskz_cvtneps_pbh(U, A);		return _mm_maskz_cvtneps_pbh(U, A);
}		}

__m128bh test_mm256_cvtneps2bf16(__m256 A) {		__m128bh test_mm256_cvtneps2bf16(__m256 A) {
// CHECK-LABEL: @test_mm256_cvtneps2bf16		// CHECK-LABEL: @test_mm256_cvtneps2bf16
// CHECK: @llvm.x86.avx512bf16.cvtneps2bf16.256		// CHECK: @llvm.x86.vcvtneps2bf16256
// CHECK: ret <8 x bfloat> %{{.*}}		// CHECK: ret <8 x bfloat> %{{.*}}
return _mm256_cvtneps_pbh(A);		return _mm256_cvtneps_pbh(A);
}		}

__m128bh test_mm256_mask_cvtneps2bf16(__m128bh C, __mmask8 U, __m256 A) {		__m128bh test_mm256_mask_cvtneps2bf16(__m128bh C, __mmask8 U, __m256 A) {
// CHECK-LABEL: @test_mm256_mask_cvtneps2bf16		// CHECK-LABEL: @test_mm256_mask_cvtneps2bf16
// CHECK: @llvm.x86.avx512bf16.cvtneps2bf16.256		// CHECK: @llvm.x86.avx512bf16.cvtneps2bf16.256
// CHECK: select <8 x i1> %{{.*}}, <8 x bfloat> %{{.*}}, <8 x bfloat> %{{.*}}		// CHECK: select <8 x i1> %{{.*}}, <8 x bfloat> %{{.*}}, <8 x bfloat> %{{.*}}

clang/test/CodeGen/X86/avxneconvert-builtins.c

This file was added.

				// RUN: %clang_cc1 %s -ffreestanding -triple=x86_64-unknown-unknown -target-feature +avx2 -target-feature +avxneconvert \
				// RUN: -emit-llvm -o - -Wall -Werror -pedantic -Wno-gnu-statement-expression | FileCheck %s
				// RUN: %clang_cc1 %s -ffreestanding -triple=i386-unknown-unknown -target-feature +avx2 -target-feature +avxneconvert \
				RKSimon: 32-bit test coverage?
				pengfei: This should be removed now.
				// RUN: -emit-llvm -o - -Wall -Werror -pedantic -Wno-gnu-statement-expression | FileCheck %s

				#include <immintrin.h>
				#include <stddef.h>

				__m128 test_mm_bcstnebf16_ps(const void *__A) {
				// CHECK-LABEL: @test_mm_bcstnebf16_ps(
				// CHECK: call <4 x float> @llvm.x86.vbcstnebf162ps128(ptr %{{.*}})
				return _mm_bcstnebf16_ps(__A);
				}

				__m256 test_mm256_bcstnebf16_ps(const void *__A) {
				// CHECK-LABEL: @test_mm256_bcstnebf16_ps(
				// CHECK: call <8 x float> @llvm.x86.vbcstnebf162ps256(ptr %{{.*}})
				return _mm256_bcstnebf16_ps(__A);
				}

				__m128 test_mm_bcstnesh_ps(const void *__A) {
				// CHECK-LABEL: @test_mm_bcstnesh_ps(
				// CHECK: call <4 x float> @llvm.x86.vbcstnesh2ps128(ptr %{{.*}})
				return _mm_bcstnesh_ps(__A);
				}

				__m256 test_mm256_bcstnesh_ps(const void *__A) {
				// CHECK-LABEL: @test_mm256_bcstnesh_ps(
				// CHECK: call <8 x float> @llvm.x86.vbcstnesh2ps256(ptr %{{.*}})
				return _mm256_bcstnesh_ps(__A);
				}

				__m128 test_mm_cvtneebf16_ps(const __m128bh *__A) {
				// CHECK-LABEL: @test_mm_cvtneebf16_ps(
				// CHECK: call <4 x float> @llvm.x86.vcvtneebf162ps128(ptr %{{.*}})
				return _mm_cvtneebf16_ps(__A);
				}

				__m256 test_mm256_cvtneebf16_ps(const __m256bh *__A) {
				// CHECK-LABEL: @test_mm256_cvtneebf16_ps(
				// CHECK: call <8 x float> @llvm.x86.vcvtneebf162ps256(ptr %{{.*}})
				return _mm256_cvtneebf16_ps(__A);
				}

				__m128 test_mm_cvtneeph_ps(const __m128h *__A) {
				// CHECK-LABEL: @test_mm_cvtneeph_ps(
				// CHECK: call <4 x float> @llvm.x86.vcvtneeph2ps128(ptr %{{.*}})
				return _mm_cvtneeph_ps(__A);
				}

				__m256 test_mm256_cvtneeph_ps(const __m256h *__A) {
				// CHECK-LABEL: @test_mm256_cvtneeph_ps(
				// CHECK: call <8 x float> @llvm.x86.vcvtneeph2ps256(ptr %{{.*}})
				return _mm256_cvtneeph_ps(__A);
				}

				__m128 test_mm_cvtneobf16_ps(const __m128bh *__A) {
				// CHECK-LABEL: @test_mm_cvtneobf16_ps(
				// CHECK: call <4 x float> @llvm.x86.vcvtneobf162ps128(ptr %{{.*}})
				return _mm_cvtneobf16_ps(__A);
				}

				__m256 test_mm256_cvtneobf16_ps(const __m256bh *__A) {
				// CHECK-LABEL: @test_mm256_cvtneobf16_ps(
				// CHECK: call <8 x float> @llvm.x86.vcvtneobf162ps256(ptr %{{.*}})
				return _mm256_cvtneobf16_ps(__A);
				}

				__m128 test_mm_cvtneoph_ps(const __m128h *__A) {
				// CHECK-LABEL: @test_mm_cvtneoph_ps(
				// CHECK: call <4 x float> @llvm.x86.vcvtneoph2ps128(ptr %{{.*}})
				return _mm_cvtneoph_ps(__A);
				}

				__m256 test_mm256_cvtneoph_ps(const __m256h *__A) {
				// CHECK-LABEL: @test_mm256_cvtneoph_ps(
				// CHECK: call <8 x float> @llvm.x86.vcvtneoph2ps256(ptr %{{.*}})
				return _mm256_cvtneoph_ps(__A);
				}

				__m128bh test_mm_cvtneps_avx_pbh(__m128 __A) {
				// CHECK-LABEL: @test_mm_cvtneps_avx_pbh(
				// CHECK: call <8 x bfloat> @llvm.x86.vcvtneps2bf16128(<4 x float> %{{.*}})
				return _mm_cvtneps_avx_pbh(__A);
				}

				__m128bh test_mm256_cvtneps_avx_pbh(__m256 __A) {
				// CHECK-LABEL: @test_mm256_cvtneps_avx_pbh(
				// CHECK: call <8 x bfloat> @llvm.x86.vcvtneps2bf16256(<8 x float> %{{.*}})
				return _mm256_cvtneps_avx_pbh(__A);
				}
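As a usage illustration for the intrinsics exercised by these tests, the sketch below wraps `_mm_cvtneebf16_ps` behind the `__AVXNECONVERT__` feature macro and falls back to scalar code otherwise. The wrapper name `widen_even_bf16` is hypothetical, and the sketch assumes the caller passes a 16-byte-aligned source block so that the pointer cast to `const __m128bh *` is well defined.

```c
#include <stdint.h>
#include <string.h>
#if defined(__AVXNECONVERT__)
#include <immintrin.h>
#endif

/* Hypothetical wrapper (not from the patch): widen the four even-indexed
 * BF16 values of an 8-element block to float.
 * Assumption: src is 16-byte aligned. */
static void widen_even_bf16(const uint16_t src[8], float dst[4]) {
#if defined(__AVXNECONVERT__)
  /* Intrinsic added by this patch; it takes a pointer to the packed
   * source rather than a vector value. */
  __m128 v = _mm_cvtneebf16_ps((const __m128bh *)src);
  _mm_storeu_ps(dst, v);
#else
  /* Scalar fallback: BF16 is the upper half of a binary32, so each
   * selected element is shifted into place and reinterpreted. */
  for (int j = 0; j < 4; ++j) {
    uint32_t bits = (uint32_t)src[2 * j] << 16;
    memcpy(&dst[j], &bits, sizeof bits);
  }
#endif
}
```

Compiling with `-mavxneconvert` on a toolchain that supports the feature selects the intrinsic path; any other build takes the scalar path, which keeps the wrapper usable in tests on hosts without the extension.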

clang/test/CodeGen/attr-target-x86.c

	// CHECK: qax{{.*}} #5			// CHECK: qax{{.*}} #5
	// CHECK: qq{{.*}} #6			// CHECK: qq{{.*}} #6
	// CHECK: lake{{.*}} #7			// CHECK: lake{{.*}} #7
	// CHECK: use_before_def{{.*}} #7			// CHECK: use_before_def{{.*}} #7
	// CHECK: walrus{{.*}} #8			// CHECK: walrus{{.*}} #8
	// CHECK: #0 = {{.*}}"target-cpu"="i686" "target-features"="+cx8,+x87" "tune-cpu"="i686"			// CHECK: #0 = {{.*}}"target-cpu"="i686" "target-features"="+cx8,+x87" "tune-cpu"="i686"
	// CHECK: #1 = {{.*}}"target-cpu"="ivybridge" "target-features"="+avx,+crc32,+cx16,+cx8,+f16c,+fsgsbase,+fxsr,+mmx,+pclmul,+popcnt,+rdrnd,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,+xsaveopt"			// CHECK: #1 = {{.*}}"target-cpu"="ivybridge" "target-features"="+avx,+crc32,+cx16,+cx8,+f16c,+fsgsbase,+fxsr,+mmx,+pclmul,+popcnt,+rdrnd,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,+xsaveopt"
	// CHECK-NOT: tune-cpu			// CHECK-NOT: tune-cpu
	// CHECK: #2 = {{.*}}"target-cpu"="i686" "target-features"="+cx8,+x87,-aes,-avx,-avx2,-avx512bf16,-avx512bitalg,-avx512bw,-avx512cd,-avx512dq,-avx512er,-avx512f,-avx512fp16,-avx512ifma,-avx512pf,-avx512vbmi,-avx512vbmi2,-avx512vl,-avx512vnni,-avx512vp2intersect,-avx512vpopcntdq,-avxifma,-avxvnni,-avxvnniint8,-f16c,-fma,-fma4,-gfni,-kl,-pclmul,-sha,-sse2,-sse3,-sse4.1,-sse4.2,-sse4a,-ssse3,-vaes,-vpclmulqdq,-widekl,-xop" "tune-cpu"="i686"			// CHECK: #2 = {{.*}}"target-cpu"="i686" "target-features"="+cx8,+x87,-aes,-avx,-avx2,-avx512bf16,-avx512bitalg,-avx512bw,-avx512cd,-avx512dq,-avx512er,-avx512f,-avx512fp16,-avx512ifma,-avx512pf,-avx512vbmi,-avx512vbmi2,-avx512vl,-avx512vnni,-avx512vp2intersect,-avx512vpopcntdq,-avxifma,-avxneconvert,-avxvnni,-avxvnniint8,-f16c,-fma,-fma4,-gfni,-kl,-pclmul,-sha,-sse2,-sse3,-sse4.1,-sse4.2,-sse4a,-ssse3,-vaes,-vpclmulqdq,-widekl,-xop" "tune-cpu"="i686"
	// CHECK: #3 = {{.*}}"target-cpu"="i686" "target-features"="+crc32,+cx8,+mmx,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87" "tune-cpu"="i686"			// CHECK: #3 = {{.*}}"target-cpu"="i686" "target-features"="+crc32,+cx8,+mmx,+popcnt,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87" "tune-cpu"="i686"
	// CHECK: #4 = {{.*}}"target-cpu"="i686" "target-features"="+cx8,+x87,-avx,-avx2,-avx512bf16,-avx512bitalg,-avx512bw,-avx512cd,-avx512dq,-avx512er,-avx512f,-avx512fp16,-avx512ifma,-avx512pf,-avx512vbmi,-avx512vbmi2,-avx512vl,-avx512vnni,-avx512vp2intersect,-avx512vpopcntdq,-avxifma,-avxvnni,-avxvnniint8,-f16c,-fma,-fma4,-sse4.1,-sse4.2,-vaes,-vpclmulqdq,-xop" "tune-cpu"="i686"			// CHECK: #4 = {{.*}}"target-cpu"="i686" "target-features"="+cx8,+x87,-avx,-avx2,-avx512bf16,-avx512bitalg,-avx512bw,-avx512cd,-avx512dq,-avx512er,-avx512f,-avx512fp16,-avx512ifma,-avx512pf,-avx512vbmi,-avx512vbmi2,-avx512vl,-avx512vnni,-avx512vp2intersect,-avx512vpopcntdq,-avxifma,-avxneconvert,-avxvnni,-avxvnniint8,-f16c,-fma,-fma4,-sse4.1,-sse4.2,-vaes,-vpclmulqdq,-xop" "tune-cpu"="i686"
	// CHECK: #5 = {{.*}}"target-cpu"="ivybridge" "target-features"="+avx,+crc32,+cx16,+cx8,+f16c,+fsgsbase,+fxsr,+mmx,+pclmul,+popcnt,+rdrnd,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,+xsaveopt,-aes,-vaes"			// CHECK: #5 = {{.*}}"target-cpu"="ivybridge" "target-features"="+avx,+crc32,+cx16,+cx8,+f16c,+fsgsbase,+fxsr,+mmx,+pclmul,+popcnt,+rdrnd,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,+xsaveopt,-aes,-vaes"
	// CHECK-NOT: tune-cpu			// CHECK-NOT: tune-cpu
	// CHECK: #6 = {{.*}}"target-cpu"="i686" "target-features"="+cx8,+x87,-3dnow,-3dnowa,-mmx"			// CHECK: #6 = {{.*}}"target-cpu"="i686" "target-features"="+cx8,+x87,-3dnow,-3dnowa,-mmx"
	// CHECK: #7 = {{.*}}"target-cpu"="lakemont" "target-features"="+cx8,+mmx"			// CHECK: #7 = {{.*}}"target-cpu"="lakemont" "target-features"="+cx8,+mmx"
	// CHECK-NOT: tune-cpu			// CHECK-NOT: tune-cpu
	// CHECK: #8 = {{.*}}"target-cpu"="i686" "target-features"="+cx8,+x87" "tune-cpu"="sandybridge"			// CHECK: #8 = {{.*}}"target-cpu"="i686" "target-features"="+cx8,+x87" "tune-cpu"="sandybridge"

	// CHECK: "target-cpu"="x86-64-v2"			// CHECK: "target-cpu"="x86-64-v2"
	// CHECK-SAME: "target-features"="+crc32,+cx16,+cx8,+fxsr,+mmx,+popcnt,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87"			// CHECK-SAME: "target-features"="+crc32,+cx16,+cx8,+fxsr,+mmx,+popcnt,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87"
	// CHECK: "target-cpu"="x86-64-v3"			// CHECK: "target-cpu"="x86-64-v3"
	// CHECK-SAME: "target-features"="+avx,+avx2,+bmi,+bmi2,+crc32,+cx16,+cx8,+f16c,+fma,+fxsr,+lzcnt,+mmx,+movbe,+popcnt,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"			// CHECK-SAME: "target-features"="+avx,+avx2,+bmi,+bmi2,+crc32,+cx16,+cx8,+f16c,+fma,+fxsr,+lzcnt,+mmx,+movbe,+popcnt,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"
	// CHECK: "target-cpu"="x86-64-v4"			// CHECK: "target-cpu"="x86-64-v4"
	// CHECK-SAME: "target-features"="+avx,+avx2,+avx512bw,+avx512cd,+avx512dq,+avx512f,+avx512vl,+bmi,+bmi2,+crc32,+cx16,+cx8,+f16c,+fma,+fxsr,+lzcnt,+mmx,+movbe,+popcnt,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"			// CHECK-SAME: "target-features"="+avx,+avx2,+avx512bw,+avx512cd,+avx512dq,+avx512f,+avx512vl,+bmi,+bmi2,+crc32,+cx16,+cx8,+f16c,+fma,+fxsr,+lzcnt,+mmx,+movbe,+popcnt,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"

clang/test/Driver/x86-target-features.c

	Show First 20 Lines • Show All 331 Lines • ▼ Show 20 Lines
	// AVXIFMA: "-target-feature" "+avxifma"			// AVXIFMA: "-target-feature" "+avxifma"
	// NO-AVXIFMA: "-target-feature" "-avxifma"			// NO-AVXIFMA: "-target-feature" "-avxifma"

	// RUN: %clang --target=i386 -mavxvnniint8 %s -### -o %t.o 2>&1 \| FileCheck -check-prefix=AVX-VNNIINT8 %s			// RUN: %clang --target=i386 -mavxvnniint8 %s -### -o %t.o 2>&1 \| FileCheck -check-prefix=AVX-VNNIINT8 %s
	// RUN: %clang --target=i386 -mno-avxvnniint8 %s -### -o %t.o 2>&1 \| FileCheck -check-prefix=NO-AVX-VNNIINT8 %s			// RUN: %clang --target=i386 -mno-avxvnniint8 %s -### -o %t.o 2>&1 \| FileCheck -check-prefix=NO-AVX-VNNIINT8 %s
	// AVX-VNNIINT8: "-target-feature" "+avxvnniint8"			// AVX-VNNIINT8: "-target-feature" "+avxvnniint8"
	// NO-AVX-VNNIINT8: "-target-feature" "-avxvnniint8"			// NO-AVX-VNNIINT8: "-target-feature" "-avxvnniint8"

				// RUN: %clang --target=i386 -mavxneconvert %s -### -o %t.o 2>&1 \| FileCheck -check-prefix=AVXNECONVERT %s
				// RUN: %clang --target=i386 -mno-avxneconvert %s -### -o %t.o 2>&1 \| FileCheck -check-prefix=NO-AVXNECONVERT %s
				// AVXNECONVERT: "-target-feature" "+avxneconvert"
				// NO-AVXNECONVERT: "-target-feature" "-avxneconvert"

	// RUN: %clang --target=i386 -march=i386 -mcrc32 %s -### 2>&1 \| FileCheck -check-prefix=CRC32 %s			// RUN: %clang --target=i386 -march=i386 -mcrc32 %s -### 2>&1 \| FileCheck -check-prefix=CRC32 %s
	// RUN: %clang --target=i386 -march=i386 -mno-crc32 %s -### 2>&1 \| FileCheck -check-prefix=NO-CRC32 %s			// RUN: %clang --target=i386 -march=i386 -mno-crc32 %s -### 2>&1 \| FileCheck -check-prefix=NO-CRC32 %s
	// CRC32: "-target-feature" "+crc32"			// CRC32: "-target-feature" "+crc32"
	// NO-CRC32: "-target-feature" "-crc32"			// NO-CRC32: "-target-feature" "-crc32"

	// RUN: %clang --target=i386 -march=i386 -mharden-sls=return %s -### -o %t.o 2>&1 \| FileCheck -check-prefixes=SLS-RET,NO-SLS %s			// RUN: %clang --target=i386 -march=i386 -mharden-sls=return %s -### -o %t.o 2>&1 \| FileCheck -check-prefixes=SLS-RET,NO-SLS %s
	// RUN: %clang --target=i386 -march=i386 -mharden-sls=indirect-jmp %s -### -o %t.o 2>&1 \| FileCheck -check-prefixes=SLS-IJMP,NO-SLS %s			// RUN: %clang --target=i386 -march=i386 -mharden-sls=indirect-jmp %s -### -o %t.o 2>&1 \| FileCheck -check-prefixes=SLS-IJMP,NO-SLS %s
	// RUN: %clang --target=i386 -march=i386 -mharden-sls=none -mharden-sls=all %s -### -o %t.o 2>&1 \| FileCheck -check-prefixes=SLS-IJMP,SLS-RET %s			// RUN: %clang --target=i386 -march=i386 -mharden-sls=none -mharden-sls=all %s -### -o %t.o 2>&1 \| FileCheck -check-prefixes=SLS-IJMP,SLS-RET %s
	// RUN: %clang --target=i386 -march=i386 -mharden-sls=all -mharden-sls=none %s -### -o %t.o 2>&1 \| FileCheck -check-prefix=NO-SLS %s			// RUN: %clang --target=i386 -march=i386 -mharden-sls=all -mharden-sls=none %s -### -o %t.o 2>&1 \| FileCheck -check-prefix=NO-SLS %s
	// RUN: %clang --target=i386 -march=i386 -mharden-sls=return,indirect-jmp %s -### -o %t.o 2>&1 \| FileCheck -check-prefix=BAD-SLS %s			// RUN: %clang --target=i386 -march=i386 -mharden-sls=return,indirect-jmp %s -### -o %t.o 2>&1 \| FileCheck -check-prefix=BAD-SLS %s
	// NO-SLS-NOT: "+harden-sls-			// NO-SLS-NOT: "+harden-sls-
	// SLS-RET-DAG: "-target-feature" "+harden-sls-ret"			// SLS-RET-DAG: "-target-feature" "+harden-sls-ret"
	// SLS-IJMP-DAG: "-target-feature" "+harden-sls-ijmp"			// SLS-IJMP-DAG: "-target-feature" "+harden-sls-ijmp"
	// NO-SLS-NOT: "+harden-sls-			// NO-SLS-NOT: "+harden-sls-
	// BAD-SLS: unsupported argument '{{[^']+}}' to option '-mharden-sls='			// BAD-SLS: unsupported argument '{{[^']+}}' to option '-mharden-sls='

clang/test/Preprocessor/x86_target_features.c

	Show First 20 Lines • Show All 584 Lines • ▼ Show 20 Lines
	// AVX512FP16NOAVX512VL-NOT: #define __AVX512FP16__ 1			// AVX512FP16NOAVX512VL-NOT: #define __AVX512FP16__ 1
	// AVX512FP16NOAVX512VL-NOT: #define __AVX512VL__ 1			// AVX512FP16NOAVX512VL-NOT: #define __AVX512VL__ 1

	// RUN: %clang -target i386-unknown-unknown -march=atom -mavx512fp16 -mno-avx512bw -x c -E -dM -o - %s \| FileCheck -match-full-lines --check-prefix=AVX512FP16NOAVX512BW %s			// RUN: %clang -target i386-unknown-unknown -march=atom -mavx512fp16 -mno-avx512bw -x c -E -dM -o - %s \| FileCheck -match-full-lines --check-prefix=AVX512FP16NOAVX512BW %s

	// AVX512FP16NOAVX512BW-NOT: #define __AVX512BW__ 1			// AVX512FP16NOAVX512BW-NOT: #define __AVX512BW__ 1
	// AVX512FP16NOAVX512BW-NOT: #define __AVX512FP16__ 1			// AVX512FP16NOAVX512BW-NOT: #define __AVX512FP16__ 1

	// RUN: %clang -target i386-unknown-unknown -march=atom -mavx512fp16 -mno-avx512dq -x c -E -dM -o - %s \| FileCheck -match-full-lines --check-prefix=AVX512FP16NOAVX512DQ %s			// RUN: %clang -target i386-unknown-unknown -march=atom -mavx512fp16 -mno-avx512dq -x c -E -dM -o - %s \| FileCheck -match-full-lines --check-prefix=AVX512FP16NOAVX512DQ %s

	// AVX512FP16NOAVX512DQ-NOT: #define __AVX512DQ__ 1			// AVX512FP16NOAVX512DQ-NOT: #define __AVX512DQ__ 1
	// AVX512FP16NOAVX512DQ-NOT: #define __AVX512FP16__ 1			// AVX512FP16NOAVX512DQ-NOT: #define __AVX512FP16__ 1

	// RUN: %clang -target x86_64-unknown-linux-gnu -march=atom -mcmpccxadd -x c -E -dM -o - %s \| FileCheck -check-prefix=CMPCCXADD %s			// RUN: %clang -target x86_64-unknown-linux-gnu -march=atom -mcmpccxadd -x c -E -dM -o - %s \| FileCheck -check-prefix=CMPCCXADD %s

				pengfei (inline comment, done): Should we check `__AVX2__` like we did for AVXVNNI?
	// CMPCCXADD: #define __CMPCCXADD__ 1			// CMPCCXADD: #define __CMPCCXADD__ 1

	// RUN: %clang -target x86_64-unknown-linux-gnu -march=atom -mno-cmpccxadd -x c -E -dM -o - %s \| FileCheck -check-prefix=NO-CMPCCXADD %s			// RUN: %clang -target x86_64-unknown-linux-gnu -march=atom -mno-cmpccxadd -x c -E -dM -o - %s \| FileCheck -check-prefix=NO-CMPCCXADD %s

	// NO-CMPCCXADD-NOT: #define __CMPCCXADD__ 1			// NO-CMPCCXADD-NOT: #define __CMPCCXADD__ 1
	// RUN: %clang -target i386-unknown-unknown -march=atom -mavxifma -x c -E -dM -o - %s \| FileCheck -match-full-lines --check-prefix=AVXIFMA %s			// RUN: %clang -target i386-unknown-unknown -march=atom -mavxifma -x c -E -dM -o - %s \| FileCheck -match-full-lines --check-prefix=AVXIFMA %s

	// AVXIFMA: #define __AVX2__ 1			// AVXIFMA: #define __AVX2__ 1
	Show All 21 Lines

	// NOAVXVNNIINT8-NOT: #define __AVXVNNIINT8__ 1			// NOAVXVNNIINT8-NOT: #define __AVXVNNIINT8__ 1

	// RUN: %clang -target i386-unknown-unknown -march=atom -mavxvnniint8 -mno-avx2 -x c -E -dM -o - %s \| FileCheck -match-full-lines --check-prefix=AVXVNNIINT8NOAVX2 %s			// RUN: %clang -target i386-unknown-unknown -march=atom -mavxvnniint8 -mno-avx2 -x c -E -dM -o - %s \| FileCheck -match-full-lines --check-prefix=AVXVNNIINT8NOAVX2 %s

	// AVXVNNIINT8NOAVX2-NOT: #define __AVX2__ 1			// AVXVNNIINT8NOAVX2-NOT: #define __AVX2__ 1
	// AVXVNNIINT8NOAVX2-NOT: #define __AVXVNNIINT8__ 1			// AVXVNNIINT8NOAVX2-NOT: #define __AVXVNNIINT8__ 1

				// RUN: %clang -target i386-unknown-unknown -march=atom -mavxneconvert -x c -E -dM -o - %s \| FileCheck -match-full-lines --check-prefix=AVXNECONVERT %s

				// AVXNECONVERT: #define __AVX2__ 1
				// AVXNECONVERT: #define __AVXNECONVERT__ 1

				// RUN: %clang -target i386-unknown-unknown -march=atom -mno-avxneconvert -x c -E -dM -o - %s \| FileCheck -match-full-lines --check-prefix=NOAVXNECONVERT %s

				// NOAVXNECONVERT-NOT: #define __AVXNECONVERT__ 1

				// RUN: %clang -target i386-unknown-unknown -march=atom -mavxneconvert -mno-avx2 -x c -E -dM -o - %s \| FileCheck -match-full-lines --check-prefix=AVXNECONVERTNOAVX2 %s

				// AVXNECONVERTNOAVX2-NOT: #define __AVX2__ 1
				// AVXNECONVERTNOAVX2-NOT: #define __AVXNECONVERT__ 1

	// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mcrc32 -x c -E -dM -o - %s \| FileCheck -check-prefix=CRC32 %s			// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mcrc32 -x c -E -dM -o - %s \| FileCheck -check-prefix=CRC32 %s

	// CRC32: #define __CRC32__ 1			// CRC32: #define __CRC32__ 1

	// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-crc32 -x c -E -dM -o - %s \| FileCheck -check-prefix=NOCRC32 %s			// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-crc32 -x c -E -dM -o - %s \| FileCheck -check-prefix=NOCRC32 %s

	// NOCRC32-NOT: #define __CRC32__ 1			// NOCRC32-NOT: #define __CRC32__ 1

	// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mrdpru -x c -E -dM -o - %s \| FileCheck -check-prefix=RDPRU %s			// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mrdpru -x c -E -dM -o - %s \| FileCheck -check-prefix=RDPRU %s

	// RDPRU: #define __RDPRU__ 1			// RDPRU: #define __RDPRU__ 1

	// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-rdpru -x c -E -dM -o - %s \| FileCheck -check-prefix=NORDPRU %s			// RUN: %clang -target i386-unknown-linux-gnu -march=i386 -mno-rdpru -x c -E -dM -o - %s \| FileCheck -check-prefix=NORDPRU %s

	// NORDPRU-NOT: #define __RDPRU__ 1			// NORDPRU-NOT: #define __RDPRU__ 1
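
The preprocessor tests above check that `-mavxneconvert` defines both `__AVXNECONVERT__` and `__AVX2__`, and that `-mno-avx2` removes both. A minimal sketch of how user code might rely on that pairing follows; the constant names `kHasAvxNeConvert` and `kHasAvx2` are hypothetical and not part of this patch.

```cpp
// Hypothetical compile-time flags derived from the macros the tests check.
#if defined(__AVXNECONVERT__)
constexpr bool kHasAvxNeConvert = true;
#else
constexpr bool kHasAvxNeConvert = false;
#endif

#if defined(__AVX2__)
constexpr bool kHasAvx2 = true;
#else
constexpr bool kHasAvx2 = false;
#endif

// Mirrors the AVXNECONVERT / AVXNECONVERTNOAVX2 test expectations:
// the feature macro should never be defined without __AVX2__.
static_assert(!kHasAvxNeConvert || kHasAvx2,
              "__AVXNECONVERT__ is expected to imply __AVX2__");
```

When built without `-mavxneconvert`, both constants are typically `false` and the assertion holds trivially.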

llvm/docs/ReleaseNotes.rst

	Show First 20 Lines • Show All 132 Lines • ▼ Show 20 Lines

	* For MinGW, generate embedded ``-exclude-symbols:`` directives for symbols			* For MinGW, generate embedded ``-exclude-symbols:`` directives for symbols
	with hidden visibility, omitting them from automatic export of all symbols.			with hidden visibility, omitting them from automatic export of all symbols.
	This roughly makes hidden visibility work like it does for other object			This roughly makes hidden visibility work like it does for other object
	file formats.			file formats.

	Changes to the X86 Backend			Changes to the X86 Backend
	--------------------------			--------------------------
	* Support ISA of ``AVX-IFMA``.

	* Add support for the ``RDMSRLIST and WRMSRLIST`` instructions.			* Add support for the ``RDMSRLIST and WRMSRLIST`` instructions.
	* Add support for the ``WRMSRNS`` instruction.			* Add support for the ``WRMSRNS`` instruction.
	* Support ISA of ``AMX-FP16`` which contains ``tdpfp16ps`` instruction.			* Support ISA of ``AMX-FP16`` which contains ``tdpfp16ps`` instruction.
	* Support ISA of ``CMPCCXADD``.			* Support ISA of ``CMPCCXADD``.
				* Support ISA of ``AVX-IFMA``.
	* Support ISA of ``AVX-VNNI-INT8``.			* Support ISA of ``AVX-VNNI-INT8``.
				* Support ISA of ``AVX-NE-CONVERT``.

	Changes to the OCaml bindings			Changes to the OCaml bindings
	-----------------------------			-----------------------------


	Changes to the C API			Changes to the C API
	--------------------			--------------------

	▲ Show 20 Lines • Show All 77 Lines • Show Last 20 Lines

llvm/include/llvm/IR/IntrinsicsX86.td

	Show First 20 Lines • Show All 5,228 Lines • ▼ Show 20 Lines
	}			}
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	let TargetPrefix = "x86" in {			let TargetPrefix = "x86" in {
	// AMX_FP16 - Intel FP16 AMX extensions			// AMX_FP16 - Intel FP16 AMX extensions
	def int_x86_tdpfp16ps : ClangBuiltin<"__builtin_ia32_tdpfp16ps">,			def int_x86_tdpfp16ps : ClangBuiltin<"__builtin_ia32_tdpfp16ps">,
	Intrinsic<[], [llvm_i8_ty, llvm_i8_ty, llvm_i8_ty],			Intrinsic<[], [llvm_i8_ty, llvm_i8_ty, llvm_i8_ty],
	[ImmArg<ArgIndex<0>>,			[ImmArg<ArgIndex<0>>,
	ImmArg<ArgIndex<1>>, ImmArg<ArgIndex<2>>]>;			ImmArg<ArgIndex<1>>, ImmArg<ArgIndex<2>>]>;
				def int_x86_vbcstnebf162ps128 : ClangBuiltin<"__builtin_ia32_vbcstnebf162ps128">,
				Intrinsic<[llvm_v4f32_ty], [llvm_ptr_ty], [IntrReadMem]>;
				def int_x86_vbcstnebf162ps256 : ClangBuiltin<"__builtin_ia32_vbcstnebf162ps256">,
				Intrinsic<[llvm_v8f32_ty], [llvm_ptr_ty], [IntrReadMem]>;
				def int_x86_vbcstnesh2ps128 : ClangBuiltin<"__builtin_ia32_vbcstnesh2ps128">,
				Intrinsic<[llvm_v4f32_ty], [llvm_ptr_ty], [IntrReadMem]>;
				def int_x86_vbcstnesh2ps256 : ClangBuiltin<"__builtin_ia32_vbcstnesh2ps256">,
				Intrinsic<[llvm_v8f32_ty], [llvm_ptr_ty], [IntrReadMem]>;
				def int_x86_vcvtneebf162ps128 : ClangBuiltin<"__builtin_ia32_vcvtneebf162ps128">,
				Intrinsic<[llvm_v4f32_ty], [llvm_ptr_ty], [IntrReadMem]>;
				def int_x86_vcvtneebf162ps256 : ClangBuiltin<"__builtin_ia32_vcvtneebf162ps256">,
				Intrinsic<[llvm_v8f32_ty], [llvm_ptr_ty], [IntrReadMem]>;
				def int_x86_vcvtneeph2ps128 : ClangBuiltin<"__builtin_ia32_vcvtneeph2ps128">,
				Intrinsic<[llvm_v4f32_ty], [llvm_ptr_ty], [IntrReadMem]>;
				def int_x86_vcvtneeph2ps256 : ClangBuiltin<"__builtin_ia32_vcvtneeph2ps256">,
				Intrinsic<[llvm_v8f32_ty], [llvm_ptr_ty], [IntrReadMem]>;
				def int_x86_vcvtneobf162ps128 : ClangBuiltin<"__builtin_ia32_vcvtneobf162ps128">,
				Intrinsic<[llvm_v4f32_ty], [llvm_ptr_ty], [IntrReadMem]>;
				def int_x86_vcvtneobf162ps256 : ClangBuiltin<"__builtin_ia32_vcvtneobf162ps256">,
				Intrinsic<[llvm_v8f32_ty], [llvm_ptr_ty], [IntrReadMem]>;
				def int_x86_vcvtneoph2ps128 : ClangBuiltin<"__builtin_ia32_vcvtneoph2ps128">,
				Intrinsic<[llvm_v4f32_ty], [llvm_ptr_ty], [IntrReadMem]>;
				def int_x86_vcvtneoph2ps256 : ClangBuiltin<"__builtin_ia32_vcvtneoph2ps256">,
				Intrinsic<[llvm_v8f32_ty], [llvm_ptr_ty], [IntrReadMem]>;
				def int_x86_vcvtneps2bf16128 : ClangBuiltin<"__builtin_ia32_vcvtneps2bf16128">,
				Intrinsic<[llvm_v8bf16_ty], [llvm_v4f32_ty], [ IntrNoMem ]>;
				def int_x86_vcvtneps2bf16256 : ClangBuiltin<"__builtin_ia32_vcvtneps2bf16256">,
				Intrinsic<[llvm_v8bf16_ty], [llvm_v8f32_ty], [ IntrNoMem ]>;
	}			}
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// RAO-INT intrinsics			// RAO-INT intrinsics
	let TargetPrefix = "x86" in {			let TargetPrefix = "x86" in {
	def int_x86_aadd32 : ClangBuiltin<"__builtin_ia32_aadd32">,			def int_x86_aadd32 : ClangBuiltin<"__builtin_ia32_aadd32">,
	Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty], []>;			Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty], []>;
	def int_x86_aadd64 : ClangBuiltin<"__builtin_ia32_aadd64">,			def int_x86_aadd64 : ClangBuiltin<"__builtin_ia32_aadd64">,
	Intrinsic<[], [llvm_ptr_ty, llvm_i64_ty], []>;			Intrinsic<[], [llvm_ptr_ty, llvm_i64_ty], []>;
	▲ Show 20 Lines • Show All 781 Lines • Show Last 20 Lines
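
For readers unfamiliar with the conversions behind these intrinsics, here is a rough scalar model on raw bit patterns. It is only a sketch: the function names are hypothetical, the NaN quieting rule is an assumption, and denormal and exception behaviour of the real instructions is ignored (the ISE document linked in the summary is the authority). The "even element" helper reflects the ISE description of `vcvtneebf162ps` as converting the even-indexed BF16 elements; the odd variant would read `src[2 * i + 1]` instead.

```cpp
#include <cstdint>

// Widen a bf16 bit pattern to the fp32 bit pattern of the same value:
// bf16 is the upper 16 bits of an IEEE-754 binary32.
constexpr std::uint32_t bf16BitsToF32Bits(std::uint16_t b) {
  return static_cast<std::uint32_t>(b) << 16;
}

// True if the fp32 bit pattern encodes a NaN.
constexpr bool isF32NaNBits(std::uint32_t f) {
  return (f & 0x7FFFFFFFu) > 0x7F800000u;
}

// Narrow fp32 bits to bf16 bits with round-to-nearest-even.
// Assumption: NaNs are returned quieted (top mantissa bit set).
constexpr std::uint16_t f32BitsToBf16BitsRNE(std::uint32_t f) {
  return isF32NaNBits(f)
             ? static_cast<std::uint16_t>((f >> 16) | 0x0040u)
             : static_cast<std::uint16_t>(
                   (f + 0x7FFFu + ((f >> 16) & 1u)) >> 16);
}

// Model of an "even elements" conversion: reads 2 * numResults bf16
// values from src and writes numResults fp32 bit patterns to dst.
inline void convertEvenBf16ToF32Bits(const std::uint16_t *src,
                                     std::uint32_t *dst,
                                     unsigned numResults) {
  for (unsigned i = 0; i != numResults; ++i)
    dst[i] = bf16BitsToF32Bits(src[2 * i]);
}
```

Working on bit patterns keeps the helpers `constexpr` and avoids depending on a compiler-provided bf16 type.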

llvm/include/llvm/Support/X86TargetParser.def

	Show First 20 Lines • Show All 197 Lines • ▼ Show 20 Lines
	X86_FEATURE (X87, "x87")			X86_FEATURE (X87, "x87")
	X86_FEATURE (XSAVE, "xsave")			X86_FEATURE (XSAVE, "xsave")
	X86_FEATURE (XSAVEC, "xsavec")			X86_FEATURE (XSAVEC, "xsavec")
	X86_FEATURE (XSAVEOPT, "xsaveopt")			X86_FEATURE (XSAVEOPT, "xsaveopt")
	X86_FEATURE (XSAVES, "xsaves")			X86_FEATURE (XSAVES, "xsaves")
	X86_FEATURE (HRESET, "hreset")			X86_FEATURE (HRESET, "hreset")
	X86_FEATURE (RAOINT, "raoint")			X86_FEATURE (RAOINT, "raoint")
	X86_FEATURE (AVX512FP16, "avx512fp16")			X86_FEATURE (AVX512FP16, "avx512fp16")
	X86_FEATURE (AMX_FP16, "amx-fp16")			X86_FEATURE (AMX_FP16, "amx-fp16")
				craig.topper (inline comment, done): Extra space before "avxneconvert"
	X86_FEATURE (CMPCCXADD, "cmpccxadd")			X86_FEATURE (CMPCCXADD, "cmpccxadd")
				X86_FEATURE (AVXNECONVERT, "avxneconvert")
	X86_FEATURE (AVXVNNI, "avxvnni")			X86_FEATURE (AVXVNNI, "avxvnni")
	X86_FEATURE (AVXIFMA, "avxifma")			X86_FEATURE (AVXIFMA, "avxifma")
	X86_FEATURE (AVXVNNIINT8, "avxvnniint8")			X86_FEATURE (AVXVNNIINT8, "avxvnniint8")
	// These features aren't really CPU features, but the frontend can set them.			// These features aren't really CPU features, but the frontend can set them.
	X86_FEATURE (RETPOLINE_EXTERNAL_THUNK, "retpoline-external-thunk")			X86_FEATURE (RETPOLINE_EXTERNAL_THUNK, "retpoline-external-thunk")
	X86_FEATURE (RETPOLINE_INDIRECT_BRANCHES, "retpoline-indirect-branches")			X86_FEATURE (RETPOLINE_INDIRECT_BRANCHES, "retpoline-indirect-branches")
	X86_FEATURE (RETPOLINE_INDIRECT_CALLS, "retpoline-indirect-calls")			X86_FEATURE (RETPOLINE_INDIRECT_CALLS, "retpoline-indirect-calls")
	X86_FEATURE (LVI_CFI, "lvi-cfi")			X86_FEATURE (LVI_CFI, "lvi-cfi")
	▲ Show 20 Lines • Show All 49 Lines • Show Last 20 Lines

llvm/lib/Support/Host.cpp

Show First 20 Lines • Show All 1,807 Lines • ▼ Show 20 Lines	#endif
Features["raoint"] = HasLeaf7Subleaf1 && ((EAX >> 3) & 1);		Features["raoint"] = HasLeaf7Subleaf1 && ((EAX >> 3) & 1);
Features["avxvnni"] = HasLeaf7Subleaf1 && ((EAX >> 4) & 1) && HasAVXSave;		Features["avxvnni"] = HasLeaf7Subleaf1 && ((EAX >> 4) & 1) && HasAVXSave;
Features["avx512bf16"] = HasLeaf7Subleaf1 && ((EAX >> 5) & 1) && HasAVX512Save;		Features["avx512bf16"] = HasLeaf7Subleaf1 && ((EAX >> 5) & 1) && HasAVX512Save;
Features["amx-fp16"] = HasLeaf7Subleaf1 && ((EAX >> 21) & 1) && HasAMXSave;		Features["amx-fp16"] = HasLeaf7Subleaf1 && ((EAX >> 21) & 1) && HasAMXSave;
Features["cmpccxadd"] = HasLeaf7Subleaf1 && ((EAX >> 7) & 1);		Features["cmpccxadd"] = HasLeaf7Subleaf1 && ((EAX >> 7) & 1);
Features["hreset"] = HasLeaf7Subleaf1 && ((EAX >> 22) & 1);		Features["hreset"] = HasLeaf7Subleaf1 && ((EAX >> 22) & 1);
Features["avxifma"] = HasLeaf7Subleaf1 && ((EAX >> 23) & 1) && HasAVXSave;		Features["avxifma"] = HasLeaf7Subleaf1 && ((EAX >> 23) & 1) && HasAVXSave;
Features["avxvnniint8"] = HasLeaf7Subleaf1 && ((EDX >> 4) & 1) && HasAVXSave;		Features["avxvnniint8"] = HasLeaf7Subleaf1 && ((EDX >> 4) & 1) && HasAVXSave;
		Features["avxneconvert"] = HasLeaf7Subleaf1 && ((EDX >> 5) & 1) && HasAVXSave;
Features["prefetchi"] = HasLeaf7Subleaf1 && ((EDX >> 14) & 1);		Features["prefetchi"] = HasLeaf7Subleaf1 && ((EDX >> 14) & 1);

bool HasLeafD = MaxLevel >= 0xd &&		bool HasLeafD = MaxLevel >= 0xd &&
		pengfei (inline comment, done): Move it ahead and remove the blank line.
!getX86CpuIDAndInfoEx(0xd, 0x1, &EAX, &EBX, &ECX, &EDX);		!getX86CpuIDAndInfoEx(0xd, 0x1, &EAX, &EBX, &ECX, &EDX);

// Only enable XSAVE if OS has enabled support for saving YMM state.		// Only enable XSAVE if OS has enabled support for saving YMM state.
Features["xsaveopt"] = HasLeafD && ((EAX >> 0) & 1) && HasAVXSave;		Features["xsaveopt"] = HasLeafD && ((EAX >> 0) & 1) && HasAVXSave;
Features["xsavec"] = HasLeafD && ((EAX >> 1) & 1) && HasAVXSave;		Features["xsavec"] = HasLeafD && ((EAX >> 1) & 1) && HasAVXSave;
Features["xsaves"] = HasLeafD && ((EAX >> 3) & 1) && HasAVXSave;		Features["xsaves"] = HasLeafD && ((EAX >> 3) & 1) && HasAVXSave;

bool HasLeaf14 = MaxLevel >= 0x14 &&		bool HasLeaf14 = MaxLevel >= 0x14 &&
▲ Show 20 Lines • Show All 105 Lines • Show Last 20 Lines
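
The new `Features["avxneconvert"]` line above reduces to a single bit test on CPUID leaf 7, subleaf 1, gated on OS support for saving AVX state. A small standalone sketch of that decode, with a hypothetical function name and the register value passed in as a plain integer:

```cpp
#include <cstdint>

// Hypothetical helper mirroring the expression added to Host.cpp:
//   HasLeaf7Subleaf1 && ((EDX >> 5) & 1) && HasAVXSave
// edx is the EDX value returned by CPUID with EAX=7, ECX=1.
constexpr bool decodeAvxNeConvert(bool hasLeaf7Subleaf1, std::uint32_t edx,
                                  bool hasAvxSave) {
  return hasLeaf7Subleaf1 && ((edx >> 5) & 1u) != 0 && hasAvxSave;
}
```

On GCC or Clang the raw register values could be obtained with `__get_cpuid_count` from `<cpuid.h>`; that step is left out here so the sketch stays portable.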

llvm/lib/Support/X86TargetParser.cpp

	Show First 20 Lines • Show All 576 Lines • ▼ Show 20 Lines

	// AMX Features			// AMX Features
	constexpr FeatureBitset ImpliedFeaturesAMX_TILE = {};			constexpr FeatureBitset ImpliedFeaturesAMX_TILE = {};
	constexpr FeatureBitset ImpliedFeaturesAMX_BF16 = FeatureAMX_TILE;			constexpr FeatureBitset ImpliedFeaturesAMX_BF16 = FeatureAMX_TILE;
	constexpr FeatureBitset ImpliedFeaturesAMX_FP16 = FeatureAMX_TILE;			constexpr FeatureBitset ImpliedFeaturesAMX_FP16 = FeatureAMX_TILE;
	constexpr FeatureBitset ImpliedFeaturesAMX_INT8 = FeatureAMX_TILE;			constexpr FeatureBitset ImpliedFeaturesAMX_INT8 = FeatureAMX_TILE;
	constexpr FeatureBitset ImpliedFeaturesHRESET = {};			constexpr FeatureBitset ImpliedFeaturesHRESET = {};

	constexpr FeatureBitset ImpliedFeaturesAVXVNNIINT8 = FeatureAVX2;
	constexpr FeatureBitset ImpliedFeaturesPREFETCHI = {};			constexpr FeatureBitset ImpliedFeaturesPREFETCHI = {};
	constexpr FeatureBitset ImpliedFeaturesCMPCCXADD = {};			constexpr FeatureBitset ImpliedFeaturesCMPCCXADD = {};
	constexpr FeatureBitset ImpliedFeaturesRAOINT = {};			constexpr FeatureBitset ImpliedFeaturesRAOINT = {};
				constexpr FeatureBitset ImpliedFeaturesAVXVNNIINT8 = FeatureAVX2;
	constexpr FeatureBitset ImpliedFeaturesAVXIFMA = FeatureAVX2;			constexpr FeatureBitset ImpliedFeaturesAVXIFMA = FeatureAVX2;
				constexpr FeatureBitset ImpliedFeaturesAVXNECONVERT = FeatureAVX2;
	constexpr FeatureBitset ImpliedFeaturesAVX512FP16 =			constexpr FeatureBitset ImpliedFeaturesAVX512FP16 =
	FeatureAVX512BW \| FeatureAVX512DQ \| FeatureAVX512VL;			FeatureAVX512BW \| FeatureAVX512DQ \| FeatureAVX512VL;
	// Key Locker Features			// Key Locker Features
	constexpr FeatureBitset ImpliedFeaturesKL = FeatureSSE2;			constexpr FeatureBitset ImpliedFeaturesKL = FeatureSSE2;
	constexpr FeatureBitset ImpliedFeaturesWIDEKL = FeatureKL;			constexpr FeatureBitset ImpliedFeaturesWIDEKL = FeatureKL;

	// AVXVNNI Features			// AVXVNNI Features
	constexpr FeatureBitset ImpliedFeaturesAVXVNNI = FeatureAVX2;			constexpr FeatureBitset ImpliedFeaturesAVXVNNI = FeatureAVX2;
	▲ Show 20 Lines • Show All 120 Lines • Show Last 20 Lines
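
`ImpliedFeaturesAVXNECONVERT = FeatureAVX2` is why `-mavxneconvert` turns on `__AVX2__` and why `-mno-avx2` turns `avxneconvert` back off in the tests. The toy model below illustrates that behaviour with three features only. All names are hypothetical and this is not the `FeatureBitset` code in X86TargetParser.cpp, just a sketch of the same idea.

```cpp
#include <cstdint>

// Toy feature bits (hypothetical; the real table has many more entries).
constexpr std::uint32_t kAVX = 1u << 0;
constexpr std::uint32_t kAVX2 = 1u << 1;
constexpr std::uint32_t kAVXNECONVERT = 1u << 2;

// One step of implication: AVXNECONVERT -> AVX2, AVX2 -> AVX.
constexpr std::uint32_t expandOnce(std::uint32_t set) {
  return set | ((set & kAVXNECONVERT) ? kAVX2 : 0u) |
         ((set & kAVX2) ? kAVX : 0u);
}

// Transitive closure: repeat until nothing new is added.
constexpr std::uint32_t expand(std::uint32_t set) {
  return expandOnce(set) == set ? set : expand(expandOnce(set));
}

// Drop candidate from set if enabling it would require removed.
constexpr std::uint32_t dropIfDependsOn(std::uint32_t set,
                                        std::uint32_t candidate,
                                        std::uint32_t removed) {
  return (expand(candidate) & removed) ? (set & ~candidate) : set;
}

// Disable a feature together with everything that implies it.
constexpr std::uint32_t disable(std::uint32_t set, std::uint32_t removed) {
  return dropIfDependsOn(
      dropIfDependsOn(dropIfDependsOn(set, kAVX, removed), kAVX2, removed),
      kAVXNECONVERT, removed);
}
```

For example, `expand(kAVXNECONVERT)` yields all three bits, and `disable(expand(kAVXNECONVERT), kAVX2)` leaves only `kAVX`, matching the `AVXNECONVERTNOAVX2` expectation.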

llvm/lib/Target/X86/X86.td

	Show First 20 Lines • Show All 266 Lines • ▼ Show 20 Lines
	def FeatureAMXFP16 : SubtargetFeature<"amx-fp16", "HasAMXFP16", "true",			def FeatureAMXFP16 : SubtargetFeature<"amx-fp16", "HasAMXFP16", "true",
	"Support AMX amx-fp16 instructions",			"Support AMX amx-fp16 instructions",
	[FeatureAMXTILE]>;			[FeatureAMXTILE]>;
	def FeatureCMPCCXADD : SubtargetFeature<"cmpccxadd", "HasCMPCCXADD", "true",			def FeatureCMPCCXADD : SubtargetFeature<"cmpccxadd", "HasCMPCCXADD", "true",
	"Support CMPCCXADD instructions">;			"Support CMPCCXADD instructions">;
	def FeatureRAOINT : SubtargetFeature<"raoint", "HasRAOINT", "true",			def FeatureRAOINT : SubtargetFeature<"raoint", "HasRAOINT", "true",
	"Support RAO-INT instructions",			"Support RAO-INT instructions",
	[]>;			[]>;
				def FeatureAVXNECONVERT : SubtargetFeature<"avxneconvert", "HasAVXNECONVERT", "true",
				"Support AVX-NE-CONVERT instructions",
				[FeatureAVX2]>;
	def FeatureINVPCID : SubtargetFeature<"invpcid", "HasINVPCID", "true",			def FeatureINVPCID : SubtargetFeature<"invpcid", "HasINVPCID", "true",
	"Invalidate Process-Context Identifier">;			"Invalidate Process-Context Identifier">;
	def FeatureSGX : SubtargetFeature<"sgx", "HasSGX", "true",			def FeatureSGX : SubtargetFeature<"sgx", "HasSGX", "true",
	"Enable Software Guard Extensions">;			"Enable Software Guard Extensions">;
	def FeatureCLFLUSHOPT : SubtargetFeature<"clflushopt", "HasCLFLUSHOPT", "true",			def FeatureCLFLUSHOPT : SubtargetFeature<"clflushopt", "HasCLFLUSHOPT", "true",
	"Flush A Cache Line Optimized">;			"Flush A Cache Line Optimized">;
	def FeatureCLWB : SubtargetFeature<"clwb", "HasCLWB", "true",			def FeatureCLWB : SubtargetFeature<"clwb", "HasCLWB", "true",
	"Cache Line Write Back">;			"Cache Line Write Back">;
	▲ Show 20 Lines • Show All 1,395 Lines • Show Last 20 Lines

llvm/lib/Target/X86/X86ISelLowering.cpp

Show First 20 Lines • Show All 2,172 Lines • ▼ Show 20 Lines

  if (Subtarget.hasVLX()) {
  setLoadExtAction(ISD::EXTLOAD, MVT::v4f32, MVT::v4f16, Legal);
  // Need to custom widen these to prevent scalarization.
  setOperationAction(ISD::LOAD, MVT::v4f16, Custom);
  setOperationAction(ISD::STORE, MVT::v4f16, Custom);
  }
- if (!Subtarget.useSoftFloat() && Subtarget.hasBF16()) {
+ if (!Subtarget.useSoftFloat() &&
+ (Subtarget.hasAVXNECONVERT() || Subtarget.hasBF16())) {
  addRegisterClass(MVT::v8bf16, &X86::VR128XRegClass);
  addRegisterClass(MVT::v16bf16, &X86::VR256XRegClass);
- addRegisterClass(MVT::v32bf16, &X86::VR512RegClass);
  // We set the type action of bf16 to TypeSoftPromoteHalf, but we don't
  // provide the method to promote BUILD_VECTOR. Set the operation action
  // Custom to do the customization later.
  setOperationAction(ISD::BUILD_VECTOR, MVT::bf16, Custom);
- for (auto VT : { MVT::v8bf16, MVT::v16bf16, MVT::v32bf16 }) {
+ for (auto VT : {MVT::v8bf16, MVT::v16bf16}) {
  setF16Action(VT, Expand);
  setOperationAction(ISD::FADD, VT, Expand);
  setOperationAction(ISD::FSUB, VT, Expand);
  setOperationAction(ISD::FMUL, VT, Expand);
  setOperationAction(ISD::FDIV, VT, Expand);
  setOperationAction(ISD::BUILD_VECTOR, VT, Custom);
  }
  addLegalFPImmediate(APFloat::getZero(APFloat::BFloat()));
  }
+ if (!Subtarget.useSoftFloat() && Subtarget.hasBF16()) {
+ addRegisterClass(MVT::v32bf16, &X86::VR512RegClass);
+ setF16Action(MVT::v32bf16, Expand);
+ setOperationAction(ISD::FADD, MVT::v32bf16, Expand);
+ setOperationAction(ISD::FSUB, MVT::v32bf16, Expand);
+ setOperationAction(ISD::FMUL, MVT::v32bf16, Expand);
+ setOperationAction(ISD::FDIV, MVT::v32bf16, Expand);
+ setOperationAction(ISD::BUILD_VECTOR, MVT::v32bf16, Custom);
+ }
  if (!Subtarget.useSoftFloat() && Subtarget.hasVLX()) {

pengfei (inline comment, done): How about merge it here?

  setTruncStoreAction(MVT::v4i64, MVT::v4i8, Legal);
  setTruncStoreAction(MVT::v4i64, MVT::v4i16, Legal);
  setTruncStoreAction(MVT::v4i64, MVT::v4i32, Legal);
  setTruncStoreAction(MVT::v8i32, MVT::v8i8, Legal);
  setTruncStoreAction(MVT::v8i32, MVT::v8i16, Legal);
  setTruncStoreAction(MVT::v2i64, MVT::v2i8, Legal);

▲ Show 20 Lines • Show All 54,906 Lines • Show Last 20 Lines

llvm/lib/Target/X86/X86InstrAVX512.td

Show First 20 Lines • Show All 12,941 Lines • ▼ Show 20 Lines	def : Pat<(v8bf16 (X86cvtneps2bf16 (v4f32
(X86VBroadcastld32 addr:$src)))),		(X86VBroadcastld32 addr:$src)))),
(VCVTNEPS2BF16Z128rmb addr:$src)>;		(VCVTNEPS2BF16Z128rmb addr:$src)>;
def : Pat<(X86mcvtneps2bf16 (v4f32 (X86VBroadcastld32 addr:$src)),		def : Pat<(X86mcvtneps2bf16 (v4f32 (X86VBroadcastld32 addr:$src)),
(v8bf16 VR128X:$src0), VK4WM:$mask),		(v8bf16 VR128X:$src0), VK4WM:$mask),
(VCVTNEPS2BF16Z128rmbk VR128X:$src0, VK4WM:$mask, addr:$src)>;		(VCVTNEPS2BF16Z128rmbk VR128X:$src0, VK4WM:$mask, addr:$src)>;
def : Pat<(X86mcvtneps2bf16 (v4f32 (X86VBroadcastld32 addr:$src)),		def : Pat<(X86mcvtneps2bf16 (v4f32 (X86VBroadcastld32 addr:$src)),
v8bf16x_info.ImmAllZerosV, VK4WM:$mask),		v8bf16x_info.ImmAllZerosV, VK4WM:$mask),
(VCVTNEPS2BF16Z128rmbkz VK4WM:$mask, addr:$src)>;		(VCVTNEPS2BF16Z128rmbkz VK4WM:$mask, addr:$src)>;

		def : Pat<(v8bf16 (int_x86_vcvtneps2bf16128 (v4f32 VR128X:$src))),
		(VCVTNEPS2BF16Z128rr VR128X:$src)>;
		def : Pat<(v8bf16 (int_x86_vcvtneps2bf16128 (loadv4f32 addr:$src))),
		(VCVTNEPS2BF16Z128rm addr:$src)>;

		def : Pat<(v8bf16 (int_x86_vcvtneps2bf16256 (v8f32 VR256X:$src))),
		(VCVTNEPS2BF16Z256rr VR256X:$src)>;
		def : Pat<(v8bf16 (int_x86_vcvtneps2bf16256 (loadv8f32 addr:$src))),
		(VCVTNEPS2BF16Z256rm addr:$src)>;
}		}

let Constraints = "$src1 = $dst" in {		let Constraints = "$src1 = $dst" in {
multiclass avx512_dpbf16ps_rm<bits<8> opc, string OpcodeStr, SDNode OpNode,		multiclass avx512_dpbf16ps_rm<bits<8> opc, string OpcodeStr, SDNode OpNode,
X86FoldableSchedWrite sched,		X86FoldableSchedWrite sched,
X86VectorVTInfo _, X86VectorVTInfo src_v> {		X86VectorVTInfo _, X86VectorVTInfo src_v> {
defm r: AVX512_maskable_3src<opc, MRMSrcReg, _, (outs _.RC:$dst),		defm r: AVX512_maskable_3src<opc, MRMSrcReg, _, (outs _.RC:$dst),
(ins src_v.RC:$src2, src_v.RC:$src3),		(ins src_v.RC:$src2, src_v.RC:$src3),
▲ Show 20 Lines • Show All 771 Lines • Show Last 20 Lines

llvm/lib/Target/X86/X86InstrInfo.td

	def HasWAITPKG : Predicate<"Subtarget->hasWAITPKG()">;
	def HasINVPCID : Predicate<"Subtarget->hasINVPCID()">;
	def HasCX8 : Predicate<"Subtarget->hasCX8()">;
	def HasCX16 : Predicate<"Subtarget->hasCX16()">;
	def HasPCONFIG : Predicate<"Subtarget->hasPCONFIG()">;
	def HasENQCMD : Predicate<"Subtarget->hasENQCMD()">;
	def HasAMXFP16 : Predicate<"Subtarget->hasAMXFP16()">;
	def HasCMPCCXADD : Predicate<"Subtarget->hasCMPCCXADD()">;
				def HasAVXNECONVERT : Predicate<"Subtarget->hasAVXNECONVERT()">;
	def HasKL : Predicate<"Subtarget->hasKL()">;
	def HasRAOINT : Predicate<"Subtarget->hasRAOINT()">;
	def HasWIDEKL : Predicate<"Subtarget->hasWIDEKL()">;
	def HasHRESET : Predicate<"Subtarget->hasHRESET()">;
	def HasSERIALIZE : Predicate<"Subtarget->hasSERIALIZE()">;
	def HasTSXLDTRK : Predicate<"Subtarget->hasTSXLDTRK()">;
	def HasAMXTILE : Predicate<"Subtarget->hasAMXTILE()">;
	def HasAMXBF16 : Predicate<"Subtarget->hasAMXBF16()">;

llvm/lib/Target/X86/X86InstrSSE.td

// GF2P8AFFINEINVQB, GF2P8AFFINEQB
let isCommutable = 0 in {
defm GF2P8AFFINEINVQB : GF2P8AFFINE_common<0xCF, "gf2p8affineinvqb",
X86GF2P8affineinvqb>, TAPD;
defm GF2P8AFFINEQB : GF2P8AFFINE_common<0xCE, "gf2p8affineqb",
X86GF2P8affineqb>, TAPD;
}

		// AVX-IFMA
let Predicates = [HasAVXIFMA, NoVLX_Or_NoIFMA], Constraints = "$src1 = $dst",
checkVEXPredicate = 1 in
multiclass avx_ifma_rm<bits<8> opc, string OpcodeStr, SDNode OpNode> {
// NOTE: The SDNode have the multiply operands first with the add last.
// This enables commuted load patterns to be autogenerated by tablegen.
let isCommutable = 1 in {
def rr : AVX8I<opc, MRMSrcReg, (outs VR128:$dst),
(ins VR128:$src1, VR128:$src2, VR128:$src3),
[... 22 lines not shown; context: def Yrm : AVX8I<opc, MRMSrcMem, (outs VR256:$dst), ...]
[(set VR256:$dst, (v4i64 (OpNode VR256:$src2,
(loadv4i64 addr:$src3), VR256:$src1)))]>,
VEX_4V, VEX_L, Sched<[SchedWriteVecIMul.YMM]>;
}

defm VPMADD52HUQ : avx_ifma_rm<0xb5, "vpmadd52huq", x86vpmadd52h>, VEX_W, ExplicitVEXPrefix;
defm VPMADD52LUQ : avx_ifma_rm<0xb4, "vpmadd52luq", x86vpmadd52l>, VEX_W, ExplicitVEXPrefix;

		// AVX-VNNI-INT8
let Constraints = "$src1 = $dst" in
multiclass avx_dotprod_rm<bits<8> Opc, string OpcodeStr, ValueType OpVT,
RegisterClass RC, PatFrag MemOpFrag,
X86MemOperand X86memop, SDNode OpNode,
X86FoldableSchedWrite Sched,
bit IsCommutable> {
let isCommutable = IsCommutable in
def rr : I<Opc, MRMSrcReg, (outs RC:$dst),
[... 42 lines not shown; context: defm VPDPBSUDY : avx_dotprod_rm<0x50,"vpdpbsud", v8i32, VR256, loadv8i32, ...]
0>, VEX_L, T8XS;
defm VPDPBSUDS : avx_dotprod_rm<0x51,"vpdpbsuds", v4i32, VR128, loadv4i32,
i128mem, X86vpdpbsuds, SchedWriteVecIMul.XMM,
0>, T8XS;
defm VPDPBSUDSY : avx_dotprod_rm<0x51,"vpdpbsuds", v8i32, VR256, loadv8i32,
i256mem, X86vpdpbsuds, SchedWriteVecIMul.YMM,
0>, VEX_L, T8XS;
}

		// AVX-NE-CONVERT
		multiclass AVX_NE_CONVERT_BASE<bits<8> Opcode, string OpcodeStr,
		X86MemOperand MemOp128, X86MemOperand MemOp256> {
		def rm : I<Opcode, MRMSrcMem, (outs VR128:$dst), (ins MemOp128:$src),
		!strconcat(OpcodeStr, "\t{$src, $dst|$dst, $src}"),
		[(set VR128:$dst,
		(!cast<Intrinsic>("int_x86_"#OpcodeStr#"128") addr:$src))]>,
		Sched<[WriteCvtPH2PS]>, VEX;
		def Yrm : I<Opcode, MRMSrcMem, (outs VR256:$dst), (ins MemOp256:$src),
		!strconcat(OpcodeStr, "\t{$src, $dst|$dst, $src}"),
		[(set VR256:$dst,
		(!cast<Intrinsic>("int_x86_"#OpcodeStr#"256") addr:$src))]>,
		Sched<[WriteCvtPH2PSY]>, VEX, VEX_L;
		}

		multiclass VCVTNEPS2BF16_BASE {
		def rr : I<0x72, MRMSrcReg, (outs VR128:$dst), (ins VR128:$src),
		"vcvtneps2bf16\t{$src, $dst|$dst, $src}",
		[(set VR128:$dst, (int_x86_vcvtneps2bf16128 VR128:$src))]>,
		Sched<[WriteCvtPH2PS]>;
		def rm : I<0x72, MRMSrcMem, (outs VR128:$dst), (ins f128mem:$src),
		"vcvtneps2bf16{x}\t{$src, $dst|$dst, $src}",
		[(set VR128:$dst, (int_x86_vcvtneps2bf16128 (loadv4f32 addr:$src)))]>,
		Sched<[WriteCvtPH2PS]>;
		def Yrr : I<0x72, MRMSrcReg, (outs VR128:$dst), (ins VR256:$src),
		"vcvtneps2bf16\t{$src, $dst|$dst, $src}",
		[(set VR128:$dst, (int_x86_vcvtneps2bf16256 VR256:$src))]>,
		Sched<[WriteCvtPH2PSY]>, VEX_L;
		def Yrm : I<0x72, MRMSrcMem, (outs VR128:$dst), (ins f256mem:$src),
		"vcvtneps2bf16{y}\t{$src, $dst|$dst, $src}",
		[(set VR128:$dst, (int_x86_vcvtneps2bf16256 (loadv8f32 addr:$src)))]>,
		Sched<[WriteCvtPH2PSY]>, VEX_L;
		}

		let Predicates = [HasAVXNECONVERT] in {
		defm VBCSTNEBF162PS : AVX_NE_CONVERT_BASE<0xb1, "vbcstnebf162ps", f16mem,
		f16mem>, T8XS;
		pengfei (Done): This can be f16mem now.
		defm VBCSTNESH2PS : AVX_NE_CONVERT_BASE<0xb1, "vbcstnesh2ps", f16mem, f16mem>,
		T8PD;
		defm VCVTNEEBF162PS : AVX_NE_CONVERT_BASE<0xb0, "vcvtneebf162ps", f128mem,
		f256mem>, T8XS;
		pengfei (Done): f128mem, f256mem
		defm VCVTNEEPH2PS : AVX_NE_CONVERT_BASE<0xb0, "vcvtneeph2ps", f128mem,
		f256mem>, T8PD;
		defm VCVTNEOBF162PS : AVX_NE_CONVERT_BASE<0xb0, "vcvtneobf162ps", f128mem,
		f256mem>, T8XD;
		pengfei (Done): ditto.
		defm VCVTNEOPH2PS : AVX_NE_CONVERT_BASE<0xb0, "vcvtneoph2ps", f128mem,
		f256mem>, T8PS;
		let checkVEXPredicate = 1 in
		defm VCVTNEPS2BF16 : VCVTNEPS2BF16_BASE, VEX, T8XS, ExplicitVEXPrefix;
		}

		def : InstAlias<"vcvtneps2bf16x\t{$src, $dst|$dst, $src}",
		(VCVTNEPS2BF16rr VR128:$dst, VR128:$src), 0, "att">;
		def : InstAlias<"vcvtneps2bf16y\t{$src, $dst|$dst, $src}",
		(VCVTNEPS2BF16Yrr VR128:$dst, VR256:$src), 0, "att">;
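Several of the instructions defined above share an opcode (`0xb0` or `0xb1`) and are told apart only by the prefix class (`T8PS`, `T8PD`, `T8XS`, `T8XD`) and by `VEX_L`. As an illustration only (this is not part of the patch), the following Python sketch reads the `pp` and `L` fields from the third byte of a three-byte VEX prefix and maps them back to a mnemonic. The table and the function name `decode_vex3` are hypothetical; the sample encodings are the ones that appear in the tests of this revision.

```python
# Hypothetical illustration; not LLVM code.
# (opcode, pp) -> mnemonic, following the defm lines in this diff:
#   pp = 0b00 -> PS (no prefix), 0b01 -> PD (0x66),
#   pp = 0b10 -> XS (0xF3),      0b11 -> XD (0xF2)
MNEMONICS = {
    (0xB1, 0b10): "vbcstnebf162ps",
    (0xB1, 0b01): "vbcstnesh2ps",
    (0xB0, 0b10): "vcvtneebf162ps",
    (0xB0, 0b01): "vcvtneeph2ps",
    (0xB0, 0b11): "vcvtneobf162ps",
    (0xB0, 0b00): "vcvtneoph2ps",
    (0x72, 0b10): "vcvtneps2bf16",
}


def decode_vex3(encoding):
    """Return (mnemonic, vector length in bits) for a 3-byte-VEX encoding."""
    assert encoding[0] == 0xC4, "expected the 3-byte VEX escape byte"
    # Second byte: R X B mmmmm; mmmmm == 2 selects the 0F38 opcode map (T8).
    assert encoding[1] & 0x1F == 2, "expected the 0F38 opcode map"
    # Third byte: W vvvv L pp
    third = encoding[2]
    pp = third & 0b11
    vex_l = (third >> 2) & 1
    opcode = encoding[3]
    return MNEMONICS[(opcode, pp)], 256 if vex_l else 128


if __name__ == "__main__":
    print(decode_vex3([0xC4, 0xE2, 0x7A, 0xB1, 0x07]))
    print(decode_vex3([0xC4, 0xE2, 0x7F, 0xB0, 0x07]))
```

For `vcvtneps2bf16` the reported length is the source width, since the destination is always an XMM register in the definitions above.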

llvm/test/CodeGen/X86/avxneconvert-intrinsics-shared.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown --show-mc-encoding -mattr=+avxneconvert,+avx512bf16,+avx512vl | FileCheck %s --check-prefix=AVX512BF16-COMMON
				; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown --show-mc-encoding -mattr=+avxneconvert,+avx512bf16,+avx512vl | FileCheck %s --check-prefix=AVX512BF16-COMMON
				pengfei (Not Done): Remove `-O0`
				; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown --show-mc-encoding -mattr=+avx512bf16,+avx512vl | FileCheck %s --check-prefix=AVX512BF16
				; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown --show-mc-encoding -mattr=+avx512bf16,+avx512vl | FileCheck %s --check-prefix=AVX512BF16

				define <8 x bfloat> @test_int_x86_vcvtneps2bf16128(<4 x float> %A) {
				; AVX512BF16-COMMON-LABEL: test_int_x86_vcvtneps2bf16128:
				; AVX512BF16-COMMON: # %bb.0:
				; AVX512BF16-COMMON-NEXT: {vex} vcvtneps2bf16 %xmm0, %xmm0 # encoding: [0xc4,0xe2,0x7a,0x72,0xc0]
				; AVX512BF16-COMMON-NEXT: # kill: def $xmm1 killed $xmm0
				; AVX512BF16-COMMON-NEXT: ret{{[l|q]}} # encoding: [0xc3]
				;
				; AVX512BF16-LABEL: test_int_x86_vcvtneps2bf16128:
				; AVX512BF16: # %bb.0:
				; AVX512BF16-NEXT: vcvtneps2bf16 %xmm0, %xmm0 # encoding: [0x62,0xf2,0x7e,0x08,0x72,0xc0]
				; AVX512BF16-NEXT: ret{{[l|q]}} # encoding: [0xc3]
				%ret = call <8 x bfloat> @llvm.x86.vcvtneps2bf16128(<4 x float> %A)
				ret <8 x bfloat> %ret
				}
				declare <8 x bfloat> @llvm.x86.vcvtneps2bf16128(<4 x float> %A)

				define <8 x bfloat> @test_int_x86_vcvtneps2bf16256(<8 x float> %A) {
				; AVX512BF16-COMMON-LABEL: test_int_x86_vcvtneps2bf16256:
				; AVX512BF16-COMMON: # %bb.0:
				; AVX512BF16-COMMON-NEXT: {vex} vcvtneps2bf16 %ymm0, %xmm0 # encoding: [0xc4,0xe2,0x7e,0x72,0xc0]
				; AVX512BF16-COMMON-NEXT: # kill: def $xmm1 killed $xmm0
				; AVX512BF16-COMMON-NEXT: vzeroupper # encoding: [0xc5,0xf8,0x77]
				; AVX512BF16-COMMON-NEXT: ret{{[l|q]}} # encoding: [0xc3]
				;
				; AVX512BF16-LABEL: test_int_x86_vcvtneps2bf16256:
				; AVX512BF16: # %bb.0:
				; AVX512BF16-NEXT: vcvtneps2bf16 %ymm0, %xmm0 # encoding: [0x62,0xf2,0x7e,0x28,0x72,0xc0]
				; AVX512BF16-NEXT: vzeroupper # encoding: [0xc5,0xf8,0x77]
				; AVX512BF16-NEXT: ret{{[l|q]}} # encoding: [0xc3]
				%ret = call <8 x bfloat> @llvm.x86.vcvtneps2bf16256(<8 x float> %A)
				ret <8 x bfloat> %ret
				}
				declare <8 x bfloat> @llvm.x86.vcvtneps2bf16256(<8 x float> %A)

llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown --show-mc-encoding -mattr=+avxneconvert | FileCheck %s --check-prefixes=CHECK,X64
				; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown --show-mc-encoding -mattr=+avxneconvert | FileCheck %s --check-prefixes=CHECK,X86
				pengfei (Done): Do we have a real dependency on FP16?
				pengfei (Done): --check-prefixes=CHECK,X64
				pengfei (Not Done): ditto.

				pengfei (Done): ditto.
				pengfei (Done): --check-prefixes=CHECK,X86
				define <4 x float> @test_int_x86_vbcstnebf162ps128(i8* %A) {
				FreddyYe (author, Done): We need to add `+avx512bf16,+avx512vl` tests for the shared builtin intrinsics. I just found that it crashes because the new patterns for avx512bf16 are missing. I'll update ASAP.
				; X64-LABEL: test_int_x86_vbcstnebf162ps128:
				; X64: # %bb.0:
				; X64-NEXT: vbcstnebf162ps (%rdi), %xmm0 # encoding: [0xc4,0xe2,0x7a,0xb1,0x07]
				; X64-NEXT: retq # encoding: [0xc3]
				;
				; X86-LABEL: test_int_x86_vbcstnebf162ps128:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x04]
				; X86-NEXT: vbcstnebf162ps (%eax), %xmm0 # encoding: [0xc4,0xe2,0x7a,0xb1,0x00]
				; X86-NEXT: retl # encoding: [0xc3]
				%ret = call <4 x float> @llvm.x86.vbcstnebf162ps128(i8* %A)
				ret <4 x float> %ret
				}
				declare <4 x float> @llvm.x86.vbcstnebf162ps128(i8* %A)

				define <8 x float> @test_int_x86_vbcstnebf162ps256(i8* %A) {
				; X64-LABEL: test_int_x86_vbcstnebf162ps256:
				; X64: # %bb.0:
				; X64-NEXT: vbcstnebf162ps (%rdi), %ymm0 # encoding: [0xc4,0xe2,0x7e,0xb1,0x07]
				; X64-NEXT: retq # encoding: [0xc3]
				;
				; X86-LABEL: test_int_x86_vbcstnebf162ps256:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x04]
				; X86-NEXT: vbcstnebf162ps (%eax), %ymm0 # encoding: [0xc4,0xe2,0x7e,0xb1,0x00]
				; X86-NEXT: retl # encoding: [0xc3]
				%ret = call <8 x float> @llvm.x86.vbcstnebf162ps256(i8* %A)
				ret <8 x float> %ret
				}
				declare <8 x float> @llvm.x86.vbcstnebf162ps256(i8* %A)

				define <4 x float> @test_int_x86_vbcstnesh2ps128(i8* %A) {
				; X64-LABEL: test_int_x86_vbcstnesh2ps128:
				; X64: # %bb.0:
				; X64-NEXT: vbcstnesh2ps (%rdi), %xmm0 # encoding: [0xc4,0xe2,0x79,0xb1,0x07]
				; X64-NEXT: retq # encoding: [0xc3]
				;
				; X86-LABEL: test_int_x86_vbcstnesh2ps128:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x04]
				; X86-NEXT: vbcstnesh2ps (%eax), %xmm0 # encoding: [0xc4,0xe2,0x79,0xb1,0x00]
				; X86-NEXT: retl # encoding: [0xc3]
				%ret = call <4 x float> @llvm.x86.vbcstnesh2ps128(i8* %A)
				ret <4 x float> %ret
				}
				declare <4 x float> @llvm.x86.vbcstnesh2ps128(i8* %A)

				define <8 x float> @test_int_x86_vbcstnesh2ps256(i8* %A) {
				; X64-LABEL: test_int_x86_vbcstnesh2ps256:
				; X64: # %bb.0:
				; X64-NEXT: vbcstnesh2ps (%rdi), %ymm0 # encoding: [0xc4,0xe2,0x7d,0xb1,0x07]
				; X64-NEXT: retq # encoding: [0xc3]
				;
				; X86-LABEL: test_int_x86_vbcstnesh2ps256:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x04]
				; X86-NEXT: vbcstnesh2ps (%eax), %ymm0 # encoding: [0xc4,0xe2,0x7d,0xb1,0x00]
				; X86-NEXT: retl # encoding: [0xc3]
				%ret = call <8 x float> @llvm.x86.vbcstnesh2ps256(i8* %A)
				ret <8 x float> %ret
				}
				declare <8 x float> @llvm.x86.vbcstnesh2ps256(i8* %A)

				define <4 x float> @test_int_x86_vcvtneebf162ps128(i8* %A) {
				; X64-LABEL: test_int_x86_vcvtneebf162ps128:
				; X64: # %bb.0:
				; X64-NEXT: vcvtneebf162ps (%rdi), %xmm0 # encoding: [0xc4,0xe2,0x7a,0xb0,0x07]
				; X64-NEXT: retq # encoding: [0xc3]
				;
				; X86-LABEL: test_int_x86_vcvtneebf162ps128:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x04]
				; X86-NEXT: vcvtneebf162ps (%eax), %xmm0 # encoding: [0xc4,0xe2,0x7a,0xb0,0x00]
				; X86-NEXT: retl # encoding: [0xc3]
				%ret = call <4 x float> @llvm.x86.vcvtneebf162ps128(i8* %A)
				ret <4 x float> %ret
				}
				declare <4 x float> @llvm.x86.vcvtneebf162ps128(i8* %A)

				define <8 x float> @test_int_x86_vcvtneebf162ps256(i8* %A) {
				; X64-LABEL: test_int_x86_vcvtneebf162ps256:
				; X64: # %bb.0:
				; X64-NEXT: vcvtneebf162ps (%rdi), %ymm0 # encoding: [0xc4,0xe2,0x7e,0xb0,0x07]
				; X64-NEXT: retq # encoding: [0xc3]
				;
				; X86-LABEL: test_int_x86_vcvtneebf162ps256:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x04]
				; X86-NEXT: vcvtneebf162ps (%eax), %ymm0 # encoding: [0xc4,0xe2,0x7e,0xb0,0x00]
				; X86-NEXT: retl # encoding: [0xc3]
				%ret = call <8 x float> @llvm.x86.vcvtneebf162ps256(i8* %A)
				ret <8 x float> %ret
				}
				declare <8 x float> @llvm.x86.vcvtneebf162ps256(i8* %A)

				define <4 x float> @test_int_x86_vcvtneeph2ps128(i8* %A) {
				; X64-LABEL: test_int_x86_vcvtneeph2ps128:
				; X64: # %bb.0:
				; X64-NEXT: vcvtneeph2ps (%rdi), %xmm0 # encoding: [0xc4,0xe2,0x79,0xb0,0x07]
				; X64-NEXT: retq # encoding: [0xc3]
				;
				; X86-LABEL: test_int_x86_vcvtneeph2ps128:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x04]
				; X86-NEXT: vcvtneeph2ps (%eax), %xmm0 # encoding: [0xc4,0xe2,0x79,0xb0,0x00]
				; X86-NEXT: retl # encoding: [0xc3]
				%ret = call <4 x float> @llvm.x86.vcvtneeph2ps128(i8* %A)
				ret <4 x float> %ret
				}
				declare <4 x float> @llvm.x86.vcvtneeph2ps128(i8* %A)

				define <8 x float> @test_int_x86_vcvtneeph2ps256(i8* %A) {
				; X64-LABEL: test_int_x86_vcvtneeph2ps256:
				; X64: # %bb.0:
				; X64-NEXT: vcvtneeph2ps (%rdi), %ymm0 # encoding: [0xc4,0xe2,0x7d,0xb0,0x07]
				; X64-NEXT: retq # encoding: [0xc3]
				;
				; X86-LABEL: test_int_x86_vcvtneeph2ps256:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x04]
				; X86-NEXT: vcvtneeph2ps (%eax), %ymm0 # encoding: [0xc4,0xe2,0x7d,0xb0,0x00]
				; X86-NEXT: retl # encoding: [0xc3]
				%ret = call <8 x float> @llvm.x86.vcvtneeph2ps256(i8* %A)
				ret <8 x float> %ret
				}
				declare <8 x float> @llvm.x86.vcvtneeph2ps256(i8* %A)

				define <4 x float> @test_int_x86_vcvtneobf162ps128(i8* %A) {
				; X64-LABEL: test_int_x86_vcvtneobf162ps128:
				; X64: # %bb.0:
				; X64-NEXT: vcvtneobf162ps (%rdi), %xmm0 # encoding: [0xc4,0xe2,0x7b,0xb0,0x07]
				; X64-NEXT: retq # encoding: [0xc3]
				;
				; X86-LABEL: test_int_x86_vcvtneobf162ps128:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x04]
				; X86-NEXT: vcvtneobf162ps (%eax), %xmm0 # encoding: [0xc4,0xe2,0x7b,0xb0,0x00]
				; X86-NEXT: retl # encoding: [0xc3]
				%ret = call <4 x float> @llvm.x86.vcvtneobf162ps128(i8* %A)
				ret <4 x float> %ret
				}
				declare <4 x float> @llvm.x86.vcvtneobf162ps128(i8* %A)

				define <8 x float> @test_int_x86_vcvtneobf162ps256(i8* %A) {
				; X64-LABEL: test_int_x86_vcvtneobf162ps256:
				; X64: # %bb.0:
				; X64-NEXT: vcvtneobf162ps (%rdi), %ymm0 # encoding: [0xc4,0xe2,0x7f,0xb0,0x07]
				; X64-NEXT: retq # encoding: [0xc3]
				;
				; X86-LABEL: test_int_x86_vcvtneobf162ps256:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x04]
				; X86-NEXT: vcvtneobf162ps (%eax), %ymm0 # encoding: [0xc4,0xe2,0x7f,0xb0,0x00]
				; X86-NEXT: retl # encoding: [0xc3]
				%ret = call <8 x float> @llvm.x86.vcvtneobf162ps256(i8* %A)
				ret <8 x float> %ret
				}
				declare <8 x float> @llvm.x86.vcvtneobf162ps256(i8* %A)

				define <4 x float> @test_int_x86_vcvtneoph2ps128(i8* %A) {
				; X64-LABEL: test_int_x86_vcvtneoph2ps128:
				; X64: # %bb.0:
				; X64-NEXT: vcvtneoph2ps (%rdi), %xmm0 # encoding: [0xc4,0xe2,0x78,0xb0,0x07]
				; X64-NEXT: retq # encoding: [0xc3]
				;
				; X86-LABEL: test_int_x86_vcvtneoph2ps128:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x04]
				; X86-NEXT: vcvtneoph2ps (%eax), %xmm0 # encoding: [0xc4,0xe2,0x78,0xb0,0x00]
				; X86-NEXT: retl # encoding: [0xc3]
				%ret = call <4 x float> @llvm.x86.vcvtneoph2ps128(i8* %A)
				ret <4 x float> %ret
				}
				declare <4 x float> @llvm.x86.vcvtneoph2ps128(i8* %A)

				define <8 x float> @test_int_x86_vcvtneoph2ps256(i8* %A) {
				; X64-LABEL: test_int_x86_vcvtneoph2ps256:
				; X64: # %bb.0:
				; X64-NEXT: vcvtneoph2ps (%rdi), %ymm0 # encoding: [0xc4,0xe2,0x7c,0xb0,0x07]
				; X64-NEXT: retq # encoding: [0xc3]
				;
				; X86-LABEL: test_int_x86_vcvtneoph2ps256:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax # encoding: [0x8b,0x44,0x24,0x04]
				; X86-NEXT: vcvtneoph2ps (%eax), %ymm0 # encoding: [0xc4,0xe2,0x7c,0xb0,0x00]
				; X86-NEXT: retl # encoding: [0xc3]
				%ret = call <8 x float> @llvm.x86.vcvtneoph2ps256(i8* %A)
				ret <8 x float> %ret
				}
				declare <8 x float> @llvm.x86.vcvtneoph2ps256(i8* %A)

				define <8 x bfloat> @test_int_x86_vcvtneps2bf16128(<4 x float> %A) {
				; CHECK-LABEL: test_int_x86_vcvtneps2bf16128:
				; CHECK: # %bb.0:
				; CHECK-NEXT: {vex} vcvtneps2bf16 %xmm0, %xmm0 # encoding: [0xc4,0xe2,0x7a,0x72,0xc0]
				; CHECK-NEXT: # kill: def $xmm1 killed $xmm0
				; CHECK-NEXT: ret{{[l|q]}} # encoding: [0xc3]
				%ret = call <8 x bfloat> @llvm.x86.vcvtneps2bf16128(<4 x float> %A)
				ret <8 x bfloat> %ret
				}
				declare <8 x bfloat> @llvm.x86.vcvtneps2bf16128(<4 x float> %A)

				define <8 x bfloat> @test_int_x86_vcvtneps2bf16256(<8 x float> %A) {
				; CHECK-LABEL: test_int_x86_vcvtneps2bf16256:
				; CHECK: # %bb.0:
				; CHECK-NEXT: {vex} vcvtneps2bf16 %ymm0, %xmm0 # encoding: [0xc4,0xe2,0x7e,0x72,0xc0]
				; CHECK-NEXT: # kill: def $xmm1 killed $xmm0
				; CHECK-NEXT: vzeroupper # encoding: [0xc5,0xf8,0x77]
				; CHECK-NEXT: ret{{[l|q]}} # encoding: [0xc3]
				%ret = call <8 x bfloat> @llvm.x86.vcvtneps2bf16256(<8 x float> %A)
				ret <8 x bfloat> %ret
				}
				declare <8 x bfloat> @llvm.x86.vcvtneps2bf16256(<8 x float> %A)
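The intrinsics exercised by this test file convert between 16-bit and 32-bit floating-point formats. As a rough mental model for the BF16 cases (a simplified sketch, not the patch's code and not a full statement of the ISA semantics), the Python below widens bfloat16 bit patterns by placing them in the upper half of an fp32, selects even or odd elements the way `vcvtneebf162ps` and `vcvtneobf162ps` are described in Intel's ISE reference, and narrows fp32 to bfloat16 with round-to-nearest-even as for `vcvtneps2bf16`. All function names are hypothetical, and NaN and denormal handling is deliberately left out.

```python
import struct


def bf16_bits_to_float(bits):
    """Widen a bfloat16 bit pattern: it is the upper 16 bits of an fp32."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def float_to_bf16_bits(x):
    """Narrow fp32 to bfloat16 with round-to-nearest-even.

    Simplification (assumption): only finite, normal inputs are handled;
    NaN and denormal special cases are not modeled here.
    """
    u = struct.unpack("<I", struct.pack("<f", x))[0]
    # Adding 0x7FFF plus the lowest kept bit rounds ties to the even result.
    rounding_bias = 0x7FFF + ((u >> 16) & 1)
    return ((u + rounding_bias) >> 16) & 0xFFFF


def convert_even(bf16_vec):
    """Model of the 'even elements' conversion: elements 0, 2, 4, ..."""
    return [bf16_bits_to_float(b) for b in bf16_vec[0::2]]


def convert_odd(bf16_vec):
    """Model of the 'odd elements' conversion: elements 1, 3, 5, ..."""
    return [bf16_bits_to_float(b) for b in bf16_vec[1::2]]


if __name__ == "__main__":
    vec = [0x3F80, 0x4000, 0x4040, 0x4080]  # bf16 patterns for 1, 2, 3, 4
    print(convert_even(vec), convert_odd(vec))
    print(hex(float_to_bf16_bits(1.0)))
```

This also shows why a 128-bit memory operand yields a `<4 x float>` result in the tests above: eight 16-bit elements are loaded, and only every other one is widened.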

llvm/test/MC/Disassembler/X86/avx_ne_convert-32.txt

This file was added.

				# RUN: llvm-mc --disassemble %s -triple=i386-unknown-unknown | FileCheck %s --check-prefixes=ATT
				# RUN: llvm-mc --disassemble %s -triple=i386-unknown-unknown -x86-asm-syntax=intel --output-asm-variant=1 | FileCheck %s --check-prefixes=INTEL

				# ATT: vbcstnebf162ps 268435456(%esp,%esi,8), %xmm2
				# INTEL: vbcstnebf162ps xmm2, word ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x7a,0xb1,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: vbcstnebf162ps 291(%edi,%eax,4), %xmm2
				# INTEL: vbcstnebf162ps xmm2, word ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x7a,0xb1,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: vbcstnebf162ps (%eax), %xmm2
				# INTEL: vbcstnebf162ps xmm2, word ptr [eax]
				0xc4,0xe2,0x7a,0xb1,0x10

				# ATT: vbcstnebf162ps -64(,%ebp,2), %xmm2
				# INTEL: vbcstnebf162ps xmm2, word ptr [2*ebp - 64]
				0xc4,0xe2,0x7a,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff

				# ATT: vbcstnebf162ps 254(%ecx), %xmm2
				# INTEL: vbcstnebf162ps xmm2, word ptr [ecx + 254]
				0xc4,0xe2,0x7a,0xb1,0x91,0xfe,0x00,0x00,0x00

				# ATT: vbcstnebf162ps -256(%edx), %xmm2
				# INTEL: vbcstnebf162ps xmm2, word ptr [edx - 256]
				0xc4,0xe2,0x7a,0xb1,0x92,0x00,0xff,0xff,0xff

				# ATT: vbcstnebf162ps 268435456(%esp,%esi,8), %ymm2
				# INTEL: vbcstnebf162ps ymm2, word ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x7e,0xb1,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: vbcstnebf162ps 291(%edi,%eax,4), %ymm2
				# INTEL: vbcstnebf162ps ymm2, word ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x7e,0xb1,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: vbcstnebf162ps (%eax), %ymm2
				# INTEL: vbcstnebf162ps ymm2, word ptr [eax]
				0xc4,0xe2,0x7e,0xb1,0x10

				# ATT: vbcstnebf162ps -64(,%ebp,2), %ymm2
				# INTEL: vbcstnebf162ps ymm2, word ptr [2*ebp - 64]
				0xc4,0xe2,0x7e,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff

				# ATT: vbcstnebf162ps 254(%ecx), %ymm2
				# INTEL: vbcstnebf162ps ymm2, word ptr [ecx + 254]
				0xc4,0xe2,0x7e,0xb1,0x91,0xfe,0x00,0x00,0x00

				# ATT: vbcstnebf162ps -256(%edx), %ymm2
				# INTEL: vbcstnebf162ps ymm2, word ptr [edx - 256]
				0xc4,0xe2,0x7e,0xb1,0x92,0x00,0xff,0xff,0xff

				# ATT: vbcstnesh2ps 268435456(%esp,%esi,8), %xmm2
				# INTEL: vbcstnesh2ps xmm2, word ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x79,0xb1,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: vbcstnesh2ps 291(%edi,%eax,4), %xmm2
				# INTEL: vbcstnesh2ps xmm2, word ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x79,0xb1,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: vbcstnesh2ps (%eax), %xmm2
				# INTEL: vbcstnesh2ps xmm2, word ptr [eax]
				0xc4,0xe2,0x79,0xb1,0x10

				# ATT: vbcstnesh2ps -64(,%ebp,2), %xmm2
				# INTEL: vbcstnesh2ps xmm2, word ptr [2*ebp - 64]
				0xc4,0xe2,0x79,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff

				# ATT: vbcstnesh2ps 254(%ecx), %xmm2
				# INTEL: vbcstnesh2ps xmm2, word ptr [ecx + 254]
				0xc4,0xe2,0x79,0xb1,0x91,0xfe,0x00,0x00,0x00

				# ATT: vbcstnesh2ps -256(%edx), %xmm2
				# INTEL: vbcstnesh2ps xmm2, word ptr [edx - 256]
				0xc4,0xe2,0x79,0xb1,0x92,0x00,0xff,0xff,0xff

				# ATT: vbcstnesh2ps 268435456(%esp,%esi,8), %ymm2
				# INTEL: vbcstnesh2ps ymm2, word ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x7d,0xb1,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: vbcstnesh2ps 291(%edi,%eax,4), %ymm2
				# INTEL: vbcstnesh2ps ymm2, word ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x7d,0xb1,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: vbcstnesh2ps (%eax), %ymm2
				# INTEL: vbcstnesh2ps ymm2, word ptr [eax]
				0xc4,0xe2,0x7d,0xb1,0x10

				# ATT: vbcstnesh2ps -64(,%ebp,2), %ymm2
				# INTEL: vbcstnesh2ps ymm2, word ptr [2*ebp - 64]
				0xc4,0xe2,0x7d,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff

				# ATT: vbcstnesh2ps 254(%ecx), %ymm2
				# INTEL: vbcstnesh2ps ymm2, word ptr [ecx + 254]
				0xc4,0xe2,0x7d,0xb1,0x91,0xfe,0x00,0x00,0x00

				# ATT: vbcstnesh2ps -256(%edx), %ymm2
				# INTEL: vbcstnesh2ps ymm2, word ptr [edx - 256]
				0xc4,0xe2,0x7d,0xb1,0x92,0x00,0xff,0xff,0xff

				# ATT: vcvtneebf162ps 268435456(%esp,%esi,8), %xmm2
				# INTEL: vcvtneebf162ps xmm2, xmmword ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x7a,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: vcvtneebf162ps 291(%edi,%eax,4), %xmm2
				# INTEL: vcvtneebf162ps xmm2, xmmword ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x7a,0xb0,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: vcvtneebf162ps (%eax), %xmm2
				# INTEL: vcvtneebf162ps xmm2, xmmword ptr [eax]
				0xc4,0xe2,0x7a,0xb0,0x10

				# ATT: vcvtneebf162ps -512(,%ebp,2), %xmm2
				# INTEL: vcvtneebf162ps xmm2, xmmword ptr [2*ebp - 512]
				0xc4,0xe2,0x7a,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff

				# ATT: vcvtneebf162ps 2032(%ecx), %xmm2
				# INTEL: vcvtneebf162ps xmm2, xmmword ptr [ecx + 2032]
				0xc4,0xe2,0x7a,0xb0,0x91,0xf0,0x07,0x00,0x00

				# ATT: vcvtneebf162ps -2048(%edx), %xmm2
				# INTEL: vcvtneebf162ps xmm2, xmmword ptr [edx - 2048]
				0xc4,0xe2,0x7a,0xb0,0x92,0x00,0xf8,0xff,0xff

				# ATT: vcvtneebf162ps 268435456(%esp,%esi,8), %ymm2
				# INTEL: vcvtneebf162ps ymm2, ymmword ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x7e,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: vcvtneebf162ps 291(%edi,%eax,4), %ymm2
				# INTEL: vcvtneebf162ps ymm2, ymmword ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x7e,0xb0,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: vcvtneebf162ps (%eax), %ymm2
				# INTEL: vcvtneebf162ps ymm2, ymmword ptr [eax]
				0xc4,0xe2,0x7e,0xb0,0x10

				# ATT: vcvtneebf162ps -1024(,%ebp,2), %ymm2
				# INTEL: vcvtneebf162ps ymm2, ymmword ptr [2*ebp - 1024]
				0xc4,0xe2,0x7e,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff

				# ATT: vcvtneebf162ps 4064(%ecx), %ymm2
				# INTEL: vcvtneebf162ps ymm2, ymmword ptr [ecx + 4064]
				0xc4,0xe2,0x7e,0xb0,0x91,0xe0,0x0f,0x00,0x00

				# ATT: vcvtneebf162ps -4096(%edx), %ymm2
				# INTEL: vcvtneebf162ps ymm2, ymmword ptr [edx - 4096]
				0xc4,0xe2,0x7e,0xb0,0x92,0x00,0xf0,0xff,0xff

				# ATT: vcvtneeph2ps 268435456(%esp,%esi,8), %xmm2
				# INTEL: vcvtneeph2ps xmm2, xmmword ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x79,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: vcvtneeph2ps 291(%edi,%eax,4), %xmm2
				# INTEL: vcvtneeph2ps xmm2, xmmword ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x79,0xb0,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: vcvtneeph2ps (%eax), %xmm2
				# INTEL: vcvtneeph2ps xmm2, xmmword ptr [eax]
				0xc4,0xe2,0x79,0xb0,0x10

				# ATT: vcvtneeph2ps -512(,%ebp,2), %xmm2
				# INTEL: vcvtneeph2ps xmm2, xmmword ptr [2*ebp - 512]
				0xc4,0xe2,0x79,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff

				# ATT: vcvtneeph2ps 2032(%ecx), %xmm2
				# INTEL: vcvtneeph2ps xmm2, xmmword ptr [ecx + 2032]
				0xc4,0xe2,0x79,0xb0,0x91,0xf0,0x07,0x00,0x00

				# ATT: vcvtneeph2ps -2048(%edx), %xmm2
				# INTEL: vcvtneeph2ps xmm2, xmmword ptr [edx - 2048]
				0xc4,0xe2,0x79,0xb0,0x92,0x00,0xf8,0xff,0xff

				# ATT: vcvtneeph2ps 268435456(%esp,%esi,8), %ymm2
				# INTEL: vcvtneeph2ps ymm2, ymmword ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x7d,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: vcvtneeph2ps 291(%edi,%eax,4), %ymm2
				# INTEL: vcvtneeph2ps ymm2, ymmword ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x7d,0xb0,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: vcvtneeph2ps (%eax), %ymm2
				# INTEL: vcvtneeph2ps ymm2, ymmword ptr [eax]
				0xc4,0xe2,0x7d,0xb0,0x10

				# ATT: vcvtneeph2ps -1024(,%ebp,2), %ymm2
				# INTEL: vcvtneeph2ps ymm2, ymmword ptr [2*ebp - 1024]
				0xc4,0xe2,0x7d,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff

				# ATT: vcvtneeph2ps 4064(%ecx), %ymm2
				# INTEL: vcvtneeph2ps ymm2, ymmword ptr [ecx + 4064]
				0xc4,0xe2,0x7d,0xb0,0x91,0xe0,0x0f,0x00,0x00

				# ATT: vcvtneeph2ps -4096(%edx), %ymm2
				# INTEL: vcvtneeph2ps ymm2, ymmword ptr [edx - 4096]
				0xc4,0xe2,0x7d,0xb0,0x92,0x00,0xf0,0xff,0xff

				# ATT: vcvtneobf162ps 268435456(%esp,%esi,8), %xmm2
				# INTEL: vcvtneobf162ps xmm2, xmmword ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x7b,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: vcvtneobf162ps 291(%edi,%eax,4), %xmm2
				# INTEL: vcvtneobf162ps xmm2, xmmword ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x7b,0xb0,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: vcvtneobf162ps (%eax), %xmm2
				# INTEL: vcvtneobf162ps xmm2, xmmword ptr [eax]
				0xc4,0xe2,0x7b,0xb0,0x10

				# ATT: vcvtneobf162ps -512(,%ebp,2), %xmm2
				# INTEL: vcvtneobf162ps xmm2, xmmword ptr [2*ebp - 512]
				0xc4,0xe2,0x7b,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff

				# ATT: vcvtneobf162ps 2032(%ecx), %xmm2
				# INTEL: vcvtneobf162ps xmm2, xmmword ptr [ecx + 2032]
				0xc4,0xe2,0x7b,0xb0,0x91,0xf0,0x07,0x00,0x00

				# ATT: vcvtneobf162ps -2048(%edx), %xmm2
				# INTEL: vcvtneobf162ps xmm2, xmmword ptr [edx - 2048]
				0xc4,0xe2,0x7b,0xb0,0x92,0x00,0xf8,0xff,0xff

				# ATT: vcvtneobf162ps 268435456(%esp,%esi,8), %ymm2
				# INTEL: vcvtneobf162ps ymm2, ymmword ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x7f,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: vcvtneobf162ps 291(%edi,%eax,4), %ymm2
				# INTEL: vcvtneobf162ps ymm2, ymmword ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x7f,0xb0,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: vcvtneobf162ps (%eax), %ymm2
				# INTEL: vcvtneobf162ps ymm2, ymmword ptr [eax]
				0xc4,0xe2,0x7f,0xb0,0x10

				# ATT: vcvtneobf162ps -1024(,%ebp,2), %ymm2
				# INTEL: vcvtneobf162ps ymm2, ymmword ptr [2*ebp - 1024]
				0xc4,0xe2,0x7f,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff

				# ATT: vcvtneobf162ps 4064(%ecx), %ymm2
				# INTEL: vcvtneobf162ps ymm2, ymmword ptr [ecx + 4064]
				0xc4,0xe2,0x7f,0xb0,0x91,0xe0,0x0f,0x00,0x00

				# ATT: vcvtneobf162ps -4096(%edx), %ymm2
				# INTEL: vcvtneobf162ps ymm2, ymmword ptr [edx - 4096]
				0xc4,0xe2,0x7f,0xb0,0x92,0x00,0xf0,0xff,0xff

				# ATT: vcvtneoph2ps 268435456(%esp,%esi,8), %xmm2
				# INTEL: vcvtneoph2ps xmm2, xmmword ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x78,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: vcvtneoph2ps 291(%edi,%eax,4), %xmm2
				# INTEL: vcvtneoph2ps xmm2, xmmword ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x78,0xb0,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: vcvtneoph2ps (%eax), %xmm2
				# INTEL: vcvtneoph2ps xmm2, xmmword ptr [eax]
				0xc4,0xe2,0x78,0xb0,0x10

				# ATT: vcvtneoph2ps -512(,%ebp,2), %xmm2
				# INTEL: vcvtneoph2ps xmm2, xmmword ptr [2*ebp - 512]
				0xc4,0xe2,0x78,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff

				# ATT: vcvtneoph2ps 2032(%ecx), %xmm2
				# INTEL: vcvtneoph2ps xmm2, xmmword ptr [ecx + 2032]
				0xc4,0xe2,0x78,0xb0,0x91,0xf0,0x07,0x00,0x00

				# ATT: vcvtneoph2ps -2048(%edx), %xmm2
				# INTEL: vcvtneoph2ps xmm2, xmmword ptr [edx - 2048]
				0xc4,0xe2,0x78,0xb0,0x92,0x00,0xf8,0xff,0xff

				# ATT: vcvtneoph2ps 268435456(%esp,%esi,8), %ymm2
				# INTEL: vcvtneoph2ps ymm2, ymmword ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x7c,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: vcvtneoph2ps 291(%edi,%eax,4), %ymm2
				# INTEL: vcvtneoph2ps ymm2, ymmword ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x7c,0xb0,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: vcvtneoph2ps (%eax), %ymm2
				# INTEL: vcvtneoph2ps ymm2, ymmword ptr [eax]
				0xc4,0xe2,0x7c,0xb0,0x10

				# ATT: vcvtneoph2ps -1024(,%ebp,2), %ymm2
				# INTEL: vcvtneoph2ps ymm2, ymmword ptr [2*ebp - 1024]
				0xc4,0xe2,0x7c,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff

				# ATT: vcvtneoph2ps 4064(%ecx), %ymm2
				# INTEL: vcvtneoph2ps ymm2, ymmword ptr [ecx + 4064]
				0xc4,0xe2,0x7c,0xb0,0x91,0xe0,0x0f,0x00,0x00

				# ATT: vcvtneoph2ps -4096(%edx), %ymm2
				# INTEL: vcvtneoph2ps ymm2, ymmword ptr [edx - 4096]
				0xc4,0xe2,0x7c,0xb0,0x92,0x00,0xf0,0xff,0xff

				# ATT: {vex} vcvtneps2bf16 %xmm3, %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmm3
				0xc4,0xe2,0x7a,0x72,0xd3

				# ATT: {vex} vcvtneps2bf16 %ymm3, %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, ymm3
				0xc4,0xe2,0x7e,0x72,0xd3

				# ATT: {vex} vcvtneps2bf16x 268435456(%esp,%esi,8), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmmword ptr [esp + 8*esi + 268435456]
				0xc4,0xe2,0x7a,0x72,0x94,0xf4,0x00,0x00,0x00,0x10

				# ATT: {vex} vcvtneps2bf16x 291(%edi,%eax,4), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmmword ptr [edi + 4*eax + 291]
				0xc4,0xe2,0x7a,0x72,0x94,0x87,0x23,0x01,0x00,0x00

				# ATT: {vex} vcvtneps2bf16x (%eax), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmmword ptr [eax]
				0xc4,0xe2,0x7a,0x72,0x10

				# ATT: {vex} vcvtneps2bf16x -512(,%ebp,2), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmmword ptr [2*ebp - 512]
				0xc4,0xe2,0x7a,0x72,0x14,0x6d,0x00,0xfe,0xff,0xff

				# ATT: {vex} vcvtneps2bf16x 2032(%ecx), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmmword ptr [ecx + 2032]
				0xc4,0xe2,0x7a,0x72,0x91,0xf0,0x07,0x00,0x00

				# ATT: {vex} vcvtneps2bf16x -2048(%edx), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmmword ptr [edx - 2048]
				0xc4,0xe2,0x7a,0x72,0x92,0x00,0xf8,0xff,0xff

				# ATT: {vex} vcvtneps2bf16y -1024(,%ebp,2), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, ymmword ptr [2*ebp - 1024]
				0xc4,0xe2,0x7e,0x72,0x14,0x6d,0x00,0xfc,0xff,0xff

				# ATT: {vex} vcvtneps2bf16y 4064(%ecx), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, ymmword ptr [ecx + 4064]
				0xc4,0xe2,0x7e,0x72,0x91,0xe0,0x0f,0x00,0x00

				# ATT: {vex} vcvtneps2bf16y -4096(%edx), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, ymmword ptr [edx - 4096]
				0xc4,0xe2,0x7e,0x72,0x92,0x00,0xf0,0xff,0xff

llvm/test/MC/Disassembler/X86/avx_ne_convert-64.txt

This file was added.

				# RUN: llvm-mc --disassemble %s -triple=x86_64 | FileCheck %s --check-prefixes=ATT
				# RUN: llvm-mc --disassemble %s -triple=x86_64 -x86-asm-syntax=intel --output-asm-variant=1 | FileCheck %s --check-prefixes=INTEL

				# ATT: vbcstnebf162ps 268435456(%rbp,%r14,8), %xmm2
				# INTEL: vbcstnebf162ps xmm2, word ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x7a,0xb1,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: vbcstnebf162ps 291(%r8,%rax,4), %xmm2
				# INTEL: vbcstnebf162ps xmm2, word ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x7a,0xb1,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: vbcstnebf162ps (%rip), %xmm2
				# INTEL: vbcstnebf162ps xmm2, word ptr [rip]
				0xc4,0xe2,0x7a,0xb1,0x15,0x00,0x00,0x00,0x00

				# ATT: vbcstnebf162ps -64(,%rbp,2), %xmm2
				# INTEL: vbcstnebf162ps xmm2, word ptr [2*rbp - 64]
				0xc4,0xe2,0x7a,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff

				# ATT: vbcstnebf162ps 254(%rcx), %xmm2
				# INTEL: vbcstnebf162ps xmm2, word ptr [rcx + 254]
				0xc4,0xe2,0x7a,0xb1,0x91,0xfe,0x00,0x00,0x00

				# ATT: vbcstnebf162ps -256(%rdx), %xmm2
				# INTEL: vbcstnebf162ps xmm2, word ptr [rdx - 256]
				0xc4,0xe2,0x7a,0xb1,0x92,0x00,0xff,0xff,0xff

				# ATT: vbcstnebf162ps 268435456(%rbp,%r14,8), %ymm2
				# INTEL: vbcstnebf162ps ymm2, word ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x7e,0xb1,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: vbcstnebf162ps 291(%r8,%rax,4), %ymm2
				# INTEL: vbcstnebf162ps ymm2, word ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x7e,0xb1,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: vbcstnebf162ps (%rip), %ymm2
				# INTEL: vbcstnebf162ps ymm2, word ptr [rip]
				0xc4,0xe2,0x7e,0xb1,0x15,0x00,0x00,0x00,0x00

				# ATT: vbcstnebf162ps -64(,%rbp,2), %ymm2
				# INTEL: vbcstnebf162ps ymm2, word ptr [2*rbp - 64]
				0xc4,0xe2,0x7e,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff

				# ATT: vbcstnebf162ps 254(%rcx), %ymm2
				# INTEL: vbcstnebf162ps ymm2, word ptr [rcx + 254]
				0xc4,0xe2,0x7e,0xb1,0x91,0xfe,0x00,0x00,0x00

				# ATT: vbcstnebf162ps -256(%rdx), %ymm2
				# INTEL: vbcstnebf162ps ymm2, word ptr [rdx - 256]
				0xc4,0xe2,0x7e,0xb1,0x92,0x00,0xff,0xff,0xff

				# ATT: vbcstnesh2ps 268435456(%rbp,%r14,8), %xmm2
				# INTEL: vbcstnesh2ps xmm2, word ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x79,0xb1,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: vbcstnesh2ps 291(%r8,%rax,4), %xmm2
				# INTEL: vbcstnesh2ps xmm2, word ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x79,0xb1,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: vbcstnesh2ps (%rip), %xmm2
				# INTEL: vbcstnesh2ps xmm2, word ptr [rip]
				0xc4,0xe2,0x79,0xb1,0x15,0x00,0x00,0x00,0x00

				# ATT: vbcstnesh2ps -64(,%rbp,2), %xmm2
				# INTEL: vbcstnesh2ps xmm2, word ptr [2*rbp - 64]
				0xc4,0xe2,0x79,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff

				# ATT: vbcstnesh2ps 254(%rcx), %xmm2
				# INTEL: vbcstnesh2ps xmm2, word ptr [rcx + 254]
				0xc4,0xe2,0x79,0xb1,0x91,0xfe,0x00,0x00,0x00

				# ATT: vbcstnesh2ps -256(%rdx), %xmm2
				# INTEL: vbcstnesh2ps xmm2, word ptr [rdx - 256]
				0xc4,0xe2,0x79,0xb1,0x92,0x00,0xff,0xff,0xff

				# ATT: vbcstnesh2ps 268435456(%rbp,%r14,8), %ymm2
				# INTEL: vbcstnesh2ps ymm2, word ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x7d,0xb1,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: vbcstnesh2ps 291(%r8,%rax,4), %ymm2
				# INTEL: vbcstnesh2ps ymm2, word ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x7d,0xb1,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: vbcstnesh2ps (%rip), %ymm2
				# INTEL: vbcstnesh2ps ymm2, word ptr [rip]
				0xc4,0xe2,0x7d,0xb1,0x15,0x00,0x00,0x00,0x00

				# ATT: vbcstnesh2ps -64(,%rbp,2), %ymm2
				# INTEL: vbcstnesh2ps ymm2, word ptr [2*rbp - 64]
				0xc4,0xe2,0x7d,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff

				# ATT: vbcstnesh2ps 254(%rcx), %ymm2
				# INTEL: vbcstnesh2ps ymm2, word ptr [rcx + 254]
				0xc4,0xe2,0x7d,0xb1,0x91,0xfe,0x00,0x00,0x00

				# ATT: vbcstnesh2ps -256(%rdx), %ymm2
				# INTEL: vbcstnesh2ps ymm2, word ptr [rdx - 256]
				0xc4,0xe2,0x7d,0xb1,0x92,0x00,0xff,0xff,0xff

				# ATT: vcvtneebf162ps 268435456(%rbp,%r14,8), %xmm2
				# INTEL: vcvtneebf162ps xmm2, xmmword ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x7a,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: vcvtneebf162ps 291(%r8,%rax,4), %xmm2
				# INTEL: vcvtneebf162ps xmm2, xmmword ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x7a,0xb0,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: vcvtneebf162ps (%rip), %xmm2
				# INTEL: vcvtneebf162ps xmm2, xmmword ptr [rip]
				0xc4,0xe2,0x7a,0xb0,0x15,0x00,0x00,0x00,0x00

				# ATT: vcvtneebf162ps -512(,%rbp,2), %xmm2
				# INTEL: vcvtneebf162ps xmm2, xmmword ptr [2*rbp - 512]
				0xc4,0xe2,0x7a,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff

				# ATT: vcvtneebf162ps 2032(%rcx), %xmm2
				# INTEL: vcvtneebf162ps xmm2, xmmword ptr [rcx + 2032]
				0xc4,0xe2,0x7a,0xb0,0x91,0xf0,0x07,0x00,0x00

				# ATT: vcvtneebf162ps -2048(%rdx), %xmm2
				# INTEL: vcvtneebf162ps xmm2, xmmword ptr [rdx - 2048]
				0xc4,0xe2,0x7a,0xb0,0x92,0x00,0xf8,0xff,0xff

				# ATT: vcvtneebf162ps 268435456(%rbp,%r14,8), %ymm2
				# INTEL: vcvtneebf162ps ymm2, ymmword ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x7e,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: vcvtneebf162ps 291(%r8,%rax,4), %ymm2
				# INTEL: vcvtneebf162ps ymm2, ymmword ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x7e,0xb0,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: vcvtneebf162ps (%rip), %ymm2
				# INTEL: vcvtneebf162ps ymm2, ymmword ptr [rip]
				0xc4,0xe2,0x7e,0xb0,0x15,0x00,0x00,0x00,0x00

				# ATT: vcvtneebf162ps -1024(,%rbp,2), %ymm2
				# INTEL: vcvtneebf162ps ymm2, ymmword ptr [2*rbp - 1024]
				0xc4,0xe2,0x7e,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff

				# ATT: vcvtneebf162ps 4064(%rcx), %ymm2
				# INTEL: vcvtneebf162ps ymm2, ymmword ptr [rcx + 4064]
				0xc4,0xe2,0x7e,0xb0,0x91,0xe0,0x0f,0x00,0x00

				# ATT: vcvtneebf162ps -4096(%rdx), %ymm2
				# INTEL: vcvtneebf162ps ymm2, ymmword ptr [rdx - 4096]
				0xc4,0xe2,0x7e,0xb0,0x92,0x00,0xf0,0xff,0xff

				# ATT: vcvtneeph2ps 268435456(%rbp,%r14,8), %xmm2
				# INTEL: vcvtneeph2ps xmm2, xmmword ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x79,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: vcvtneeph2ps 291(%r8,%rax,4), %xmm2
				# INTEL: vcvtneeph2ps xmm2, xmmword ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x79,0xb0,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: vcvtneeph2ps (%rip), %xmm2
				# INTEL: vcvtneeph2ps xmm2, xmmword ptr [rip]
				0xc4,0xe2,0x79,0xb0,0x15,0x00,0x00,0x00,0x00

				# ATT: vcvtneeph2ps -512(,%rbp,2), %xmm2
				# INTEL: vcvtneeph2ps xmm2, xmmword ptr [2*rbp - 512]
				0xc4,0xe2,0x79,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff

				# ATT: vcvtneeph2ps 2032(%rcx), %xmm2
				# INTEL: vcvtneeph2ps xmm2, xmmword ptr [rcx + 2032]
				0xc4,0xe2,0x79,0xb0,0x91,0xf0,0x07,0x00,0x00

				# ATT: vcvtneeph2ps -2048(%rdx), %xmm2
				# INTEL: vcvtneeph2ps xmm2, xmmword ptr [rdx - 2048]
				0xc4,0xe2,0x79,0xb0,0x92,0x00,0xf8,0xff,0xff

				# ATT: vcvtneeph2ps 268435456(%rbp,%r14,8), %ymm2
				# INTEL: vcvtneeph2ps ymm2, ymmword ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x7d,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: vcvtneeph2ps 291(%r8,%rax,4), %ymm2
				# INTEL: vcvtneeph2ps ymm2, ymmword ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x7d,0xb0,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: vcvtneeph2ps (%rip), %ymm2
				# INTEL: vcvtneeph2ps ymm2, ymmword ptr [rip]
				0xc4,0xe2,0x7d,0xb0,0x15,0x00,0x00,0x00,0x00

				# ATT: vcvtneeph2ps -1024(,%rbp,2), %ymm2
				# INTEL: vcvtneeph2ps ymm2, ymmword ptr [2*rbp - 1024]
				0xc4,0xe2,0x7d,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff

				# ATT: vcvtneeph2ps 4064(%rcx), %ymm2
				# INTEL: vcvtneeph2ps ymm2, ymmword ptr [rcx + 4064]
				0xc4,0xe2,0x7d,0xb0,0x91,0xe0,0x0f,0x00,0x00

				# ATT: vcvtneeph2ps -4096(%rdx), %ymm2
				# INTEL: vcvtneeph2ps ymm2, ymmword ptr [rdx - 4096]
				0xc4,0xe2,0x7d,0xb0,0x92,0x00,0xf0,0xff,0xff

				# ATT: vcvtneobf162ps 268435456(%rbp,%r14,8), %xmm2
				# INTEL: vcvtneobf162ps xmm2, xmmword ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x7b,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: vcvtneobf162ps 291(%r8,%rax,4), %xmm2
				# INTEL: vcvtneobf162ps xmm2, xmmword ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x7b,0xb0,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: vcvtneobf162ps (%rip), %xmm2
				# INTEL: vcvtneobf162ps xmm2, xmmword ptr [rip]
				0xc4,0xe2,0x7b,0xb0,0x15,0x00,0x00,0x00,0x00

				# ATT: vcvtneobf162ps -512(,%rbp,2), %xmm2
				# INTEL: vcvtneobf162ps xmm2, xmmword ptr [2*rbp - 512]
				0xc4,0xe2,0x7b,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff

				# ATT: vcvtneobf162ps 2032(%rcx), %xmm2
				# INTEL: vcvtneobf162ps xmm2, xmmword ptr [rcx + 2032]
				0xc4,0xe2,0x7b,0xb0,0x91,0xf0,0x07,0x00,0x00

				# ATT: vcvtneobf162ps -2048(%rdx), %xmm2
				# INTEL: vcvtneobf162ps xmm2, xmmword ptr [rdx - 2048]
				0xc4,0xe2,0x7b,0xb0,0x92,0x00,0xf8,0xff,0xff

				# ATT: vcvtneobf162ps 268435456(%rbp,%r14,8), %ymm2
				# INTEL: vcvtneobf162ps ymm2, ymmword ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x7f,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: vcvtneobf162ps 291(%r8,%rax,4), %ymm2
				# INTEL: vcvtneobf162ps ymm2, ymmword ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x7f,0xb0,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: vcvtneobf162ps (%rip), %ymm2
				# INTEL: vcvtneobf162ps ymm2, ymmword ptr [rip]
				0xc4,0xe2,0x7f,0xb0,0x15,0x00,0x00,0x00,0x00

				# ATT: vcvtneobf162ps -1024(,%rbp,2), %ymm2
				# INTEL: vcvtneobf162ps ymm2, ymmword ptr [2*rbp - 1024]
				0xc4,0xe2,0x7f,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff

				# ATT: vcvtneobf162ps 4064(%rcx), %ymm2
				# INTEL: vcvtneobf162ps ymm2, ymmword ptr [rcx + 4064]
				0xc4,0xe2,0x7f,0xb0,0x91,0xe0,0x0f,0x00,0x00

				# ATT: vcvtneobf162ps -4096(%rdx), %ymm2
				# INTEL: vcvtneobf162ps ymm2, ymmword ptr [rdx - 4096]
				0xc4,0xe2,0x7f,0xb0,0x92,0x00,0xf0,0xff,0xff

				# ATT: vcvtneoph2ps 268435456(%rbp,%r14,8), %xmm2
				# INTEL: vcvtneoph2ps xmm2, xmmword ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x78,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: vcvtneoph2ps 291(%r8,%rax,4), %xmm2
				# INTEL: vcvtneoph2ps xmm2, xmmword ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x78,0xb0,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: vcvtneoph2ps (%rip), %xmm2
				# INTEL: vcvtneoph2ps xmm2, xmmword ptr [rip]
				0xc4,0xe2,0x78,0xb0,0x15,0x00,0x00,0x00,0x00

				# ATT: vcvtneoph2ps -512(,%rbp,2), %xmm2
				# INTEL: vcvtneoph2ps xmm2, xmmword ptr [2*rbp - 512]
				0xc4,0xe2,0x78,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff

				# ATT: vcvtneoph2ps 2032(%rcx), %xmm2
				# INTEL: vcvtneoph2ps xmm2, xmmword ptr [rcx + 2032]
				0xc4,0xe2,0x78,0xb0,0x91,0xf0,0x07,0x00,0x00

				# ATT: vcvtneoph2ps -2048(%rdx), %xmm2
				# INTEL: vcvtneoph2ps xmm2, xmmword ptr [rdx - 2048]
				0xc4,0xe2,0x78,0xb0,0x92,0x00,0xf8,0xff,0xff

				# ATT: vcvtneoph2ps 268435456(%rbp,%r14,8), %ymm2
				# INTEL: vcvtneoph2ps ymm2, ymmword ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x7c,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: vcvtneoph2ps 291(%r8,%rax,4), %ymm2
				# INTEL: vcvtneoph2ps ymm2, ymmword ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x7c,0xb0,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: vcvtneoph2ps (%rip), %ymm2
				# INTEL: vcvtneoph2ps ymm2, ymmword ptr [rip]
				0xc4,0xe2,0x7c,0xb0,0x15,0x00,0x00,0x00,0x00

				# ATT: vcvtneoph2ps -1024(,%rbp,2), %ymm2
				# INTEL: vcvtneoph2ps ymm2, ymmword ptr [2*rbp - 1024]
				0xc4,0xe2,0x7c,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff

				# ATT: vcvtneoph2ps 4064(%rcx), %ymm2
				# INTEL: vcvtneoph2ps ymm2, ymmword ptr [rcx + 4064]
				0xc4,0xe2,0x7c,0xb0,0x91,0xe0,0x0f,0x00,0x00

				# ATT: vcvtneoph2ps -4096(%rdx), %ymm2
				# INTEL: vcvtneoph2ps ymm2, ymmword ptr [rdx - 4096]
				0xc4,0xe2,0x7c,0xb0,0x92,0x00,0xf0,0xff,0xff

				# ATT: {vex} vcvtneps2bf16 %xmm3, %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmm3
				0xc4,0xe2,0x7a,0x72,0xd3

				# ATT: {vex} vcvtneps2bf16 %ymm3, %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, ymm3
				0xc4,0xe2,0x7e,0x72,0xd3

				# ATT: {vex} vcvtneps2bf16x 268435456(%rbp,%r14,8), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmmword ptr [rbp + 8*r14 + 268435456]
				0xc4,0xa2,0x7a,0x72,0x94,0xf5,0x00,0x00,0x00,0x10

				# ATT: {vex} vcvtneps2bf16x 291(%r8,%rax,4), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmmword ptr [r8 + 4*rax + 291]
				0xc4,0xc2,0x7a,0x72,0x94,0x80,0x23,0x01,0x00,0x00

				# ATT: {vex} vcvtneps2bf16x (%rip), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmmword ptr [rip]
				0xc4,0xe2,0x7a,0x72,0x15,0x00,0x00,0x00,0x00

				# ATT: {vex} vcvtneps2bf16x -512(,%rbp,2), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmmword ptr [2*rbp - 512]
				0xc4,0xe2,0x7a,0x72,0x14,0x6d,0x00,0xfe,0xff,0xff

				# ATT: {vex} vcvtneps2bf16x 2032(%rcx), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmmword ptr [rcx + 2032]
				0xc4,0xe2,0x7a,0x72,0x91,0xf0,0x07,0x00,0x00

				# ATT: {vex} vcvtneps2bf16x -2048(%rdx), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, xmmword ptr [rdx - 2048]
				0xc4,0xe2,0x7a,0x72,0x92,0x00,0xf8,0xff,0xff

				# ATT: {vex} vcvtneps2bf16y -1024(,%rbp,2), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, ymmword ptr [2*rbp - 1024]
				0xc4,0xe2,0x7e,0x72,0x14,0x6d,0x00,0xfc,0xff,0xff

				# ATT: {vex} vcvtneps2bf16y 4064(%rcx), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, ymmword ptr [rcx + 4064]
				0xc4,0xe2,0x7e,0x72,0x91,0xe0,0x0f,0x00,0x00

				# ATT: {vex} vcvtneps2bf16y -4096(%rdx), %xmm2
				# INTEL: {vex} vcvtneps2bf16 xmm2, ymmword ptr [rdx - 4096]
				0xc4,0xe2,0x7e,0x72,0x92,0x00,0xf0,0xff,0xff
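
A note on reading the byte lists above: the encodings shown all begin with 0xc4, the three-byte VEX escape, and mnemonics that share an opcode byte (0xb0, 0xb1 or 0x72) differ only in the pp and L fields of the third byte. The Python sketch below is not part of this patch. The helper name `decode_vex3` and its field names are hypothetical, introduced only to illustrate how the standard VEX prefix layout maps onto these test vectors.

```python
# Illustrative sketch only; decode_vex3 is a hypothetical helper, not LLVM code.
# It splits a 3-byte VEX prefix (0xC4 b1 b2) plus the opcode byte into fields.

PP_NAMES = {0: "none", 1: "66", 2: "F3", 3: "F2"}   # implied legacy prefix
MAP_NAMES = {1: "0F", 2: "0F38", 3: "0F3A"}         # opcode map selector


def decode_vex3(encoding):
    """Decode the VEX fields from a list of encoding bytes."""
    if len(encoding) < 4 or encoding[0] != 0xC4:
        raise ValueError("not a 3-byte VEX encoding")
    b1, b2, opcode = encoding[1], encoding[2], encoding[3]
    return {
        # R, X, B and vvvv are stored inverted in the prefix.
        "R": ((b1 >> 7) & 1) ^ 1,
        "X": ((b1 >> 6) & 1) ^ 1,
        "B": ((b1 >> 5) & 1) ^ 1,
        "map": MAP_NAMES.get(b1 & 0x1F, "reserved"),
        "W": (b2 >> 7) & 1,
        "vvvv": (~(b2 >> 3)) & 0xF,
        "L": (b2 >> 2) & 1,          # 0 = 128-bit (xmm), 1 = 256-bit (ymm)
        "pp": PP_NAMES[b2 & 3],
        "opcode": opcode,
    }


if __name__ == "__main__":
    # Byte lists copied from the test vectors above.
    samples = {
        "vbcstnebf162ps 268435456(%rbp,%r14,8), %xmm2":
            [0xC4, 0xA2, 0x7A, 0xB1, 0x94, 0xF5, 0x00, 0x00, 0x00, 0x10],
        "vcvtneobf162ps 291(%r8,%rax,4), %ymm2":
            [0xC4, 0xC2, 0x7F, 0xB0, 0x94, 0x80, 0x23, 0x01, 0x00, 0x00],
        "vcvtneoph2ps (%rip), %xmm2":
            [0xC4, 0xE2, 0x78, 0xB0, 0x15, 0x00, 0x00, 0x00, 0x00],
    }
    for text, enc in samples.items():
        print(text, decode_vex3(enc))
```

Under this reading, 0x7a versus 0x7e in the third byte is the xmm/ymm (L) distinction, and the low two bits of 0x78, 0x79, 0x7a and 0x7b select the no-prefix, 66, F3 and F2 forms respectively. The second byte changes from 0xe2 to 0xa2 or 0xc2 only when an extended register such as %r14 or %r8 appears in the address.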

llvm/test/MC/X86/avx_ne_convert-32-att.s

This file was added.

				// RUN: llvm-mc -triple i686-unknown-unknown --show-encoding %s | FileCheck %s

				// CHECK: vbcstnebf162ps 268435456(%esp,%esi,8), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x94,0xf4,0x00,0x00,0x00,0x10]
				vbcstnebf162ps 268435456(%esp,%esi,8), %xmm2

				// CHECK: vbcstnebf162ps 291(%edi,%eax,4), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x94,0x87,0x23,0x01,0x00,0x00]
				vbcstnebf162ps 291(%edi,%eax,4), %xmm2

				// CHECK: vbcstnebf162ps (%eax), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x10]
				vbcstnebf162ps (%eax), %xmm2

				// CHECK: vbcstnebf162ps -64(,%ebp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnebf162ps -64(,%ebp,2), %xmm2

				// CHECK: vbcstnebf162ps 254(%ecx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnebf162ps 254(%ecx), %xmm2

				// CHECK: vbcstnebf162ps -256(%edx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnebf162ps -256(%edx), %xmm2

				// CHECK: vbcstnebf162ps 268435456(%esp,%esi,8), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x94,0xf4,0x00,0x00,0x00,0x10]
				vbcstnebf162ps 268435456(%esp,%esi,8), %ymm2

				// CHECK: vbcstnebf162ps 291(%edi,%eax,4), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x94,0x87,0x23,0x01,0x00,0x00]
				vbcstnebf162ps 291(%edi,%eax,4), %ymm2

				// CHECK: vbcstnebf162ps (%eax), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x10]
				vbcstnebf162ps (%eax), %ymm2

				// CHECK: vbcstnebf162ps -64(,%ebp,2), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnebf162ps -64(,%ebp,2), %ymm2

				// CHECK: vbcstnebf162ps 254(%ecx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnebf162ps 254(%ecx), %ymm2

				// CHECK: vbcstnebf162ps -256(%edx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnebf162ps -256(%edx), %ymm2

				// CHECK: vbcstnesh2ps 268435456(%esp,%esi,8), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x94,0xf4,0x00,0x00,0x00,0x10]
				vbcstnesh2ps 268435456(%esp,%esi,8), %xmm2

				// CHECK: vbcstnesh2ps 291(%edi,%eax,4), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x94,0x87,0x23,0x01,0x00,0x00]
				vbcstnesh2ps 291(%edi,%eax,4), %xmm2

				// CHECK: vbcstnesh2ps (%eax), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x10]
				vbcstnesh2ps (%eax), %xmm2

				// CHECK: vbcstnesh2ps -64(,%ebp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnesh2ps -64(,%ebp,2), %xmm2

				// CHECK: vbcstnesh2ps 254(%ecx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnesh2ps 254(%ecx), %xmm2

				// CHECK: vbcstnesh2ps -256(%edx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnesh2ps -256(%edx), %xmm2

				// CHECK: vbcstnesh2ps 268435456(%esp,%esi,8), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x94,0xf4,0x00,0x00,0x00,0x10]
				vbcstnesh2ps 268435456(%esp,%esi,8), %ymm2

				// CHECK: vbcstnesh2ps 291(%edi,%eax,4), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x94,0x87,0x23,0x01,0x00,0x00]
				vbcstnesh2ps 291(%edi,%eax,4), %ymm2

				// CHECK: vbcstnesh2ps (%eax), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x10]
				vbcstnesh2ps (%eax), %ymm2

				// CHECK: vbcstnesh2ps -64(,%ebp,2), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnesh2ps -64(,%ebp,2), %ymm2

				// CHECK: vbcstnesh2ps 254(%ecx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnesh2ps 254(%ecx), %ymm2

				// CHECK: vbcstnesh2ps -256(%edx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnesh2ps -256(%edx), %ymm2

				// CHECK: vcvtneebf162ps 268435456(%esp,%esi,8), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneebf162ps 268435456(%esp,%esi,8), %xmm2

				// CHECK: vcvtneebf162ps 291(%edi,%eax,4), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneebf162ps 291(%edi,%eax,4), %xmm2

				// CHECK: vcvtneebf162ps (%eax), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x10]
				vcvtneebf162ps (%eax), %xmm2

				// CHECK: vcvtneebf162ps -512(,%ebp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneebf162ps -512(,%ebp,2), %xmm2

				// CHECK: vcvtneebf162ps 2032(%ecx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneebf162ps 2032(%ecx), %xmm2

				// CHECK: vcvtneebf162ps -2048(%edx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneebf162ps -2048(%edx), %xmm2

				// CHECK: vcvtneebf162ps 268435456(%esp,%esi,8), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneebf162ps 268435456(%esp,%esi,8), %ymm2

				// CHECK: vcvtneebf162ps 291(%edi,%eax,4), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneebf162ps 291(%edi,%eax,4), %ymm2

				// CHECK: vcvtneebf162ps (%eax), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x10]
				vcvtneebf162ps (%eax), %ymm2

				// CHECK: vcvtneebf162ps -1024(,%ebp,2), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneebf162ps -1024(,%ebp,2), %ymm2

				// CHECK: vcvtneebf162ps 4064(%ecx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneebf162ps 4064(%ecx), %ymm2

				// CHECK: vcvtneebf162ps -4096(%edx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneebf162ps -4096(%edx), %ymm2

				// CHECK: vcvtneeph2ps 268435456(%esp,%esi,8), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneeph2ps 268435456(%esp,%esi,8), %xmm2

				// CHECK: vcvtneeph2ps 291(%edi,%eax,4), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneeph2ps 291(%edi,%eax,4), %xmm2

				// CHECK: vcvtneeph2ps (%eax), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x10]
				vcvtneeph2ps (%eax), %xmm2

				// CHECK: vcvtneeph2ps -512(,%ebp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneeph2ps -512(,%ebp,2), %xmm2

				// CHECK: vcvtneeph2ps 2032(%ecx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneeph2ps 2032(%ecx), %xmm2

				// CHECK: vcvtneeph2ps -2048(%edx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneeph2ps -2048(%edx), %xmm2

				// CHECK: vcvtneeph2ps 268435456(%esp,%esi,8), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneeph2ps 268435456(%esp,%esi,8), %ymm2

				// CHECK: vcvtneeph2ps 291(%edi,%eax,4), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneeph2ps 291(%edi,%eax,4), %ymm2

				// CHECK: vcvtneeph2ps (%eax), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x10]
				vcvtneeph2ps (%eax), %ymm2

				// CHECK: vcvtneeph2ps -1024(,%ebp,2), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneeph2ps -1024(,%ebp,2), %ymm2

				// CHECK: vcvtneeph2ps 4064(%ecx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneeph2ps 4064(%ecx), %ymm2

				// CHECK: vcvtneeph2ps -4096(%edx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneeph2ps -4096(%edx), %ymm2

				// CHECK: vcvtneobf162ps 268435456(%esp,%esi,8), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneobf162ps 268435456(%esp,%esi,8), %xmm2

				// CHECK: vcvtneobf162ps 291(%edi,%eax,4), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneobf162ps 291(%edi,%eax,4), %xmm2

				// CHECK: vcvtneobf162ps (%eax), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x10]
				vcvtneobf162ps (%eax), %xmm2

				// CHECK: vcvtneobf162ps -512(,%ebp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneobf162ps -512(,%ebp,2), %xmm2

				// CHECK: vcvtneobf162ps 2032(%ecx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneobf162ps 2032(%ecx), %xmm2

				// CHECK: vcvtneobf162ps -2048(%edx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneobf162ps -2048(%edx), %xmm2

				// CHECK: vcvtneobf162ps 268435456(%esp,%esi,8), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneobf162ps 268435456(%esp,%esi,8), %ymm2

				// CHECK: vcvtneobf162ps 291(%edi,%eax,4), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneobf162ps 291(%edi,%eax,4), %ymm2

				// CHECK: vcvtneobf162ps (%eax), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x10]
				vcvtneobf162ps (%eax), %ymm2

				// CHECK: vcvtneobf162ps -1024(,%ebp,2), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneobf162ps -1024(,%ebp,2), %ymm2

				// CHECK: vcvtneobf162ps 4064(%ecx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneobf162ps 4064(%ecx), %ymm2

				// CHECK: vcvtneobf162ps -4096(%edx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneobf162ps -4096(%edx), %ymm2

				// CHECK: vcvtneoph2ps 268435456(%esp,%esi,8), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneoph2ps 268435456(%esp,%esi,8), %xmm2

				// CHECK: vcvtneoph2ps 291(%edi,%eax,4), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneoph2ps 291(%edi,%eax,4), %xmm2

				// CHECK: vcvtneoph2ps (%eax), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x10]
				vcvtneoph2ps (%eax), %xmm2

				// CHECK: vcvtneoph2ps -512(,%ebp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneoph2ps -512(,%ebp,2), %xmm2

				// CHECK: vcvtneoph2ps 2032(%ecx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneoph2ps 2032(%ecx), %xmm2

				// CHECK: vcvtneoph2ps -2048(%edx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneoph2ps -2048(%edx), %xmm2

				// CHECK: vcvtneoph2ps 268435456(%esp,%esi,8), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneoph2ps 268435456(%esp,%esi,8), %ymm2

				// CHECK: vcvtneoph2ps 291(%edi,%eax,4), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneoph2ps 291(%edi,%eax,4), %ymm2

				// CHECK: vcvtneoph2ps (%eax), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x10]
				vcvtneoph2ps (%eax), %ymm2

				// CHECK: vcvtneoph2ps -1024(,%ebp,2), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneoph2ps -1024(,%ebp,2), %ymm2

				// CHECK: vcvtneoph2ps 4064(%ecx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneoph2ps 4064(%ecx), %ymm2

				// CHECK: vcvtneoph2ps -4096(%edx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneoph2ps -4096(%edx), %ymm2

				// CHECK: {vex} vcvtneps2bf16 %xmm3, %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0xd3]
				{vex} vcvtneps2bf16 %xmm3, %xmm2

				// CHECK: {vex} vcvtneps2bf16 %ymm3, %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0xd3]
				{vex} vcvtneps2bf16 %ymm3, %xmm2

				// CHECK: {vex} vcvtneps2bf16x 268435456(%esp,%esi,8), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x94,0xf4,0x00,0x00,0x00,0x10]
				{vex} vcvtneps2bf16x 268435456(%esp,%esi,8), %xmm2

				// CHECK: {vex} vcvtneps2bf16x 291(%edi,%eax,4), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x94,0x87,0x23,0x01,0x00,0x00]
				{vex} vcvtneps2bf16x 291(%edi,%eax,4), %xmm2

				// CHECK: {vex} vcvtneps2bf16x (%eax), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x10]
				{vex} vcvtneps2bf16x (%eax), %xmm2

				// CHECK: {vex} vcvtneps2bf16x -512(,%ebp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x14,0x6d,0x00,0xfe,0xff,0xff]
				{vex} vcvtneps2bf16x -512(,%ebp,2), %xmm2

				// CHECK: {vex} vcvtneps2bf16x 2032(%ecx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x91,0xf0,0x07,0x00,0x00]
				{vex} vcvtneps2bf16x 2032(%ecx), %xmm2

				// CHECK: {vex} vcvtneps2bf16x -2048(%edx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x92,0x00,0xf8,0xff,0xff]
				{vex} vcvtneps2bf16x -2048(%edx), %xmm2

				// CHECK: {vex} vcvtneps2bf16y -1024(,%ebp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0x14,0x6d,0x00,0xfc,0xff,0xff]
				{vex} vcvtneps2bf16y -1024(,%ebp,2), %xmm2

				// CHECK: {vex} vcvtneps2bf16y 4064(%ecx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0x91,0xe0,0x0f,0x00,0x00]
				{vex} vcvtneps2bf16y 4064(%ecx), %xmm2

				// CHECK: {vex} vcvtneps2bf16y -4096(%edx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0x92,0x00,0xf0,0xff,0xff]
				{vex} vcvtneps2bf16y -4096(%edx), %xmm2

llvm/test/MC/X86/avx_ne_convert-32-intel.s

This file was added.

				// RUN: llvm-mc -triple i686-unknown-unknown -x86-asm-syntax=intel -output-asm-variant=1 --show-encoding %s | FileCheck %s

				// CHECK: vbcstnebf162ps xmm2, word ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x94,0xf4,0x00,0x00,0x00,0x10]
				vbcstnebf162ps xmm2, word ptr [esp + 8*esi + 268435456]

				// CHECK: vbcstnebf162ps xmm2, word ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x94,0x87,0x23,0x01,0x00,0x00]
				vbcstnebf162ps xmm2, word ptr [edi + 4*eax + 291]

				// CHECK: vbcstnebf162ps xmm2, word ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x10]
				vbcstnebf162ps xmm2, word ptr [eax]

				// CHECK: vbcstnebf162ps xmm2, word ptr [2*ebp - 64]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnebf162ps xmm2, word ptr [2*ebp - 64]

				// CHECK: vbcstnebf162ps xmm2, word ptr [ecx + 254]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnebf162ps xmm2, word ptr [ecx + 254]

				// CHECK: vbcstnebf162ps xmm2, word ptr [edx - 256]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnebf162ps xmm2, word ptr [edx - 256]

				// CHECK: vbcstnebf162ps ymm2, word ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x94,0xf4,0x00,0x00,0x00,0x10]
				vbcstnebf162ps ymm2, word ptr [esp + 8*esi + 268435456]

				// CHECK: vbcstnebf162ps ymm2, word ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x94,0x87,0x23,0x01,0x00,0x00]
				vbcstnebf162ps ymm2, word ptr [edi + 4*eax + 291]

				// CHECK: vbcstnebf162ps ymm2, word ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x10]
				vbcstnebf162ps ymm2, word ptr [eax]

				// CHECK: vbcstnebf162ps ymm2, word ptr [2*ebp - 64]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnebf162ps ymm2, word ptr [2*ebp - 64]

				// CHECK: vbcstnebf162ps ymm2, word ptr [ecx + 254]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnebf162ps ymm2, word ptr [ecx + 254]

				// CHECK: vbcstnebf162ps ymm2, word ptr [edx - 256]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnebf162ps ymm2, word ptr [edx - 256]

				// CHECK: vbcstnesh2ps xmm2, word ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x94,0xf4,0x00,0x00,0x00,0x10]
				vbcstnesh2ps xmm2, word ptr [esp + 8*esi + 268435456]

				// CHECK: vbcstnesh2ps xmm2, word ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x94,0x87,0x23,0x01,0x00,0x00]
				vbcstnesh2ps xmm2, word ptr [edi + 4*eax + 291]

				// CHECK: vbcstnesh2ps xmm2, word ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x10]
				vbcstnesh2ps xmm2, word ptr [eax]

				// CHECK: vbcstnesh2ps xmm2, word ptr [2*ebp - 64]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnesh2ps xmm2, word ptr [2*ebp - 64]

				// CHECK: vbcstnesh2ps xmm2, word ptr [ecx + 254]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnesh2ps xmm2, word ptr [ecx + 254]

				// CHECK: vbcstnesh2ps xmm2, word ptr [edx - 256]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnesh2ps xmm2, word ptr [edx - 256]

				// CHECK: vbcstnesh2ps ymm2, word ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x94,0xf4,0x00,0x00,0x00,0x10]
				vbcstnesh2ps ymm2, word ptr [esp + 8*esi + 268435456]

				// CHECK: vbcstnesh2ps ymm2, word ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x94,0x87,0x23,0x01,0x00,0x00]
				vbcstnesh2ps ymm2, word ptr [edi + 4*eax + 291]

				// CHECK: vbcstnesh2ps ymm2, word ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x10]
				vbcstnesh2ps ymm2, word ptr [eax]

				// CHECK: vbcstnesh2ps ymm2, word ptr [2*ebp - 64]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnesh2ps ymm2, word ptr [2*ebp - 64]

				// CHECK: vbcstnesh2ps ymm2, word ptr [ecx + 254]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnesh2ps ymm2, word ptr [ecx + 254]

				// CHECK: vbcstnesh2ps ymm2, word ptr [edx - 256]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnesh2ps ymm2, word ptr [edx - 256]

				// CHECK: vcvtneebf162ps xmm2, xmmword ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneebf162ps xmm2, xmmword ptr [esp + 8*esi + 268435456]

				// CHECK: vcvtneebf162ps xmm2, xmmword ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneebf162ps xmm2, xmmword ptr [edi + 4*eax + 291]

				// CHECK: vcvtneebf162ps xmm2, xmmword ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x10]
				vcvtneebf162ps xmm2, xmmword ptr [eax]

				// CHECK: vcvtneebf162ps xmm2, xmmword ptr [2*ebp - 512]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneebf162ps xmm2, xmmword ptr [2*ebp - 512]

				// CHECK: vcvtneebf162ps xmm2, xmmword ptr [ecx + 2032]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneebf162ps xmm2, xmmword ptr [ecx + 2032]

				// CHECK: vcvtneebf162ps xmm2, xmmword ptr [edx - 2048]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneebf162ps xmm2, xmmword ptr [edx - 2048]

				// CHECK: vcvtneebf162ps ymm2, ymmword ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneebf162ps ymm2, ymmword ptr [esp + 8*esi + 268435456]

				// CHECK: vcvtneebf162ps ymm2, ymmword ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneebf162ps ymm2, ymmword ptr [edi + 4*eax + 291]

				// CHECK: vcvtneebf162ps ymm2, ymmword ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x10]
				vcvtneebf162ps ymm2, ymmword ptr [eax]

				// CHECK: vcvtneebf162ps ymm2, ymmword ptr [2*ebp - 1024]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneebf162ps ymm2, ymmword ptr [2*ebp - 1024]

				// CHECK: vcvtneebf162ps ymm2, ymmword ptr [ecx + 4064]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneebf162ps ymm2, ymmword ptr [ecx + 4064]

				// CHECK: vcvtneebf162ps ymm2, ymmword ptr [edx - 4096]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneebf162ps ymm2, ymmword ptr [edx - 4096]

				// CHECK: vcvtneeph2ps xmm2, xmmword ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneeph2ps xmm2, xmmword ptr [esp + 8*esi + 268435456]

				// CHECK: vcvtneeph2ps xmm2, xmmword ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneeph2ps xmm2, xmmword ptr [edi + 4*eax + 291]

				// CHECK: vcvtneeph2ps xmm2, xmmword ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x10]
				vcvtneeph2ps xmm2, xmmword ptr [eax]

				// CHECK: vcvtneeph2ps xmm2, xmmword ptr [2*ebp - 512]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneeph2ps xmm2, xmmword ptr [2*ebp - 512]

				// CHECK: vcvtneeph2ps xmm2, xmmword ptr [ecx + 2032]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneeph2ps xmm2, xmmword ptr [ecx + 2032]

				// CHECK: vcvtneeph2ps xmm2, xmmword ptr [edx - 2048]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneeph2ps xmm2, xmmword ptr [edx - 2048]

				// CHECK: vcvtneeph2ps ymm2, ymmword ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneeph2ps ymm2, ymmword ptr [esp + 8*esi + 268435456]

				// CHECK: vcvtneeph2ps ymm2, ymmword ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneeph2ps ymm2, ymmword ptr [edi + 4*eax + 291]

				// CHECK: vcvtneeph2ps ymm2, ymmword ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x10]
				vcvtneeph2ps ymm2, ymmword ptr [eax]

				// CHECK: vcvtneeph2ps ymm2, ymmword ptr [2*ebp - 1024]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneeph2ps ymm2, ymmword ptr [2*ebp - 1024]

				// CHECK: vcvtneeph2ps ymm2, ymmword ptr [ecx + 4064]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneeph2ps ymm2, ymmword ptr [ecx + 4064]

				// CHECK: vcvtneeph2ps ymm2, ymmword ptr [edx - 4096]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneeph2ps ymm2, ymmword ptr [edx - 4096]

				// CHECK: vcvtneobf162ps xmm2, xmmword ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneobf162ps xmm2, xmmword ptr [esp + 8*esi + 268435456]

				// CHECK: vcvtneobf162ps xmm2, xmmword ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneobf162ps xmm2, xmmword ptr [edi + 4*eax + 291]

				// CHECK: vcvtneobf162ps xmm2, xmmword ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x10]
				vcvtneobf162ps xmm2, xmmword ptr [eax]

				// CHECK: vcvtneobf162ps xmm2, xmmword ptr [2*ebp - 512]
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneobf162ps xmm2, xmmword ptr [2*ebp - 512]

				// CHECK: vcvtneobf162ps xmm2, xmmword ptr [ecx + 2032]
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneobf162ps xmm2, xmmword ptr [ecx + 2032]

				// CHECK: vcvtneobf162ps xmm2, xmmword ptr [edx - 2048]
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneobf162ps xmm2, xmmword ptr [edx - 2048]

				// CHECK: vcvtneobf162ps ymm2, ymmword ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneobf162ps ymm2, ymmword ptr [esp + 8*esi + 268435456]

				// CHECK: vcvtneobf162ps ymm2, ymmword ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneobf162ps ymm2, ymmword ptr [edi + 4*eax + 291]

				// CHECK: vcvtneobf162ps ymm2, ymmword ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x10]
				vcvtneobf162ps ymm2, ymmword ptr [eax]

				// CHECK: vcvtneobf162ps ymm2, ymmword ptr [2*ebp - 1024]
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneobf162ps ymm2, ymmword ptr [2*ebp - 1024]

				// CHECK: vcvtneobf162ps ymm2, ymmword ptr [ecx + 4064]
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneobf162ps ymm2, ymmword ptr [ecx + 4064]

				// CHECK: vcvtneobf162ps ymm2, ymmword ptr [edx - 4096]
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneobf162ps ymm2, ymmword ptr [edx - 4096]

				// CHECK: vcvtneoph2ps xmm2, xmmword ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneoph2ps xmm2, xmmword ptr [esp + 8*esi + 268435456]

				// CHECK: vcvtneoph2ps xmm2, xmmword ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneoph2ps xmm2, xmmword ptr [edi + 4*eax + 291]

				// CHECK: vcvtneoph2ps xmm2, xmmword ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x10]
				vcvtneoph2ps xmm2, xmmword ptr [eax]

				// CHECK: vcvtneoph2ps xmm2, xmmword ptr [2*ebp - 512]
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneoph2ps xmm2, xmmword ptr [2*ebp - 512]

				// CHECK: vcvtneoph2ps xmm2, xmmword ptr [ecx + 2032]
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneoph2ps xmm2, xmmword ptr [ecx + 2032]

				// CHECK: vcvtneoph2ps xmm2, xmmword ptr [edx - 2048]
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneoph2ps xmm2, xmmword ptr [edx - 2048]

				// CHECK: vcvtneoph2ps ymm2, ymmword ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x94,0xf4,0x00,0x00,0x00,0x10]
				vcvtneoph2ps ymm2, ymmword ptr [esp + 8*esi + 268435456]

				// CHECK: vcvtneoph2ps ymm2, ymmword ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x94,0x87,0x23,0x01,0x00,0x00]
				vcvtneoph2ps ymm2, ymmword ptr [edi + 4*eax + 291]

				// CHECK: vcvtneoph2ps ymm2, ymmword ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x10]
				vcvtneoph2ps ymm2, ymmword ptr [eax]

				// CHECK: vcvtneoph2ps ymm2, ymmword ptr [2*ebp - 1024]
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneoph2ps ymm2, ymmword ptr [2*ebp - 1024]

				// CHECK: vcvtneoph2ps ymm2, ymmword ptr [ecx + 4064]
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneoph2ps ymm2, ymmword ptr [ecx + 4064]

				// CHECK: vcvtneoph2ps ymm2, ymmword ptr [edx - 4096]
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneoph2ps ymm2, ymmword ptr [edx - 4096]

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmm3
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0xd3]
				{vex} vcvtneps2bf16 xmm2, xmm3

				// CHECK: {vex} vcvtneps2bf16 xmm2, ymm3
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0xd3]
				{vex} vcvtneps2bf16 xmm2, ymm3

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmmword ptr [esp + 8*esi + 268435456]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x94,0xf4,0x00,0x00,0x00,0x10]
				{vex} vcvtneps2bf16 xmm2, xmmword ptr [esp + 8*esi + 268435456]

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmmword ptr [edi + 4*eax + 291]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x94,0x87,0x23,0x01,0x00,0x00]
				{vex} vcvtneps2bf16 xmm2, xmmword ptr [edi + 4*eax + 291]

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmmword ptr [eax]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x10]
				{vex} vcvtneps2bf16 xmm2, xmmword ptr [eax]

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmmword ptr [2*ebp - 512]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x14,0x6d,0x00,0xfe,0xff,0xff]
				{vex} vcvtneps2bf16 xmm2, xmmword ptr [2*ebp - 512]

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmmword ptr [ecx + 2032]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x91,0xf0,0x07,0x00,0x00]
				{vex} vcvtneps2bf16 xmm2, xmmword ptr [ecx + 2032]

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmmword ptr [edx - 2048]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x92,0x00,0xf8,0xff,0xff]
				{vex} vcvtneps2bf16 xmm2, xmmword ptr [edx - 2048]

				// CHECK: {vex} vcvtneps2bf16 xmm2, ymmword ptr [2*ebp - 1024]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0x14,0x6d,0x00,0xfc,0xff,0xff]
				{vex} vcvtneps2bf16 xmm2, ymmword ptr [2*ebp - 1024]

				// CHECK: {vex} vcvtneps2bf16 xmm2, ymmword ptr [ecx + 4064]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0x91,0xe0,0x0f,0x00,0x00]
				{vex} vcvtneps2bf16 xmm2, ymmword ptr [ecx + 4064]

				// CHECK: {vex} vcvtneps2bf16 xmm2, ymmword ptr [edx - 4096]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0x92,0x00,0xf0,0xff,0xff]
				{vex} vcvtneps2bf16 xmm2, ymmword ptr [edx - 4096]

llvm/test/MC/X86/avx_ne_convert-64-att.s

This file was added.

				// RUN: llvm-mc -triple x86_64-unknown-unknown --show-encoding %s | FileCheck %s

				// CHECK: vbcstnebf162ps 268435456(%rbp,%r14,8), %xmm2
				// CHECK: encoding: [0xc4,0xa2,0x7a,0xb1,0x94,0xf5,0x00,0x00,0x00,0x10]
				vbcstnebf162ps 268435456(%rbp,%r14,8), %xmm2

				// CHECK: vbcstnebf162ps 291(%r8,%rax,4), %xmm2
				// CHECK: encoding: [0xc4,0xc2,0x7a,0xb1,0x94,0x80,0x23,0x01,0x00,0x00]
				vbcstnebf162ps 291(%r8,%rax,4), %xmm2

				// CHECK: vbcstnebf162ps (%rip), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x15,0x00,0x00,0x00,0x00]
				vbcstnebf162ps (%rip), %xmm2

				// CHECK: vbcstnebf162ps -64(,%rbp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnebf162ps -64(,%rbp,2), %xmm2

				// CHECK: vbcstnebf162ps 254(%rcx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnebf162ps 254(%rcx), %xmm2

				// CHECK: vbcstnebf162ps -256(%rdx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnebf162ps -256(%rdx), %xmm2

				// CHECK: vbcstnebf162ps 268435456(%rbp,%r14,8), %ymm2
				// CHECK: encoding: [0xc4,0xa2,0x7e,0xb1,0x94,0xf5,0x00,0x00,0x00,0x10]
				vbcstnebf162ps 268435456(%rbp,%r14,8), %ymm2

				// CHECK: vbcstnebf162ps 291(%r8,%rax,4), %ymm2
				// CHECK: encoding: [0xc4,0xc2,0x7e,0xb1,0x94,0x80,0x23,0x01,0x00,0x00]
				vbcstnebf162ps 291(%r8,%rax,4), %ymm2

				// CHECK: vbcstnebf162ps (%rip), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x15,0x00,0x00,0x00,0x00]
				vbcstnebf162ps (%rip), %ymm2

				// CHECK: vbcstnebf162ps -64(,%rbp,2), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnebf162ps -64(,%rbp,2), %ymm2

				// CHECK: vbcstnebf162ps 254(%rcx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnebf162ps 254(%rcx), %ymm2

				// CHECK: vbcstnebf162ps -256(%rdx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnebf162ps -256(%rdx), %ymm2

				// CHECK: vbcstnesh2ps 268435456(%rbp,%r14,8), %xmm2
				// CHECK: encoding: [0xc4,0xa2,0x79,0xb1,0x94,0xf5,0x00,0x00,0x00,0x10]
				vbcstnesh2ps 268435456(%rbp,%r14,8), %xmm2

				// CHECK: vbcstnesh2ps 291(%r8,%rax,4), %xmm2
				// CHECK: encoding: [0xc4,0xc2,0x79,0xb1,0x94,0x80,0x23,0x01,0x00,0x00]
				vbcstnesh2ps 291(%r8,%rax,4), %xmm2

				// CHECK: vbcstnesh2ps (%rip), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x15,0x00,0x00,0x00,0x00]
				vbcstnesh2ps (%rip), %xmm2

				// CHECK: vbcstnesh2ps -64(,%rbp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnesh2ps -64(,%rbp,2), %xmm2

				// CHECK: vbcstnesh2ps 254(%rcx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnesh2ps 254(%rcx), %xmm2

				// CHECK: vbcstnesh2ps -256(%rdx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnesh2ps -256(%rdx), %xmm2

				// CHECK: vbcstnesh2ps 268435456(%rbp,%r14,8), %ymm2
				// CHECK: encoding: [0xc4,0xa2,0x7d,0xb1,0x94,0xf5,0x00,0x00,0x00,0x10]
				vbcstnesh2ps 268435456(%rbp,%r14,8), %ymm2

				// CHECK: vbcstnesh2ps 291(%r8,%rax,4), %ymm2
				// CHECK: encoding: [0xc4,0xc2,0x7d,0xb1,0x94,0x80,0x23,0x01,0x00,0x00]
				vbcstnesh2ps 291(%r8,%rax,4), %ymm2

				// CHECK: vbcstnesh2ps (%rip), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x15,0x00,0x00,0x00,0x00]
				vbcstnesh2ps (%rip), %ymm2

				// CHECK: vbcstnesh2ps -64(,%rbp,2), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnesh2ps -64(,%rbp,2), %ymm2

				// CHECK: vbcstnesh2ps 254(%rcx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnesh2ps 254(%rcx), %ymm2

				// CHECK: vbcstnesh2ps -256(%rdx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnesh2ps -256(%rdx), %ymm2

				// CHECK: vcvtneebf162ps 268435456(%rbp,%r14,8), %xmm2
				// CHECK: encoding: [0xc4,0xa2,0x7a,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneebf162ps 268435456(%rbp,%r14,8), %xmm2

				// CHECK: vcvtneebf162ps 291(%r8,%rax,4), %xmm2
				// CHECK: encoding: [0xc4,0xc2,0x7a,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneebf162ps 291(%r8,%rax,4), %xmm2

				// CHECK: vcvtneebf162ps (%rip), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneebf162ps (%rip), %xmm2

				// CHECK: vcvtneebf162ps -512(,%rbp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneebf162ps -512(,%rbp,2), %xmm2

				// CHECK: vcvtneebf162ps 2032(%rcx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneebf162ps 2032(%rcx), %xmm2

				// CHECK: vcvtneebf162ps -2048(%rdx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneebf162ps -2048(%rdx), %xmm2

				// CHECK: vcvtneebf162ps 268435456(%rbp,%r14,8), %ymm2
				// CHECK: encoding: [0xc4,0xa2,0x7e,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneebf162ps 268435456(%rbp,%r14,8), %ymm2

				// CHECK: vcvtneebf162ps 291(%r8,%rax,4), %ymm2
				// CHECK: encoding: [0xc4,0xc2,0x7e,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneebf162ps 291(%r8,%rax,4), %ymm2

				// CHECK: vcvtneebf162ps (%rip), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneebf162ps (%rip), %ymm2

				// CHECK: vcvtneebf162ps -1024(,%rbp,2), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneebf162ps -1024(,%rbp,2), %ymm2

				// CHECK: vcvtneebf162ps 4064(%rcx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneebf162ps 4064(%rcx), %ymm2

				// CHECK: vcvtneebf162ps -4096(%rdx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneebf162ps -4096(%rdx), %ymm2

				// CHECK: vcvtneeph2ps 268435456(%rbp,%r14,8), %xmm2
				// CHECK: encoding: [0xc4,0xa2,0x79,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneeph2ps 268435456(%rbp,%r14,8), %xmm2

				// CHECK: vcvtneeph2ps 291(%r8,%rax,4), %xmm2
				// CHECK: encoding: [0xc4,0xc2,0x79,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneeph2ps 291(%r8,%rax,4), %xmm2

				// CHECK: vcvtneeph2ps (%rip), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneeph2ps (%rip), %xmm2

				// CHECK: vcvtneeph2ps -512(,%rbp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneeph2ps -512(,%rbp,2), %xmm2

				// CHECK: vcvtneeph2ps 2032(%rcx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneeph2ps 2032(%rcx), %xmm2

				// CHECK: vcvtneeph2ps -2048(%rdx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneeph2ps -2048(%rdx), %xmm2

				// CHECK: vcvtneeph2ps 268435456(%rbp,%r14,8), %ymm2
				// CHECK: encoding: [0xc4,0xa2,0x7d,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneeph2ps 268435456(%rbp,%r14,8), %ymm2

				// CHECK: vcvtneeph2ps 291(%r8,%rax,4), %ymm2
				// CHECK: encoding: [0xc4,0xc2,0x7d,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneeph2ps 291(%r8,%rax,4), %ymm2

				// CHECK: vcvtneeph2ps (%rip), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneeph2ps (%rip), %ymm2

				// CHECK: vcvtneeph2ps -1024(,%rbp,2), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneeph2ps -1024(,%rbp,2), %ymm2

				// CHECK: vcvtneeph2ps 4064(%rcx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneeph2ps 4064(%rcx), %ymm2

				// CHECK: vcvtneeph2ps -4096(%rdx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneeph2ps -4096(%rdx), %ymm2

				// CHECK: vcvtneobf162ps 268435456(%rbp,%r14,8), %xmm2
				// CHECK: encoding: [0xc4,0xa2,0x7b,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneobf162ps 268435456(%rbp,%r14,8), %xmm2

				// CHECK: vcvtneobf162ps 291(%r8,%rax,4), %xmm2
				// CHECK: encoding: [0xc4,0xc2,0x7b,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneobf162ps 291(%r8,%rax,4), %xmm2

				// CHECK: vcvtneobf162ps (%rip), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneobf162ps (%rip), %xmm2

				// CHECK: vcvtneobf162ps -512(,%rbp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneobf162ps -512(,%rbp,2), %xmm2

				// CHECK: vcvtneobf162ps 2032(%rcx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneobf162ps 2032(%rcx), %xmm2

				// CHECK: vcvtneobf162ps -2048(%rdx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneobf162ps -2048(%rdx), %xmm2

				// CHECK: vcvtneobf162ps 268435456(%rbp,%r14,8), %ymm2
				// CHECK: encoding: [0xc4,0xa2,0x7f,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneobf162ps 268435456(%rbp,%r14,8), %ymm2

				// CHECK: vcvtneobf162ps 291(%r8,%rax,4), %ymm2
				// CHECK: encoding: [0xc4,0xc2,0x7f,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneobf162ps 291(%r8,%rax,4), %ymm2

				// CHECK: vcvtneobf162ps (%rip), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneobf162ps (%rip), %ymm2

				// CHECK: vcvtneobf162ps -1024(,%rbp,2), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneobf162ps -1024(,%rbp,2), %ymm2

				// CHECK: vcvtneobf162ps 4064(%rcx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneobf162ps 4064(%rcx), %ymm2

				// CHECK: vcvtneobf162ps -4096(%rdx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneobf162ps -4096(%rdx), %ymm2

				// CHECK: vcvtneoph2ps 268435456(%rbp,%r14,8), %xmm2
				// CHECK: encoding: [0xc4,0xa2,0x78,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneoph2ps 268435456(%rbp,%r14,8), %xmm2

				// CHECK: vcvtneoph2ps 291(%r8,%rax,4), %xmm2
				// CHECK: encoding: [0xc4,0xc2,0x78,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneoph2ps 291(%r8,%rax,4), %xmm2

				// CHECK: vcvtneoph2ps (%rip), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneoph2ps (%rip), %xmm2

				// CHECK: vcvtneoph2ps -512(,%rbp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneoph2ps -512(,%rbp,2), %xmm2

				// CHECK: vcvtneoph2ps 2032(%rcx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneoph2ps 2032(%rcx), %xmm2

				// CHECK: vcvtneoph2ps -2048(%rdx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneoph2ps -2048(%rdx), %xmm2

				// CHECK: vcvtneoph2ps 268435456(%rbp,%r14,8), %ymm2
				// CHECK: encoding: [0xc4,0xa2,0x7c,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneoph2ps 268435456(%rbp,%r14,8), %ymm2

				// CHECK: vcvtneoph2ps 291(%r8,%rax,4), %ymm2
				// CHECK: encoding: [0xc4,0xc2,0x7c,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneoph2ps 291(%r8,%rax,4), %ymm2

				// CHECK: vcvtneoph2ps (%rip), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneoph2ps (%rip), %ymm2

				// CHECK: vcvtneoph2ps -1024(,%rbp,2), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneoph2ps -1024(,%rbp,2), %ymm2

				// CHECK: vcvtneoph2ps 4064(%rcx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneoph2ps 4064(%rcx), %ymm2

				// CHECK: vcvtneoph2ps -4096(%rdx), %ymm2
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneoph2ps -4096(%rdx), %ymm2

				// CHECK: {vex} vcvtneps2bf16 %xmm3, %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0xd3]
				{vex} vcvtneps2bf16 %xmm3, %xmm2

				// CHECK: {vex} vcvtneps2bf16 %ymm3, %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0xd3]
				{vex} vcvtneps2bf16 %ymm3, %xmm2

				// CHECK: {vex} vcvtneps2bf16x 268435456(%rbp,%r14,8), %xmm2
				// CHECK: encoding: [0xc4,0xa2,0x7a,0x72,0x94,0xf5,0x00,0x00,0x00,0x10]
				{vex} vcvtneps2bf16x 268435456(%rbp,%r14,8), %xmm2

				// CHECK: {vex} vcvtneps2bf16x 291(%r8,%rax,4), %xmm2
				// CHECK: encoding: [0xc4,0xc2,0x7a,0x72,0x94,0x80,0x23,0x01,0x00,0x00]
				{vex} vcvtneps2bf16x 291(%r8,%rax,4), %xmm2

				// CHECK: {vex} vcvtneps2bf16x (%rip), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x15,0x00,0x00,0x00,0x00]
				{vex} vcvtneps2bf16x (%rip), %xmm2

				// CHECK: {vex} vcvtneps2bf16x -512(,%rbp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x14,0x6d,0x00,0xfe,0xff,0xff]
				{vex} vcvtneps2bf16x -512(,%rbp,2), %xmm2

				// CHECK: {vex} vcvtneps2bf16x 2032(%rcx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x91,0xf0,0x07,0x00,0x00]
				{vex} vcvtneps2bf16x 2032(%rcx), %xmm2

				// CHECK: {vex} vcvtneps2bf16x -2048(%rdx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x92,0x00,0xf8,0xff,0xff]
				{vex} vcvtneps2bf16x -2048(%rdx), %xmm2

				// CHECK: {vex} vcvtneps2bf16y -1024(,%rbp,2), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0x14,0x6d,0x00,0xfc,0xff,0xff]
				{vex} vcvtneps2bf16y -1024(,%rbp,2), %xmm2

				// CHECK: {vex} vcvtneps2bf16y 4064(%rcx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0x91,0xe0,0x0f,0x00,0x00]
				{vex} vcvtneps2bf16y 4064(%rcx), %xmm2

				// CHECK: {vex} vcvtneps2bf16y -4096(%rdx), %xmm2
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0x92,0x00,0xf0,0xff,0xff]
				{vex} vcvtneps2bf16y -4096(%rdx), %xmm2

llvm/test/MC/X86/avx_ne_convert-64-intel.s

This file was added.

				// RUN: llvm-mc -triple x86_64-unknown-unknown -x86-asm-syntax=intel -output-asm-variant=1 --show-encoding %s | FileCheck %s

				// CHECK: vbcstnebf162ps xmm2, word ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x7a,0xb1,0x94,0xf5,0x00,0x00,0x00,0x10]
				vbcstnebf162ps xmm2, word ptr [rbp + 8*r14 + 268435456]

				// CHECK: vbcstnebf162ps xmm2, word ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x7a,0xb1,0x94,0x80,0x23,0x01,0x00,0x00]
				vbcstnebf162ps xmm2, word ptr [r8 + 4*rax + 291]

				// CHECK: vbcstnebf162ps xmm2, word ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x15,0x00,0x00,0x00,0x00]
				vbcstnebf162ps xmm2, word ptr [rip]

				// CHECK: vbcstnebf162ps xmm2, word ptr [2*rbp - 64]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnebf162ps xmm2, word ptr [2*rbp - 64]

				// CHECK: vbcstnebf162ps xmm2, word ptr [rcx + 254]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnebf162ps xmm2, word ptr [rcx + 254]

				// CHECK: vbcstnebf162ps xmm2, word ptr [rdx - 256]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnebf162ps xmm2, word ptr [rdx - 256]

				// CHECK: vbcstnebf162ps ymm2, word ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x7e,0xb1,0x94,0xf5,0x00,0x00,0x00,0x10]
				vbcstnebf162ps ymm2, word ptr [rbp + 8*r14 + 268435456]

				// CHECK: vbcstnebf162ps ymm2, word ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x7e,0xb1,0x94,0x80,0x23,0x01,0x00,0x00]
				vbcstnebf162ps ymm2, word ptr [r8 + 4*rax + 291]

				// CHECK: vbcstnebf162ps ymm2, word ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x15,0x00,0x00,0x00,0x00]
				vbcstnebf162ps ymm2, word ptr [rip]

				// CHECK: vbcstnebf162ps ymm2, word ptr [2*rbp - 64]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnebf162ps ymm2, word ptr [2*rbp - 64]

				// CHECK: vbcstnebf162ps ymm2, word ptr [rcx + 254]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnebf162ps ymm2, word ptr [rcx + 254]

				// CHECK: vbcstnebf162ps ymm2, word ptr [rdx - 256]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnebf162ps ymm2, word ptr [rdx - 256]

				// CHECK: vbcstnesh2ps xmm2, word ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x79,0xb1,0x94,0xf5,0x00,0x00,0x00,0x10]
				vbcstnesh2ps xmm2, word ptr [rbp + 8*r14 + 268435456]

				// CHECK: vbcstnesh2ps xmm2, word ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x79,0xb1,0x94,0x80,0x23,0x01,0x00,0x00]
				vbcstnesh2ps xmm2, word ptr [r8 + 4*rax + 291]

				// CHECK: vbcstnesh2ps xmm2, word ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x15,0x00,0x00,0x00,0x00]
				vbcstnesh2ps xmm2, word ptr [rip]

				// CHECK: vbcstnesh2ps xmm2, word ptr [2*rbp - 64]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnesh2ps xmm2, word ptr [2*rbp - 64]

				// CHECK: vbcstnesh2ps xmm2, word ptr [rcx + 254]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnesh2ps xmm2, word ptr [rcx + 254]

				// CHECK: vbcstnesh2ps xmm2, word ptr [rdx - 256]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnesh2ps xmm2, word ptr [rdx - 256]

				// CHECK: vbcstnesh2ps ymm2, word ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x7d,0xb1,0x94,0xf5,0x00,0x00,0x00,0x10]
				vbcstnesh2ps ymm2, word ptr [rbp + 8*r14 + 268435456]

				// CHECK: vbcstnesh2ps ymm2, word ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x7d,0xb1,0x94,0x80,0x23,0x01,0x00,0x00]
				vbcstnesh2ps ymm2, word ptr [r8 + 4*rax + 291]

				// CHECK: vbcstnesh2ps ymm2, word ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x15,0x00,0x00,0x00,0x00]
				vbcstnesh2ps ymm2, word ptr [rip]

				// CHECK: vbcstnesh2ps ymm2, word ptr [2*rbp - 64]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x14,0x6d,0xc0,0xff,0xff,0xff]
				vbcstnesh2ps ymm2, word ptr [2*rbp - 64]

				// CHECK: vbcstnesh2ps ymm2, word ptr [rcx + 254]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x91,0xfe,0x00,0x00,0x00]
				vbcstnesh2ps ymm2, word ptr [rcx + 254]

				// CHECK: vbcstnesh2ps ymm2, word ptr [rdx - 256]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb1,0x92,0x00,0xff,0xff,0xff]
				vbcstnesh2ps ymm2, word ptr [rdx - 256]

				// CHECK: vcvtneebf162ps xmm2, xmmword ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x7a,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneebf162ps xmm2, xmmword ptr [rbp + 8*r14 + 268435456]

				// CHECK: vcvtneebf162ps xmm2, xmmword ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x7a,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneebf162ps xmm2, xmmword ptr [r8 + 4*rax + 291]

				// CHECK: vcvtneebf162ps xmm2, xmmword ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneebf162ps xmm2, xmmword ptr [rip]

				// CHECK: vcvtneebf162ps xmm2, xmmword ptr [2*rbp - 512]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneebf162ps xmm2, xmmword ptr [2*rbp - 512]

				// CHECK: vcvtneebf162ps xmm2, xmmword ptr [rcx + 2032]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneebf162ps xmm2, xmmword ptr [rcx + 2032]

				// CHECK: vcvtneebf162ps xmm2, xmmword ptr [rdx - 2048]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneebf162ps xmm2, xmmword ptr [rdx - 2048]

				// CHECK: vcvtneebf162ps ymm2, ymmword ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x7e,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneebf162ps ymm2, ymmword ptr [rbp + 8*r14 + 268435456]

				// CHECK: vcvtneebf162ps ymm2, ymmword ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x7e,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneebf162ps ymm2, ymmword ptr [r8 + 4*rax + 291]

				// CHECK: vcvtneebf162ps ymm2, ymmword ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneebf162ps ymm2, ymmword ptr [rip]

				// CHECK: vcvtneebf162ps ymm2, ymmword ptr [2*rbp - 1024]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneebf162ps ymm2, ymmword ptr [2*rbp - 1024]

				// CHECK: vcvtneebf162ps ymm2, ymmword ptr [rcx + 4064]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneebf162ps ymm2, ymmword ptr [rcx + 4064]

				// CHECK: vcvtneebf162ps ymm2, ymmword ptr [rdx - 4096]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneebf162ps ymm2, ymmword ptr [rdx - 4096]

				// CHECK: vcvtneeph2ps xmm2, xmmword ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x79,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneeph2ps xmm2, xmmword ptr [rbp + 8*r14 + 268435456]

				// CHECK: vcvtneeph2ps xmm2, xmmword ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x79,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneeph2ps xmm2, xmmword ptr [r8 + 4*rax + 291]

				// CHECK: vcvtneeph2ps xmm2, xmmword ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneeph2ps xmm2, xmmword ptr [rip]

				// CHECK: vcvtneeph2ps xmm2, xmmword ptr [2*rbp - 512]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneeph2ps xmm2, xmmword ptr [2*rbp - 512]

				// CHECK: vcvtneeph2ps xmm2, xmmword ptr [rcx + 2032]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneeph2ps xmm2, xmmword ptr [rcx + 2032]

				// CHECK: vcvtneeph2ps xmm2, xmmword ptr [rdx - 2048]
				// CHECK: encoding: [0xc4,0xe2,0x79,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneeph2ps xmm2, xmmword ptr [rdx - 2048]

				// CHECK: vcvtneeph2ps ymm2, ymmword ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x7d,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneeph2ps ymm2, ymmword ptr [rbp + 8*r14 + 268435456]

				// CHECK: vcvtneeph2ps ymm2, ymmword ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x7d,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneeph2ps ymm2, ymmword ptr [r8 + 4*rax + 291]

				// CHECK: vcvtneeph2ps ymm2, ymmword ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneeph2ps ymm2, ymmword ptr [rip]

				// CHECK: vcvtneeph2ps ymm2, ymmword ptr [2*rbp - 1024]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneeph2ps ymm2, ymmword ptr [2*rbp - 1024]

				// CHECK: vcvtneeph2ps ymm2, ymmword ptr [rcx + 4064]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneeph2ps ymm2, ymmword ptr [rcx + 4064]

				// CHECK: vcvtneeph2ps ymm2, ymmword ptr [rdx - 4096]
				// CHECK: encoding: [0xc4,0xe2,0x7d,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneeph2ps ymm2, ymmword ptr [rdx - 4096]

				// CHECK: vcvtneobf162ps xmm2, xmmword ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x7b,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneobf162ps xmm2, xmmword ptr [rbp + 8*r14 + 268435456]

				// CHECK: vcvtneobf162ps xmm2, xmmword ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x7b,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneobf162ps xmm2, xmmword ptr [r8 + 4*rax + 291]

				// CHECK: vcvtneobf162ps xmm2, xmmword ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneobf162ps xmm2, xmmword ptr [rip]

				// CHECK: vcvtneobf162ps xmm2, xmmword ptr [2*rbp - 512]
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneobf162ps xmm2, xmmword ptr [2*rbp - 512]

				// CHECK: vcvtneobf162ps xmm2, xmmword ptr [rcx + 2032]
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneobf162ps xmm2, xmmword ptr [rcx + 2032]

				// CHECK: vcvtneobf162ps xmm2, xmmword ptr [rdx - 2048]
				// CHECK: encoding: [0xc4,0xe2,0x7b,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneobf162ps xmm2, xmmword ptr [rdx - 2048]

				// CHECK: vcvtneobf162ps ymm2, ymmword ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x7f,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneobf162ps ymm2, ymmword ptr [rbp + 8*r14 + 268435456]

				// CHECK: vcvtneobf162ps ymm2, ymmword ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x7f,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneobf162ps ymm2, ymmword ptr [r8 + 4*rax + 291]

				// CHECK: vcvtneobf162ps ymm2, ymmword ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneobf162ps ymm2, ymmword ptr [rip]

				// CHECK: vcvtneobf162ps ymm2, ymmword ptr [2*rbp - 1024]
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneobf162ps ymm2, ymmword ptr [2*rbp - 1024]

				// CHECK: vcvtneobf162ps ymm2, ymmword ptr [rcx + 4064]
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneobf162ps ymm2, ymmword ptr [rcx + 4064]

				// CHECK: vcvtneobf162ps ymm2, ymmword ptr [rdx - 4096]
				// CHECK: encoding: [0xc4,0xe2,0x7f,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneobf162ps ymm2, ymmword ptr [rdx - 4096]

				// CHECK: vcvtneoph2ps xmm2, xmmword ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x78,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneoph2ps xmm2, xmmword ptr [rbp + 8*r14 + 268435456]

				// CHECK: vcvtneoph2ps xmm2, xmmword ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x78,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneoph2ps xmm2, xmmword ptr [r8 + 4*rax + 291]

				// CHECK: vcvtneoph2ps xmm2, xmmword ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneoph2ps xmm2, xmmword ptr [rip]

				// CHECK: vcvtneoph2ps xmm2, xmmword ptr [2*rbp - 512]
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x14,0x6d,0x00,0xfe,0xff,0xff]
				vcvtneoph2ps xmm2, xmmword ptr [2*rbp - 512]

				// CHECK: vcvtneoph2ps xmm2, xmmword ptr [rcx + 2032]
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x91,0xf0,0x07,0x00,0x00]
				vcvtneoph2ps xmm2, xmmword ptr [rcx + 2032]

				// CHECK: vcvtneoph2ps xmm2, xmmword ptr [rdx - 2048]
				// CHECK: encoding: [0xc4,0xe2,0x78,0xb0,0x92,0x00,0xf8,0xff,0xff]
				vcvtneoph2ps xmm2, xmmword ptr [rdx - 2048]

				// CHECK: vcvtneoph2ps ymm2, ymmword ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x7c,0xb0,0x94,0xf5,0x00,0x00,0x00,0x10]
				vcvtneoph2ps ymm2, ymmword ptr [rbp + 8*r14 + 268435456]

				// CHECK: vcvtneoph2ps ymm2, ymmword ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x7c,0xb0,0x94,0x80,0x23,0x01,0x00,0x00]
				vcvtneoph2ps ymm2, ymmword ptr [r8 + 4*rax + 291]

				// CHECK: vcvtneoph2ps ymm2, ymmword ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x15,0x00,0x00,0x00,0x00]
				vcvtneoph2ps ymm2, ymmword ptr [rip]

				// CHECK: vcvtneoph2ps ymm2, ymmword ptr [2*rbp - 1024]
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x14,0x6d,0x00,0xfc,0xff,0xff]
				vcvtneoph2ps ymm2, ymmword ptr [2*rbp - 1024]

				// CHECK: vcvtneoph2ps ymm2, ymmword ptr [rcx + 4064]
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x91,0xe0,0x0f,0x00,0x00]
				vcvtneoph2ps ymm2, ymmword ptr [rcx + 4064]

				// CHECK: vcvtneoph2ps ymm2, ymmword ptr [rdx - 4096]
				// CHECK: encoding: [0xc4,0xe2,0x7c,0xb0,0x92,0x00,0xf0,0xff,0xff]
				vcvtneoph2ps ymm2, ymmword ptr [rdx - 4096]

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmm3
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0xd3]
				{vex} vcvtneps2bf16 xmm2, xmm3

				// CHECK: {vex} vcvtneps2bf16 xmm2, ymm3
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0xd3]
				{vex} vcvtneps2bf16 xmm2, ymm3

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmmword ptr [rbp + 8*r14 + 268435456]
				// CHECK: encoding: [0xc4,0xa2,0x7a,0x72,0x94,0xf5,0x00,0x00,0x00,0x10]
				{vex} vcvtneps2bf16 xmm2, xmmword ptr [rbp + 8*r14 + 268435456]

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmmword ptr [r8 + 4*rax + 291]
				// CHECK: encoding: [0xc4,0xc2,0x7a,0x72,0x94,0x80,0x23,0x01,0x00,0x00]
				{vex} vcvtneps2bf16 xmm2, xmmword ptr [r8 + 4*rax + 291]

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmmword ptr [rip]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x15,0x00,0x00,0x00,0x00]
				{vex} vcvtneps2bf16 xmm2, xmmword ptr [rip]

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmmword ptr [2*rbp - 512]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x14,0x6d,0x00,0xfe,0xff,0xff]
				{vex} vcvtneps2bf16 xmm2, xmmword ptr [2*rbp - 512]

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmmword ptr [rcx + 2032]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x91,0xf0,0x07,0x00,0x00]
				{vex} vcvtneps2bf16 xmm2, xmmword ptr [rcx + 2032]

				// CHECK: {vex} vcvtneps2bf16 xmm2, xmmword ptr [rdx - 2048]
				// CHECK: encoding: [0xc4,0xe2,0x7a,0x72,0x92,0x00,0xf8,0xff,0xff]
				{vex} vcvtneps2bf16 xmm2, xmmword ptr [rdx - 2048]

				// CHECK: {vex} vcvtneps2bf16 xmm2, ymmword ptr [2*rbp - 1024]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0x14,0x6d,0x00,0xfc,0xff,0xff]
				{vex} vcvtneps2bf16 xmm2, ymmword ptr [2*rbp - 1024]

				// CHECK: {vex} vcvtneps2bf16 xmm2, ymmword ptr [rcx + 4064]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0x91,0xe0,0x0f,0x00,0x00]
				{vex} vcvtneps2bf16 xmm2, ymmword ptr [rcx + 4064]

				// CHECK: {vex} vcvtneps2bf16 xmm2, ymmword ptr [rdx - 4096]
				// CHECK: encoding: [0xc4,0xe2,0x7e,0x72,0x92,0x00,0xf0,0xff,0xff]
				{vex} vcvtneps2bf16 xmm2, ymmword ptr [rdx - 4096]

Revision Contents

Diff 472030

clang/docs/ReleaseNotes.rst

clang/include/clang/Basic/BuiltinsX86.def

clang/include/clang/Driver/Options.td

clang/lib/Basic/Targets/X86.h

clang/lib/Basic/Targets/X86.cpp

clang/lib/Headers/CMakeLists.txt

clang/lib/Headers/avx512vlbf16intrin.h

clang/lib/Headers/avxneconvertintrin.h

clang/lib/Headers/cpuid.h

clang/lib/Headers/immintrin.h

clang/test/CodeGen/X86/avx512vlbf16-builtins.c

clang/test/CodeGen/X86/avxneconvert-builtins.c

clang/test/CodeGen/attr-target-x86.c

clang/test/Driver/x86-target-features.c

clang/test/Preprocessor/x86_target_features.c

llvm/docs/ReleaseNotes.rst

llvm/include/llvm/IR/IntrinsicsX86.td

llvm/include/llvm/Support/X86TargetParser.def

llvm/lib/Support/Host.cpp

llvm/lib/Support/X86TargetParser.cpp

llvm/lib/Target/X86/X86.td

llvm/lib/Target/X86/X86ISelLowering.cpp

llvm/lib/Target/X86/X86InstrAVX512.td

llvm/lib/Target/X86/X86InstrInfo.td

llvm/lib/Target/X86/X86InstrSSE.td

llvm/test/CodeGen/X86/avxneconvert-intrinsics-shared.ll

llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll

llvm/test/MC/Disassembler/X86/avx_ne_convert-32.txt

llvm/test/MC/Disassembler/X86/avx_ne_convert-64.txt

llvm/test/MC/X86/avx_ne_convert-32-att.s

llvm/test/MC/X86/avx_ne_convert-32-intel.s

llvm/test/MC/X86/avx_ne_convert-64-att.s

llvm/test/MC/X86/avx_ne_convert-64-intel.s
