For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Repository: rG LLVM Github Monorepo
clang/lib/Basic/Targets/X86.cpp | ||
---|---|---|
795 | Do we need it here? |
clang/test/CodeGen/X86/avxneconvert-builtins.c | ||
---|---|---|
3 | 32-bit test coverage? |
llvm/test/MC/X86/avx-ne-convert-att.s | ||
---|---|---|
1 ↗ | (On Diff #468158) | merge the att + intel test files and use --check-prefixes to test both |
Merge the att/intel test coverage files and rename the 32/64-bit files so that they sit close together in the file lists.
clang/lib/Headers/immintrin.h | ||
---|---|---|
262 | I have moved the FP16/BF16 vector types out of the original header files. rGe0fb01e9 |
Possibly rename the x86-64-* test files to *-64 (and *-32 equivalent) so that the 32/64 bit files are closer together for tracking (and to help avoid bitrot).
clang/lib/Headers/immintrin.h | ||
---|---|---|
262 | Update to this? #if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) || \ (defined(__AVXNECONVERT__) && defined(__AVX512FP16__)) | |
llvm/test/MC/X86/x86-64-avx-ne-convert-att.s | ||
1 ↗ | (On Diff #468158) | x86-64-avx-ne-convert-intel.s ? |
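A minimal sketch of how the guard proposed for immintrin.h line 262 evaluates, assuming the condition exactly as quoted in that comment. The names `HEADER_VISIBLE`, `avxneconvert_header_visible` and `check_header_visible` are hypothetical and exist only for this illustration; the real header would place an `#include` inside the guard instead.

```c
#include <assert.h>

/* __has_feature is a Clang extension; define a fallback for other compilers. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Same condition as quoted in the review comment. */
#if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) ||      \
    (defined(__AVXNECONVERT__) && defined(__AVX512FP16__))
#define HEADER_VISIBLE 1 /* hypothetical name, for illustration only */
#else
#define HEADER_VISIBLE 0
#endif

/* Returns 1 if the guarded header contents would be compiled in. */
int avxneconvert_header_visible(void) { return HEADER_VISIBLE; }

/* Hypothetical self-check: aborts if the guard hides the header. */
void check_header_visible(void) { assert(avxneconvert_header_visible()); }
```

On toolchains that define neither `_MSC_VER` nor `__SCE__`, the first clause is already true, so the feature macros only matter for MSVC-compatible and SCE builds.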
llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll | ||
---|---|---|
5 | Need to add +avx512bf16,+avx512vl tests for the shared builtin intrinsic. I just found that it crashes because the new patterns for avx512bf16 are missing. I'll update ASAP. |
clang/lib/Headers/avx512vlbf16intrin.h | ||
---|---|---|
164 | Is there no way for the attribute to allow different attribute permutations? Also, can we keep the __builtin_ia32_cvtneps2bf16_128 naming convention? |
clang/lib/Headers/avx512vlbf16intrin.h | ||
---|---|---|
164 |
We have discussed this problem with GCC folks. There are two problems here:
|
clang/include/clang/Driver/Options.td | ||
---|---|---|
4599–4600 | Need to move it before mavxvnniint8. | |
clang/lib/Basic/Targets/X86.cpp | ||
1034 | Move it ahead. | |
clang/lib/Headers/avx512vlbf16intrin.h | ||
164 | It's better to use __builtin_ia32_cvtneps2bf16_128. | |
clang/lib/Headers/avxneconvertintrin.h | ||
107 | VBCSTNESH2PS | |
140 | VBCSTNESH2PS | |
208 | 16 | |
274 | 16 | |
340 | 16 | |
406 | 16 | |
clang/test/Preprocessor/x86_target_features.c | ||
593–599 | Should we check __AVX2__ like we did for AVXVNNI? | |
llvm/lib/Support/Host.cpp | ||
1819 | Move it ahead and remove the blank line. | |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
2181–2198 | How about merging it here? | |
llvm/lib/Target/X86/X86InstrSSE.td | ||
8260–8261 | This can be f16mem now. | |
8264–8265 | f128mem, f256mem | |
8268–8269 | ditto. | |
llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll | ||
129–140 ↗ | (On Diff #471710) | You don't need to add them here; additional RUN lines in the file below should be enough, e.g.:
; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=x86_64-unknown-unknown --show-mc-encoding -mattr=+avx512bf16,+avx512vl | FileCheck %s --check-prefix=AVX512BF16
; RUN: llc < %s -O0 -verify-machineinstrs -mtriple=i686-unknown-unknown --show-mc-encoding -mattr=+avx512bf16,+avx512vl | FileCheck %s --check-prefix=AVX512BF16 |
llvm/test/CodeGen/X86/avxneconvert-intrinsics.ll | ||
3 | --check-prefixes=CHECK,X64 | |
4 | --check-prefixes=CHECK,X86 |
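On the `__AVX2__` question raised above for clang/test/Preprocessor/x86_target_features.c: the invariant being asked about can be sketched in plain C as below. This is only an illustration, not the lit test from the patch, and `avxneconvert_implies_avx2` is a hypothetical helper name.

```c
#include <assert.h>

/* Returns 1 when the feature macros are consistent, i.e. when
 * __AVXNECONVERT__ is never defined without __AVX2__.
 * Hypothetical helper, for illustration only. */
int avxneconvert_implies_avx2(void) {
#if defined(__AVXNECONVERT__) && !defined(__AVX2__)
  return 0; /* feature enabled without its assumed AVX2 prerequisite */
#else
  return 1; /* either the feature is off, or AVX2 is on as well */
#endif
}
```

When the translation unit is built without the feature enabled, the helper trivially returns 1; the check only becomes meaningful when `__AVXNECONVERT__` is defined.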
clang/lib/Headers/avx512vlbf16intrin.h | ||
---|---|---|
164 | I think __builtin_ia32_vcvtneps2bf16128 is also a "right" name. See __builtin_ia32_vfmaddsubph256, __builtin_ia32_minph256... And I admit the naming conventions of clang builtins as well as LLVM IR builtins are confusing right now. |
clang/lib/Headers/avx512vlbf16intrin.h | ||
---|---|---|
164 | The problem here is that 16128 is a bit confusing; a _ breaks it into two numbers. |
clang/lib/Headers/avx512vlbf16intrin.h | ||
---|---|---|
164 | I tried, but found that __builtin_ia32_cvtneps2bf16_256 already exists for avx512bf16 and is currently used for mask intrinsic lowering. How about not changing it this time? We can do a refinement patch later for the avx512bf16 builtins, since they also have some redundant FE/codegen logic for the 256/512 mask intrinsics. |
These should be shared with AVX512-BF16.