This is an archive of the discontinued LLVM Phabricator instance.

[X86] Support Intel avxvnni
ClosedPublic

Authored by LiuChen3 on Oct 9 2020, 2:04 AM.

Details

Summary

This patch mainly made the following changes:

  1. Support AVX-VNNI instructions;
  2. Support more complex FeatureList. By default, ',' is prior than '|'. And parentheses is supported;
  3. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix.

Diff Detail

Event Timeline

LiuChen3 created this revision.Oct 9 2020, 2:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 9 2020, 2:04 AM
LiuChen3 requested review of this revision.Oct 9 2020, 2:04 AM

Please can you add an entry to the 12.0 release notes?

clang/include/clang/Basic/BuiltinsNVPTX.def
46 ↗(On Diff #297148)

Pull this out? This and the CodeGenFunction.* changes seem to be worth a separate patch

clang/lib/Headers/avxvnniintrin.h
32

Is having a commonvnniintrin.h header the approach gcc/icc are taking?

40

Please can you add doxygen descriptions to each intrinsic.

LiuChen3 updated this revision to Diff 297392.Oct 10 2020, 12:56 AM

Address comments.

LiuChen3 marked an inline comment as done.Oct 10 2020, 1:17 AM
clang/include/clang/Basic/BuiltinsNVPTX.def
46 ↗(On Diff #297148)

Thanks for your review.
I will separate this part. I want to keep this part here for now until another patch is merged in.

clang/lib/Headers/avxvnniintrin.h
32

GCC doesn't have commonvnniintrin.h headr, they only have avxvnniintrin.h. For ICC, I will check with this and give you answer later.

clang/lib/Headers/avxvnniintrin.h
32

GCC and ICC doesn't break intrinsics into multiple headers.

RKSimon added inline comments.Oct 22 2020, 2:14 AM
clang/lib/Headers/avxvnniintrin.h
32

Is there anyway that we can avoid this as well then? I get worried when header layouts start to diverge...

We also need to add avxvnni versions of the patterns from this commit along with appropriate testing. Maybe just split the 128/256 bit out of avx512vnni.ll and into a separate test with multiple RUN lines. That should all be done in a separate patch.

{code}
commit cc4b0596b1b0e58672e4151396c7b804eccaf273
Author: Craig Topper <craig.topper@intel.com>
Date: Sat Aug 24 23:14:57 2019 +0000

[X86] Add isel patterns to match vpdpwssd avx512vnni instruction from add+pmaddwd nodes.

{code}

pengfei added inline comments.Oct 22 2020, 6:06 PM
clang/lib/Headers/avxvnniintrin.h
32

Hi Simon, why we need to keep the same layout with GCC and ICC?
We always create new header files for new instructions. We don't allow user include these headers directly. So I think it doesn't matter if GCC and ICC create the same headers or not.

craig.topper added inline comments.Oct 22 2020, 6:23 PM
clang/lib/Headers/avxvnniintrin.h
32

As far as I know our header layout is largely the same as gcc for existing intrinsics. Can we just keep them the same?

clang/lib/Headers/commonvnniintrin.h
48 ↗(On Diff #298920)

Arguments to macros shouldn't start with __

49 ↗(On Diff #298920)

Because this is a macro this needs to have parentheses around S, A, and B.

pengfei added inline comments.Oct 22 2020, 10:35 PM
clang/lib/Headers/avxvnniintrin.h
33

Maybe we can move commonvnniintrin.h to immintrin.h. Then we don't have a different layout.

#if !(defined(_MSC_VER) || defined(__SCE__)) || __has_feature(modules) ||      \
     defined(__AVX512VNNI__) || (defined(__AVX512VL__) && defined(__AVX512VNNI__))
#include <commonvnniintrin.h>
#endif
craig.topper added inline comments.Oct 22 2020, 10:39 PM
clang/lib/Headers/avxvnniintrin.h
33

gcc appears to have just put the common intrinsics in avxvnniintrin.h. Why can't we do that?

pengfei added inline comments.Oct 22 2020, 10:52 PM
clang/lib/Headers/avxvnniintrin.h
33

I asked GCC guys. They don't use macro like defined(__AVXVNNI__). If we put it in avxvnniintrin.h, we need handle macro AVX512VNNI in avxvnniintrin.h. It's bit confused.

@LiuChen3 I meant __AVXVNNI__ for the first __AVX512VNNI__ in last comment.

pengfei added inline comments.Oct 22 2020, 11:36 PM
clang/lib/Headers/avxvnniintrin.h
33

Or maybe we can put the common intrinsics in avx512vlvnniintrin.h and not to handle macro __AVXVNNI__ int it.
I think it's reasonable. For Linux, we always include all headers. For Windows, we can only allow user use the _avx_ intrinscis for AVXVNNI, which is compatible with MSVC.

craig.topper added inline comments.Oct 22 2020, 11:40 PM
clang/lib/Headers/avxvnniintrin.h
33

Minor clarification. It's not Linux vs Windows. Its -ms-compatibility that controls it. You can use clang on Windows without enabling MSVC compatibility. In which case MSC_VER won't be defined.

pengfei added inline comments.Oct 22 2020, 11:42 PM
clang/lib/Headers/avxvnniintrin.h
33

Right. Thank @craig.topper

RKSimon added inline comments.Oct 23 2020, 1:19 AM
clang/test/CodeGen/avxvnni-builtins.c
1 ↗(On Diff #298920)

Please can you move this test file under the X86 subdirectory?

LiuChen3 updated this revision to Diff 300564.Oct 25 2020, 6:35 PM
  1. move the commonvnniintrin.h to the avx512vlvnniintrin.h.
  2. move the testcase avxvnni-builtins.c to X86 subdirectory.
craig.topper added inline comments.
llvm/lib/Support/X86TargetParser.cpp
211

Add here too.

llvm/lib/Target/X86/X86.td
775

Please add this to Alderlake which was added to X86.td by @bkramer yesterday

Also need clang/test/Preprocessor/predefined-arch-macros.c updates for sapphirerapids and alderlake to check that AVXVNNI is defined.

LiuChen3 updated this revision to Diff 300569.Oct 25 2020, 7:58 PM

Adding avxvnni to Alder Lake.

pengfei added inline comments.Oct 25 2020, 8:04 PM
clang/test/Preprocessor/predefined-arch-macros.c
1637

You need to add to sapphirerapids too.

1708

You need to add to sapphirerapids too.

llvm/docs/ReleaseNotes.rst
124

You also need to add it to Clang release notes.

LiuChen3 updated this revision to Diff 300570.Oct 25 2020, 8:24 PM

Address comments.

pengfei added inline comments.Oct 26 2020, 12:17 AM
clang/lib/Headers/avxvnniintrin.h
31

I think we may still need add some information for intrinsics that can be used for AVXVNNI but not declared here. A proposal might be:

/* Below intrinsics defined in avx512vlvnniintrin.h can be used for */
/* AVXVNNI except used for compatibility with msvc.                 */
// #define _mm256_dpbusd_epi32(S, A, B)
// #define _mm256_dpbusds_epi32(S, A, B)
// #define _mm256_dpwssd_epi32(S, A, B)
...
pengfei added inline comments.Oct 29 2020, 7:38 PM
clang/lib/Headers/avxvnniintrin.h
31

I had a look at the Doxygen grammar. Maybe we can use like below:

/* Below intrinsics defined in avx512vlvnniintrin.h can be used for */
/* AVXVNNI except used for compatibility with msvc.                 */
/// \fn __m256i _mm256_dpbusd_epi32(__m256i __S, __m256i __A, __m256i __B)
/// \fn __m256i _mm256_dpbusds_epi32(__m256i __S, __m256i __A, __m256i __B)
/// \fn __m256i _mm256_dpwssd_epi32(__m256i __S, __m256i __A, __m256i __B)
...
LiuChen3 updated this revision to Diff 301814.Oct 29 2020, 10:02 PM
pengfei added inline comments.Oct 29 2020, 10:21 PM
clang/lib/Headers/avxvnniintrin.h
31

Missing one line comment?

clang/lib/Headers/immintrin.h
144

Change back the spaces.

llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
3851

Missing check for VEX2?

LiuChen3 updated this revision to Diff 301816.Oct 29 2020, 10:50 PM

Address comments

pengfei accepted this revision.Oct 29 2020, 11:06 PM

LGTM with little nit:

llvm/test/CodeGen/X86/avx-vnni/avx_vnni-intrinsics.ll
1

Do we need new directory for the tests?

3

I don't see difference between AVX-X86 and AVX-X64, maybe you can use AVX for both, or AVX,AVX-X86 and AVX,AVX-X64 if they do have some differences. And remove CHECK since it never been used.

This revision is now accepted and ready to land.Oct 29 2020, 11:06 PM
LiuChen3 updated this revision to Diff 301825.Oct 30 2020, 12:39 AM
  1. Move the testcase from avx-vnni/ to test/CodeGen/X86/.
  2. Refine the Run line in avx_vnni-intrinsics.ll
pengfei accepted this revision.Oct 30 2020, 12:47 AM

LGTM. Thanks. Better to wait one day or two to see if others object.

LGTM. Thanks. Better to wait one day or two to see if others object.

Sure. Thanks for your review.

RKSimon accepted this revision.Oct 30 2020, 2:39 AM

LGTM cheers

craig.topper accepted this revision.Oct 30 2020, 11:33 AM

LGTM to me with a couple minors that can be fixed on commit.

clang/test/Preprocessor/predefined-arch-macros.c
1828

doesn't this NOT need to be above AVXNNI to be effective since that's where AVX512 would go alphabetically?

llvm/lib/Target/X86/MCTargetDesc/X86InstPrinterCommon.cpp
351

Put parentheses around the & parts. Similar to lock and notrack above. I don't want to think about operator precedence.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptOct 30 2020, 10:18 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
LiuChen3 marked 2 inline comments as done.Oct 30 2020, 10:20 PM

Thanks for all of your review!

RKSimon added inline comments.Jan 24 2022, 2:51 PM
clang/lib/Headers/cpuid.h
199
craig.topper added inline comments.Jan 24 2022, 3:28 PM
clang/lib/Headers/cpuid.h
199

Maybe we should start writing these as 1 << 4. This is not the first shifted by one issue we've had in this file.

LiuChen3 added inline comments.Jan 24 2022, 10:16 PM
clang/lib/Headers/cpuid.h
199

It my fault. I have a fix here: https://reviews.llvm.org/D118103 .