This is an archive of the discontinued LLVM Phabricator instance.

[AVX512] Emit generic masked store intrinsics directly from clang instead of using x86 specific intrinsics.
Closed, Public

Authored by craig.topper on May 29 2016, 2:48 PM.

Details

Summary

This will allow us to remove the x86-specific masked store intrinsics from the backend.

Event Timeline

craig.topper retitled this revision to [AVX512] Emit generic masked store intrinsics directly from clang instead of using x86 specific intrinsics.
craig.topper updated this object.
craig.topper added a subscriber: cfe-commits.
delena added inline comments. May 29 2016, 10:39 PM
lib/CodeGen/CGBuiltin.cpp
6304

What code do you get at the end? The architecture has no shuffle instruction for mask vectors.

test/CodeGen/avx512f-builtins.c
123

I suggest removing %5 and %6 from the test; you can use something like this:
CHECK: store <16 x i32> {{.*}}, align 1

craig.topper added inline comments. May 29 2016, 10:45 PM
lib/CodeGen/CGBuiltin.cpp
6304

That's not really a shuffle. It's an extract subvector, but the IR doesn't have a real instruction for that.

It's needed so we can go from i8 -> v8i1 -> v2i1/v4i1.
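
To illustrate the i8 -> v8i1 -> v2i1/v4i1 step, here is a minimal scalar sketch of the semantics, not the actual CGBuiltin.cpp code. The names expandMask and maskedStore are hypothetical, and the sketch assumes bit i of the mask controls lane i:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of "bitcast i8 -> <8 x i1>, then keep the low N lanes".
// N is the number of vector elements (2, 4, or 8 in this discussion).
template <std::size_t N>
std::array<bool, N> expandMask(std::uint8_t mask) {
  static_assert(N <= 8, "an i8 mask carries at most 8 lanes");
  std::array<bool, N> lanes{};
  for (std::size_t i = 0; i < N; ++i)
    lanes[i] = ((mask >> i) & 1u) != 0; // assumption: bit i controls lane i
  return lanes;
}

// Scalar model of a generic masked store: only enabled lanes are written,
// disabled lanes leave the destination memory untouched.
template <typename T, std::size_t N>
void maskedStore(T *dst, const std::array<T, N> &src, std::uint8_t mask) {
  assert(dst != nullptr && "masked store needs a destination");
  const std::array<bool, N> lanes = expandMask<N>(mask);
  for (std::size_t i = 0; i < N; ++i)
    if (lanes[i])
      dst[i] = src[i];
}
```

For a 4-element vector, only the low 4 bits of the mask matter; the upper bits are dropped by the subvector extract, which is why no real shuffle is expected in the generated code.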

test/CodeGen/avx512f-builtins.c
123

I'll clean that up. I fixed most of them, but it looks like I missed a few.

delena added inline comments. May 29 2016, 11:18 PM
lib/CodeGen/CGBuiltin.cpp
6304

I understand. I just wanted to be sure that you get only one "kmov %edi, %k1" at the end.

craig.topper added inline comments. May 29 2016, 11:33 PM
lib/CodeGen/CGBuiltin.cpp
6304

Yes, only one "kmov %edi, %k1" was generated.

delena accepted this revision. May 29 2016, 11:36 PM
delena edited edge metadata.

LGTM after the tests are cleaned up.

This revision is now accepted and ready to land. May 29 2016, 11:36 PM
craig.topper closed this revision. May 31 2016, 12:40 AM

Committed in r271246.