This is an archive of the discontinued LLVM Phabricator instance.

[x86][icelake]GFNI
ClosedPublic

Authored by coby on Nov 22 2017, 11:38 AM.

Details

Summary

galios field arithmetic (GF(2^8)) insns:
gf2p8affineinvqb
gf2p8affineqb
gf2p8mulb

Diff Detail

Repository
rL LLVM

Event Timeline

coby created this revision.Nov 22 2017, 11:38 AM
coby edited the summary of this revision. (Show Details)Nov 22 2017, 11:42 AM
craig.topper added inline comments.Nov 22 2017, 11:50 AM
lib/Support/Host.cpp
1227

This needs to be rebased. The current code is here is ordered by bit position and nicely formatted. Please insert this bit in the correct position in the code

lib/Target/X86/X86ISelLowering.cpp
19814

I have trouble believing you need a 2OP_MASKZ. Can't the caller pass the zero vector to the mask version like we do for everything else? The reason we have OP3_MASKZ is because the OP3_MASK uses the passthru input in both the select op and the operation. So we needed a separate intrinsic to define the zero semantics of the select independent of the operation.

19901

Again I think you can pass the zero vector to the masked intrinsic when its called.

craig.topper added inline comments.Nov 22 2017, 11:54 AM
lib/Target/X86/X86InstrAVX512.td
10028

This requires BWI in order to make v64i8 a legal type.

craig.topper added inline comments.Nov 22 2017, 12:05 PM
lib/Target/X86/X86ISelLowering.cpp
19814

sorry that should have said FMA_MASKZ not OP3_MASKZ.

coby added inline comments.Nov 22 2017, 12:11 PM
lib/Target/X86/X86ISelLowering.cpp
19814

interesting
is this logic applies to anywhere else a MASKZ variant is being used?
nevertheless you are correct, i'll remove the (self) added Z variants

coby added inline comments.Nov 22 2017, 12:14 PM
lib/Target/X86/X86InstrAVX512.td
10028

just noted i've nonchalantly added to the test but omitted it here
thanks

craig.topper added inline comments.Nov 22 2017, 12:18 PM
include/llvm/IR/IntrinsicsX86.td
1348

Please sync the GCCBuiltin names with gcc.

craig.topper added inline comments.Nov 22 2017, 12:21 PM
lib/Target/X86/X86InstrAVX512.td
10031

Go ahead and add BWI here too. It may not be strictly necessary, but a lot of the masking support for bytes and words is dependent on it. It's also consistent with gcc.

coby added inline comments.Nov 22 2017, 12:26 PM
include/llvm/IR/IntrinsicsX86.td
1348

basically does weren't yet introduced to gcc, so the closest we can get is Julia Kovel's GFNI respective ongoing patch(s), i.e. https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01043/0002-GF2P8AFFINEINVQB-instruction.patch

coby added inline comments.Nov 22 2017, 12:29 PM
include/llvm/IR/IntrinsicsX86.td
1348

does ~~> those

craig.topper edited edge metadata.Nov 22 2017, 12:29 PM

k

include/llvm/IR/IntrinsicsX86.td
1348

They look to have been committed to gcc on Nov. 16. svn revision 254795

coby added inline comments.Nov 22 2017, 12:41 PM
include/llvm/IR/IntrinsicsX86.td
1348

Oh, I see. kyukhin.
i'll match against it, thanks

coby updated this revision to Diff 124047.EditedNov 23 2017, 1:55 AM

adopted changes proposed by Craig:
maskz variants are out in favor of explicit passing of the additive identity
BWI dependency is explicitly stated
GCC builtins counterparts are matched
+ nice formatting of feature recognition

craig.topper added inline comments.Nov 24 2017, 11:02 AM
include/llvm/IR/IntrinsicsX86.td
1347

Can we use builtin_ia32_selectb_512 and builtin_ia32_vgf2p8affineinvqb_v16qi to implement these in the clang header instead of adding a separate masked intrinsic?

Its different than gcc, but consistent with other intrinsics in clang/llvm.

You should add __builtin_ia32_vgf2p8affineinvqb_v64qi without masking so that it can be consistent.

1407

I'd prefer to see this named int_x86_gf2p8affineqb_128 and the 256-bit one named int_x86_gf2p8affineqb_256.

1423

Name this _256 and the one above _128.

lib/Target/X86/X86.td
173

Isn't it Galois not Galios

lib/Target/X86/X86ISelLowering.h
590

Galois

lib/Target/X86/X86InstrFragmentsSIMD.td
675

galois

lib/Target/X86/X86InstrSSE.td
8521

NoVLX should be NoVLX_Or_NoBWI

coby added inline comments.Nov 25 2017, 1:54 PM
include/llvm/IR/IntrinsicsX86.td
1347

certainly
so you propose introducing only non-masked variants on the llvm-side for gfni as a whole?

coby added inline comments.Nov 25 2017, 1:57 PM
lib/Target/X86/X86.td
173

Evariste *Galois*, indeed.
thx

craig.topper added inline comments.Nov 25 2017, 2:33 PM
include/llvm/IR/IntrinsicsX86.td
1347

Yep that's what I'm proposing.

coby updated this revision to Diff 124270.Nov 25 2017, 5:42 PM

addressed Craig's comments

craig.topper accepted this revision.Nov 26 2017, 12:14 AM

LGTM, but can you also add tests with select instructions for testing that merge masking and zero masking behavior either as part of this or as a follow up.

This revision is now accepted and ready to land.Nov 26 2017, 12:14 AM
This revision was automatically updated to reflect the committed changes.