This is an archive of the discontinued LLVM Phabricator instance.

[x86][icelake]GFNI
ClosedPublic

Authored by coby on Nov 22 2017, 11:38 AM.

Details

Summary

galios field arithmetic (GF(2^8)) insns:
gf2p8affineinvqb
gf2p8affineqb
gf2p8mulb

Diff Detail

Repository
rL LLVM

Event Timeline

coby created this revision.Nov 22 2017, 11:38 AM
coby edited the summary of this revision. (Show Details)Nov 22 2017, 11:42 AM
craig.topper added inline comments.Nov 22 2017, 11:50 AM
lib/Support/Host.cpp
1479 ↗(On Diff #123993)

This needs to be rebased. The current code is here is ordered by bit position and nicely formatted. Please insert this bit in the correct position in the code

lib/Target/X86/X86ISelLowering.cpp
19568 ↗(On Diff #123993)

I have trouble believing you need a 2OP_MASKZ. Can't the caller pass the zero vector to the mask version like we do for everything else? The reason we have OP3_MASKZ is because the OP3_MASK uses the passthru input in both the select op and the operation. So we needed a separate intrinsic to define the zero semantics of the select independent of the operation.

19652 ↗(On Diff #123993)

Again I think you can pass the zero vector to the masked intrinsic when its called.

craig.topper added inline comments.Nov 22 2017, 11:54 AM
lib/Target/X86/X86InstrAVX512.td
9813 ↗(On Diff #123993)

This requires BWI in order to make v64i8 a legal type.

craig.topper added inline comments.Nov 22 2017, 12:05 PM
lib/Target/X86/X86ISelLowering.cpp
19568 ↗(On Diff #123993)

sorry that should have said FMA_MASKZ not OP3_MASKZ.

coby added inline comments.Nov 22 2017, 12:11 PM
lib/Target/X86/X86ISelLowering.cpp
19568 ↗(On Diff #123993)

interesting
is this logic applies to anywhere else a MASKZ variant is being used?
nevertheless you are correct, i'll remove the (self) added Z variants

coby added inline comments.Nov 22 2017, 12:14 PM
lib/Target/X86/X86InstrAVX512.td
9813 ↗(On Diff #123993)

just noted i've nonchalantly added to the test but omitted it here
thanks

craig.topper added inline comments.Nov 22 2017, 12:18 PM
include/llvm/IR/IntrinsicsX86.td
1356 ↗(On Diff #123993)

Please sync the GCCBuiltin names with gcc.

craig.topper added inline comments.Nov 22 2017, 12:21 PM
lib/Target/X86/X86InstrAVX512.td
9816 ↗(On Diff #123993)

Go ahead and add BWI here too. It may not be strictly necessary, but a lot of the masking support for bytes and words is dependent on it. It's also consistent with gcc.

coby added inline comments.Nov 22 2017, 12:26 PM
include/llvm/IR/IntrinsicsX86.td
1356 ↗(On Diff #123993)

basically does weren't yet introduced to gcc, so the closest we can get is Julia Kovel's GFNI respective ongoing patch(s), i.e. https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01043/0002-GF2P8AFFINEINVQB-instruction.patch

coby added inline comments.Nov 22 2017, 12:29 PM
include/llvm/IR/IntrinsicsX86.td
1356 ↗(On Diff #123993)

does ~~> those

craig.topper edited edge metadata.Nov 22 2017, 12:29 PM

k

include/llvm/IR/IntrinsicsX86.td
1356 ↗(On Diff #123993)

They look to have been committed to gcc on Nov. 16. svn revision 254795

coby added inline comments.Nov 22 2017, 12:41 PM
include/llvm/IR/IntrinsicsX86.td
1356 ↗(On Diff #123993)

Oh, I see. kyukhin.
i'll match against it, thanks

coby updated this revision to Diff 124047.EditedNov 23 2017, 1:55 AM

adopted changes proposed by Craig:
maskz variants are out in favor of explicit passing of the additive identity
BWI dependency is explicitly stated
GCC builtins counterparts are matched
+ nice formatting of feature recognition

craig.topper added inline comments.Nov 24 2017, 11:02 AM
include/llvm/IR/IntrinsicsX86.td
1347 ↗(On Diff #124047)

Can we use builtin_ia32_selectb_512 and builtin_ia32_vgf2p8affineinvqb_v16qi to implement these in the clang header instead of adding a separate masked intrinsic?

Its different than gcc, but consistent with other intrinsics in clang/llvm.

You should add __builtin_ia32_vgf2p8affineinvqb_v64qi without masking so that it can be consistent.

1407 ↗(On Diff #124047)

I'd prefer to see this named int_x86_gf2p8affineqb_128 and the 256-bit one named int_x86_gf2p8affineqb_256.

1423 ↗(On Diff #124047)

Name this _256 and the one above _128.

lib/Target/X86/X86.td
173 ↗(On Diff #124047)

Isn't it Galois not Galios

lib/Target/X86/X86ISelLowering.h
582 ↗(On Diff #124047)

Galois

lib/Target/X86/X86InstrFragmentsSIMD.td
659 ↗(On Diff #124047)

galois

lib/Target/X86/X86InstrSSE.td
8523 ↗(On Diff #124047)

NoVLX should be NoVLX_Or_NoBWI

coby added inline comments.Nov 25 2017, 1:54 PM
include/llvm/IR/IntrinsicsX86.td
1347 ↗(On Diff #124047)

certainly
so you propose introducing only non-masked variants on the llvm-side for gfni as a whole?

coby added inline comments.Nov 25 2017, 1:57 PM
lib/Target/X86/X86.td
173 ↗(On Diff #124047)

Evariste *Galois*, indeed.
thx

craig.topper added inline comments.Nov 25 2017, 2:33 PM
include/llvm/IR/IntrinsicsX86.td
1347 ↗(On Diff #124047)

Yep that's what I'm proposing.

coby updated this revision to Diff 124270.Nov 25 2017, 5:42 PM

addressed Craig's comments

craig.topper accepted this revision.Nov 26 2017, 12:14 AM

LGTM, but can you also add tests with select instructions for testing that merge masking and zero masking behavior either as part of this or as a follow up.

This revision is now accepted and ready to land.Nov 26 2017, 12:14 AM
This revision was automatically updated to reflect the committed changes.