This is an archive of the discontinued LLVM Phabricator instance.

[LV][X86] Support of AVX2 Gathers code generation and update the LV with this
ClosedPublic

Authored by magabari on Jul 23 2017, 3:55 AM.

Details

Summary

This patch depends on: https://reviews.llvm.org/D35348

Support of pattern selection of masked gathers on AVX2 (in X86\AVX2 code gen)
Update LoopVectorize to generate gathers for AVX2 processors.

Diff Detail

Event Timeline

magabari created this revision.Jul 23 2017, 3:55 AM
RKSimon added inline comments.Jul 23 2017, 8:04 AM
../llvm/lib/Analysis/TargetTransformInfo.cpp
158 ↗(On Diff #107816)

Why on earth was this calling TTIImpl->isLegalMaskedGather? Make this change separately since it affects all targets not just X86?

../llvm/lib/Target/X86/X86ISelLowering.cpp
893 ↗(On Diff #107816)

This should be later on with the other AVX2 settings.

23770 ↗(On Diff #107816)

MSCATTER isn't supported on AVX2 - update the assertion comment, I don't think it needs to refer to MSCATTER.

23827 ↗(On Diff #107816)

Drop the innermost brackets around (Subtarget.hasAVX2())

23843 ↗(On Diff #107816)

Src0?

23860 ↗(On Diff #107816)

Drop the innermost brackets around (Subtarget.hasAVX2())

23875 ↗(On Diff #107816)

Src0?

../llvm/lib/Target/X86/X86TargetTransformInfo.cpp
2217 ↗(On Diff #107816)

AVX-512 & SKL client with AVX2 allows gather

2219 ↗(On Diff #107816)

(ST->hasAVX2()) --> ST->hasAVX2()

../llvm/test/CodeGen/X86/avx2_masked_gather.ll
2 ↗(On Diff #107816)

You should be able to use utils/update_llc_test_checks.py to autogenerate the codegen.

Usual convention is to use X86 and X64 for the prefixes.

Why are you needing to pipe stderr?

4 ↗(On Diff #107816)

Remove the triple/datalayout lines - you should be driving this from the -mtriples in the command line

magabari marked 11 inline comments as done.Jul 24 2017, 1:38 AM

fixed simon notes

magabari updated this revision to Diff 107858.Jul 24 2017, 1:40 AM
RKSimon added inline comments.Jul 31 2017, 10:38 AM
lib/Target/X86/X86ISelLowering.cpp
23

clang-format?

lib/Target/X86/X86TargetTransformInfo.cpp
7

clang-format?

8

clang-format?

test/CodeGen/X86/avx2_masked_gather.ll
3 ↗(On Diff #107858)

Please can you submit this test file with the current trunk codegen and update this patch to show the diff?

29 ↗(On Diff #107858)

It might be useful to either update this test to return <4x i32> (concat the v2i32 gather result with v2i32 undef) or add a separate test case that does this. Same for the <2 x float> tests. What do you think?

RKSimon added inline comments.Jul 31 2017, 10:42 AM
test/CodeGen/X86/avx2_masked_gather.ll
3 ↗(On Diff #107858)

And to match existing test files please can you rename it avx2-masked-gather.ll?

RKSimon edited edge metadata.Sep 8 2017, 6:34 AM

@magabari What is happening with this?

@magabari What is happening with this?

will update it soon ..

magabari updated this revision to Diff 115618.Sep 18 2017, 3:48 AM
magabari marked 6 inline comments as done.
delena added inline comments.Oct 29 2017, 6:45 AM
lib/Target/X86/X86ISelLowering.cpp
23915

if the subtarget has VLX it always has AVX2. Gather is not supported under AVX2..

23933

clang-format

23966

clang-format

magabari marked 3 inline comments as done.Oct 31 2017, 12:32 AM

fixed Elena comments

magabari updated this revision to Diff 120943.Oct 31 2017, 12:34 AM
delena added inline comments.Nov 1 2017, 2:40 AM
lib/Target/X86/X86ISelLowering.cpp
1014
if (Subtarget.hasAVX2())
  setOperationAction(ISD::MGATHER, MVT::v2i64, Custom);
23857–23858

512 includes AVX2

23929–23936
if (Subtarget.hasVLX())
  Mask = ExtendToType(Mask, MVT::v4i1, DAG, false);
else
  Mask =  DAG.getVectorShuffle(MVT::v4i32, dl, DAG.getBitcast(MVT::v4i32, Mask),
                                                    DAG.getUNDEF(MVT::v4i32), {0, 2, -1, -1});
23947–23948

VLX contains AVX2

23961–23968

I fixed the code above.

magabari marked 5 inline comments as done.Nov 1 2017, 5:46 AM
magabari updated this revision to Diff 121120.Nov 1 2017, 5:55 AM
delena added inline comments.Nov 1 2017, 10:39 AM
lib/Target/X86/X86ISelLowering.cpp
23964

I think that this transformation of mask is wrong for V2F32. unlike integer, it was widened to v4f32 and the mask should be correct.

lib/Target/X86/X86TargetTransformInfo.cpp
2461

this "if" is not clear. Please add appropriate comments and describe what you are going to check.

2514

512 includes avx2

magabari updated this revision to Diff 121618.Nov 5 2017, 12:52 AM
magabari marked 2 inline comments as done.
delena accepted this revision.Nov 5 2017, 2:03 AM
This revision is now accepted and ready to land.Nov 5 2017, 2:03 AM
magabari updated this revision to Diff 123013.Nov 15 2017, 4:59 AM

updating masked-intrinsic-cost.ll with the new costs of AVX2 and adding SKL check

magabari updated this revision to Diff 123157.Nov 16 2017, 4:39 AM

gathers will be allowed for Skylake currently

delena accepted this revision.Nov 19 2017, 1:29 PM
This revision was automatically updated to reflect the committed changes.