This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Try doubled MaxElts for stores vectorization
AbandonedPublic

Authored by anton-afanasyev on Jan 19 2021, 8:19 AM.

Details

Summary

Try to use 2 * MaxElts size of vectors for stores vectorization. This commit
is motivated by effect of bugfixing at reviews.llvm.org/D93192 and tries
to compensate it.
There could be the case, for instance, when cost of pair of <4 x float>
vectorization is zero, but vectorization of <8 x float> is beneficial however.
LLVM vector with 2 * MaxElts cannot be lowered to one register, of course, it is splitted
to two registers.
We try to check 2 * MaxElts after MaxElts not to interfere the ordinary vectorization
which could be accepted as beneficial itself.

Diff Detail

Event Timeline

anton-afanasyev requested review of this revision.Jan 19 2021, 8:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 19 2021, 8:19 AM

Fixed comment containing this D94974 revision number

RKSimon added inline comments.Jan 19 2021, 8:22 AM
llvm/test/Transforms/SLPVectorizer/X86/arith-add-ssat.ll
174

please can you cleanup all these checks ?

Harbormaster completed remote builds in B85723: Diff 317579.

Small test fix

anton-afanasyev marked an inline comment as done.Jan 27 2021, 9:22 AM
anton-afanasyev added inline comments.
llvm/test/Transforms/SLPVectorizer/X86/arith-add-ssat.ll
174

Fixed this line in test or did you mean to precommit check prefixes?

RKSimon added inline comments.Jan 27 2021, 10:12 AM
llvm/test/Transforms/SLPVectorizer/X86/arith-add-ssat.ll
174

We seem to have AVX and AVX1 check prefixes now - go back and replace the check-prefixes=AVX with check-prefixes=AVX1 (not sure if we can have a common AVX for AVX1 + AVX2)?

anton-afanasyev marked an inline comment as done.

Clean up tests

anton-afanasyev marked an inline comment as done.Jan 27 2021, 10:34 AM
anton-afanasyev added inline comments.
llvm/test/Transforms/SLPVectorizer/X86/arith-add-ssat.ll
174

Oh, I see. Done.
(no, we can't use common AVX for AVX1 + AVX2 in that case).

RKSimon added inline comments.Jan 27 2021, 10:45 AM
llvm/test/Transforms/SLPVectorizer/X86/arith-mul.ll
5

--check-prefixes=AVX_PREFER128,AVX1_PREFER128

7

--check-prefixes=AVX_PREFER128,AVX2_PREFER128

183

We're setting prefer-128-bit and yet still generating <4 x i64> ops?

anton-afanasyev added inline comments.Feb 2 2021, 7:20 AM
llvm/test/Transforms/SLPVectorizer/X86/arith-mul.ll
183

Hmm, yes, you're right, that's strange to generate <4 x i64> for the case with preferable width (=128). But we can't check this at the abstract llvm level. Generally we don't know the target constraints, so this my patch looks too tricky for such cases.

Due to what said above, I'm to abandon this change. It looks like over-optimization, breaking llvm IR middle-end abstraction.

anton-afanasyev abandoned this revision.Feb 5 2021, 5:43 AM