Page MenuHomePhabricator

[SLP] Generalization of stores vectorization.
Needs ReviewPublic

Authored by ABataev on Feb 21 2018, 9:01 AM.

Details

Summary

Stores are vectorized with maximum vectorization factor of 16. Patch
tries to improve the situation and use maximal vectorization factor.

Diff Detail

Event Timeline

ABataev created this revision.Feb 21 2018, 9:01 AM
spatel added a comment.Mar 5 2018, 6:53 AM

Someone more familiar with SLP should have a look at the diffs, but we need to address the compile-time question.
The artificial limit is only there to guard against excessive compile-time cost, so do you have data to show that difference? Or does this patch solve the (potential) problem in another way?
More discussion here:
https://bugs.llvm.org/show_bug.cgi?id=17170

ABataev updated this revision to Diff 137030.Mar 5 2018, 9:49 AM

Reduced complexity of the stores vectorization.

lebedev.ri added inline comments.
lib/Transforms/Vectorize/SLPVectorizer.cpp
5336–5337

The std::max() is not a typo? Might warrant a comment.

ABataev updated this revision to Diff 137042.Mar 5 2018, 10:44 AM

Fixed typo max->min

Someone more familiar with SLP should have a look at the diffs, but we need to address the compile-time question.
The artificial limit is only there to guard against excessive compile-time cost, so do you have data to show that difference? Or does this patch solve the (potential) problem in another way?
More discussion here:
https://bugs.llvm.org/show_bug.cgi?id=17170

Just some theoretical thoughts. The complexity of the original implementation is O(n), while the complexity of the updated patch is O(n * log n);

lib/Transforms/Vectorize/SLPVectorizer.cpp
5336–5337

Yes, it is a typo, must be min, thanks!

lebedev.ri added inline comments.Mar 5 2018, 11:00 AM
lib/Transforms/Vectorize/SLPVectorizer.cpp
5336–5337

Then you probably want to add a test :)

ABataev updated this revision to Diff 137064.Mar 5 2018, 12:31 PM

Updated test checks for maximum depth length lookup.

ABataev updated this revision to Diff 218762.Sep 4 2019, 11:57 AM
ABataev marked an inline comment as done.

Rebase

Herald added a project: Restricted Project. · View Herald TranscriptSep 4 2019, 11:57 AM

please can you rebase now that D29641 has landed

RKSimon added inline comments.Wed, Sep 25, 10:48 AM
test/Transforms/SLPVectorizer/X86/arith-add-ssat.ll
99

This change is surprising, given how slow v2i64 ADD/SUB are on SLM

ABataev marked an inline comment as done.Wed, Sep 25, 11:08 AM
ABataev added inline comments.
test/Transforms/SLPVectorizer/X86/arith-add-ssat.ll
99

I believe it is the problem of the cost model. Most probably, it will be changed with D42981.

Please can you rebase to see if my SLM cost changes fixes the saturated arithmetic changes?

test/Transforms/SLPVectorizer/X86/cast.ll
22 ↗(On Diff #221573)

Please check this - it looks superfluous (and is under an unused prefix)

ABataev updated this revision to Diff 221937.Thu, Sep 26, 6:50 AM

Rebase + fix for the test.

RKSimon added inline comments.Thu, Sep 26, 10:17 AM
test/Transforms/SLPVectorizer/X86/pr35497.ll
2

Why the duplicated -slp-vectorizer?

ABataev marked an inline comment as done.Thu, Sep 26, 10:35 AM
ABataev added inline comments.
test/Transforms/SLPVectorizer/X86/pr35497.ll
2

Need to run the pass twice to get the code in pr35497 fully vectorized.

ABataev updated this revision to Diff 224053.Wed, Oct 9, 7:27 AM

Rebase + added analysis of target register width.