This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Fix vectorization for tree with trunc to minimum required bit width.
ClosedPublic

Authored by ABataev on Jan 11 2018, 6:46 AM.

Details

Summary

If the vectorized tree has truncate to minimum required bit width and
the vector type of the cast operation after the truncation is the same
as the vector type of the cast operands, count cost of the vector cast
operation as 0, because this cast will be later removed.
Also, if the vectorization tree root operations are integer cast operations, do not consider them as candidates for truncation. It will just create extra number of the same vector/scalar operations, which will be removed by instcombiner.

Diff Detail

Repository
rL LLVM

Event Timeline

ABataev created this revision.Jan 11 2018, 6:46 AM

Thanks for cleaning this up.

lib/Transforms/Vectorize/SLPVectorizer.cpp
2061–2064 ↗(On Diff #129443)

Makes sense.

4023–4025 ↗(On Diff #129443)

What do you mean by "top" here? Do you just mean that the initial Root should not be a cast? If so, would it be easier to check that here, rather than passing the flag in collectValuesToDemote?

ABataev added inline comments.Jan 15 2018, 6:38 AM
lib/Transforms/Vectorize/SLPVectorizer.cpp
4023–4025 ↗(On Diff #129443)

Yes, it is about initial roots. The main problem here is that these roots still must be analyzeв though not considered as candidates to demote.
Meanwhile, I have a question for you. Why you try to generate sequence cast ( extractelement(<resulting_vector>)) instead of extractelement(cast(<resulting_vector>))? It produces 2 * n instructions, while the second approach will produce n + 1 instructions.

ABataev updated this revision to Diff 130484.Jan 18 2018, 1:22 PM

Update after review

mssimpso added inline comments.Jan 18 2018, 1:36 PM
lib/Transforms/Vectorize/SLPVectorizer.cpp
4023–4025 ↗(On Diff #129443)

I honestly don't recall what the reason was. It probably had something to do with the way InstCombine wanted to rewrite the expression, given the trunc/ext sequence.

I think what we should really try to do is make the type truncating code in InstCombine (EvaluateInDifferentType) a utility that we can use within the vectorizers. The trunc/ext trick is very fragile. We use it here in SLP,and in two separate places in LV to type-shrink reductions and other instructions.

I'll finish taking a look at the rest of the patch shortly.

mssimpso accepted this revision.Jan 18 2018, 1:43 PM

LGTM.

This revision is now accepted and ready to land.Jan 18 2018, 1:43 PM
This revision was automatically updated to reflect the committed changes.