This is an archive of the discontinued LLVM Phabricator instance.

[LV] Make sure VF doesn't exceed compile time known TC
ClosedPublic

Authored by ebrevnov on Nov 24 2021, 5:08 AM.

Details

Summary

For the simple copy loop (see test case) vectorizer selects VF equal to 32 while the loop is known to have 17 iterations only. Such behavior makes no sense to me since such vector loop will never be executed. The only case we may want to select VF large than TC is masked vectoriztion. So I haven't touched that case.

Diff Detail

Event Timeline

ebrevnov created this revision.Nov 24 2021, 5:08 AM
ebrevnov requested review of this revision.Nov 24 2021, 5:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2021, 5:08 AM
ebrevnov edited the summary of this revision. (Show Details)Nov 24 2021, 5:22 AM
ebrevnov added reviewers: fhahn, anna, Ayal.

Please note we still vectorize epilog with VF 8 while the remainder number of iterations is known to be 1. That should be addressed separately though..

Sounds sensible to me.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1747

foldTailByMasking -> FoldTailByMasking

6076–6077

Can you update this comment..

6085

.. and this debug message.

llvm/test/Transforms/LoopVectorize/X86/known_TC.ll
6–7

This can be removed now?

ebrevnov marked 4 inline comments as done.Nov 25 2021, 9:36 PM
ebrevnov added inline comments.
llvm/test/Transforms/LoopVectorize/X86/known_TC.ll
6–7

Yep, missed that...

ebrevnov updated this revision to Diff 389902.Nov 25 2021, 9:42 PM
ebrevnov marked an inline comment as done.

Updated

dmgreen accepted this revision.Dec 3 2021, 2:21 AM

LGTM then, if there are not more comments. Thanks.

This revision is now accepted and ready to land.Dec 3 2021, 2:21 AM
fhahn added inline comments.Dec 3 2021, 2:53 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6085

Shouldn't this be PowerOf2Floor(ConstTripCount)?

dmgreen added inline comments.Dec 3 2021, 3:05 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6085

I think this is saying "the trip count is "ConstTripCount", so we are rounding that down to a power of 2 (left as an exercise to the reader)."

It may be better to be clearer about the values it will use though. Having what it thinks the original ConstTripCount printed somewhere does sound useful.

ebrevnov added inline comments.Dec 3 2021, 3:46 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6085

I think this is saying "the trip count is "ConstTripCount", so we are rounding that down to a power of 2 (left as an exercise to the reader)."

While I agree this is valid interpretation I did mean to provide resulting value. So the printing PowerOf2Floor(ConstTripCount) here sounds reasonable to me. Do you think we should change wording?

It may be better to be clearer about the values it will use though. Having what it thinks the original ConstTripCount printed somewhere does sound useful.

It is printed out at the beginning of computeFeasibleMaxVF.

fhahn added inline comments.Dec 10 2021, 1:43 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6085

While I agree this is valid interpretation I did mean to provide resulting value. So the printing PowerOf2Floor(ConstTripCount) here sounds reasonable to me. Do you think we should change wording?

I think the wording is fine when printing ClampedConstTripCount. As you said, the original trip count is printed earlier. It could also be included in the message (known trip count: ...) or something similar, but I am not sure if it adds a lot of additional info.

fhahn added inline comments.Dec 13 2021, 2:15 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6085

IIUC there was some consensus to use ClampedConstTripCount in the message and maybe include ConstTripCount in addition to that. But it looks like the committed version has not been updated accordingly and just prints the ConstTripCount. Did I miss anything?

ebrevnov added inline comments.Dec 13 2021, 3:15 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6085

You didn't... that was my overlook.. will fix shortly. Thanks!