This is an archive of the discontinued LLVM Phabricator instance.

[LV][nearly NFC] Move isLegalMasked* functions from Legality to CostModel
ClosedPublic

Authored by hsaito on Feb 12 2018, 1:15 PM.

Details

Summary

Please see http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for RFC and the discussions.

All SIMD architectures can emulate masked load/store/gather/scatter through element-wise condition check, scalar load/store, and insert/extract. Therefore, bailing out of vectorization as legality failure, when they return false, is incorrect. We should proceed to cost model and determine profitability.

This patch is to address the vectorizer's architectural limitation described above. As such, I tried to keep the cost model and vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning should be done separately.

Diff Detail

Event Timeline

hsaito created this revision.Feb 12 2018, 1:15 PM
hsaito added inline comments.Feb 12 2018, 2:00 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
2042–2047

dyn_cast<> abuse fixed. Code simplified a bit. Should be NFC.

6202

Do we even want to keep this bail out?

test/Transforms/LoopVectorize/conditional-assignment.ll
4

I think the remark emission needs to be fixed so that it includes "loop not vectorized:". If so, I'll do that in a separate patch.

rengolin added inline comments.Feb 23 2018, 7:35 AM
lib/Transforms/Vectorize/LoopVectorize.cpp
6202

I think this is done more as diagnosis than bail out.

6586

What if this was a hook into the TTI, it could use the targets' information to get a slightly more accurate cost estimate (ie, how hacky is the emulation, or even if one is available through its back-end)

6588

high / high

hsaito added inline comments.Feb 23 2018, 11:36 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
6202

In my opinion, this is a bail out in the sense that we blindly set VF=1 and say "go scalar". What's special about conditional stores compared to other "less than optimal" vector operations? We don't have such "enable" flags for every single one of them. Do we still consider the code to support conditional stores semi-"experimental"?

I just wanted to point that out. If anyone wants to keep this, I won't insist.

6586

Thanks. That's actually my preferred solution (http://lists.llvm.org/pipermail/llvm-dev/2018-January/120271.html) and Hal Finkel agrees (http://lists.llvm.org/pipermail/llvm-dev/2018-January/120273.html). If there are others feeling the same way (or opposite way), please add your voice here.

Having said that, is it okay for me to take the TTI approach as a separate patch submission? I'm sure TTI discussion will involve more stakeholders and the bulk of that discussion is tangent from fixing vectorizer's architectural issue related to isLegalMasked* this patch is trying to address.

This is especially so with the hacky NumberOfStoresToPredicate getting exposed outside of LoopVectorize code. I could imagine some reviewers would be asking us vectorizer developers to "tune the cost first so that the hacky NumberOfStoresToPredicate doesn't have to be leaked out of LoopVectorize". I don't think we should wait fixing the architectural problem until tuning is done.

6588

Will fix. Thanks.

rengolin added inline comments.Feb 24 2018, 12:06 AM
lib/Transforms/Vectorize/LoopVectorize.cpp
6202

Can you measure any difference? Like, running with and without on the test-suite to see if there are any hits that the new cost model is not picking it up?

6586

The way we've done in the past was to add the function we expect in TTI and implement it in the base class with a big FIXME, explaining the problem. In this way, targets can "FIXIT" in different ways at different times.

But I'm not against the idea of this hack, if that's the consensus (just had't followed the original thread :).

hsaito added inline comments.Feb 24 2018, 1:56 AM
lib/Transforms/Vectorize/LoopVectorize.cpp
6202

EnableCondStoresVectorization is true by default. So, this IF is no-op unless someone is explicitly setting it to false.

There are a few LIT tests checking for letting at least one emulated masked store vectorized.

I saw lots of perf regressions when I ran tests prior to adding cost model hooks (i.e., same cost as the first emulated masked store). On the opposite side of spectrum, if I remember correctly, I didn't see any of our perf tests regressing when I had all emulated masked load/store super high cost ---- that is a very easy way to get the code bit-rottened and thus I won't recommend it.

6586

You are the third person saying that the cost hack should be ideally coming from TTI (the other two are Hal and myself). No other opinions explicitly mentioned yet.

The basic idea of the hack (whether we implement in LoopVectorize or in TTI) is inevitable to make this patch "almost NFC". The TTI version I'm thinking about would have the same basic idea (i.e., return low cost for the first <NumberOfStoresToPredicate> emulated masked stores and super high cost for the rest) ---- because that's the essence of the code prior to this patch.

For those who are reviewing the TTI version, it would be a lot easier if the patch does not include isLegalMasked* change (from Legality to CostModel). That's another reason why I'd like to do it as a separate patch.

hsaito updated this revision to Diff 135798.Feb 24 2018, 9:40 AM

high high ==> high

hsaito marked an inline comment as done.Feb 24 2018, 9:41 AM
rengolin accepted this revision.Feb 25 2018, 5:50 AM

LGTM, thanks!

lib/Transforms/Vectorize/LoopVectorize.cpp
6202

I agree, better to drop this check, then.

6586

Makes sense.

This revision is now accepted and ready to land.Feb 25 2018, 5:50 AM

LGTM, thanks!

Thanks a lot. I do not have a commit-after-review right. Would be great if you'd commit this for me.

r326079, thanks!

rengolin closed this revision.Feb 26 2018, 3:09 AM

r326079, thanks!

Thanks!