This is an archive of the discontinued LLVM Phabricator instance.

[LV] Scalarize instructions marked scalar after vectorization
ClosedPublic

Authored by mssimpso on Aug 25 2016, 12:15 PM.

Details

Summary

This patch ensures that we actually scalarize instructions marked scalar after vectorization. Previously, such instructions may have been vectorized.

Diff Detail

Event Timeline

mssimpso updated this revision to Diff 69280.Aug 25 2016, 12:15 PM
mssimpso retitled this revision from to [LV] Scalarize instructions marked scalar after vectorization.
mssimpso updated this object.
mssimpso added reviewers: samparker, anemet, mkuper, wmi.
mssimpso added subscribers: llvm-commits, mcrosier.
mkuper edited edge metadata.Aug 25 2016, 1:29 PM

Would it make sense to commit the different changes here separately?

That is:
Patch 1: collectLoopUniforms()
Patch 2: The parts that only emit instructions for the low lane of uniforms.
Patch 3: vectorizeBlockInLoop()

If I'm not mistaken both (1) and (2) should be good independently. (1) because it will make other things that care about uniformity slightly more correct, and (2) because it may help some instructions we already scalarize today (although I'm not sure we actually have such cases now. So maybe 2+3 should go in together).

lib/Transforms/Vectorize/LoopVectorize.cpp
2263

Why "scalar"?

2265

Why not

return (!OrigLoop->contains(I) || !Legal->isScalarAfterVectorization(I)
        || Legal->isUniformAfterVectorization(I))

?

2278

Why do we need this?
Wouldn't we already mark EntryVal as uniform in collectLoopUniforms(), if all its users are uniform / consecutive pointers?

4565

This scares me a bit. :-)
As I said on the PR, I definitely think we should do it, but it still scares me. Perhaps add a bit more testing?

Regardless, there's (at least?) one more special case I see below - DbgInfoIntrinsic.

hfinkel added inline comments.
lib/Transforms/Vectorize/LoopVectorize.cpp
4565

As I said on the PR, I definitely think we should do it, but it still scares me. Perhaps add a bit more testing?

Which PR?

mkuper added inline comments.Aug 25 2016, 1:40 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
4565

My bad, not a PR, meant "on the other review". :-)
https://reviews.llvm.org/D23509

Would it make sense to commit the different changes here separately?

That is:
Patch 1: collectLoopUniforms()
Patch 2: The parts that only emit instructions for the low lane of uniforms.
Patch 3: vectorizeBlockInLoop()

If I'm not mistaken both (1) and (2) should be good independently. (1) because it will make other things that care about uniformity slightly more correct, and (2) because it may help some instructions we already scalarize today (although I'm not sure we actually have such cases now. So maybe 2+3 should go in together).

Right, that's what I was thinking. I wanted to included everything here in the review (at least initially) for context. We can split it up exactly as you suggested. Regarding an independent patch 2, it should be NFC post-instcombine. Pre-instcombine, we would at least no longer generate unnecessary steps for uniform IVs.

lib/Transforms/Vectorize/LoopVectorize.cpp
2263

This probably makes more sense with the response below. We're checking to see if all the scalar users of the IV we are scalarizing are also uniform. We can scalarize an IV if it has both vector and scalar users (ending up with two versions).

2265

That should work!

2278

This is for the cases in which we both want a vector and a scalar version of an IV. We scalarize an IV that's not marked scalar after vectorization when it has at least one scalar user. For such an IV, if all it's scalar users are also uniform, we will only ever need the first lane of the scalar IV.

4565

Right! Thanks for pointing out the debug intrinsics. Perhaps it would make better sense to handle all of the cases individually, instead all together at the top?

mkuper added inline comments.Aug 25 2016, 4:21 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
2263

Oh, ok, I understand now - this checks "isScalar() implies isUniform()" by computing "!isScalar() || isUniform()".
I read "scalar users" as "pre-vectorization users" not "users that are going to stay scalar", and got confused.

2278

Ah, got it, thanks!

Sentences like "all its scalar users are uniform" still confuse me, because of the terminology we've adopted.
My personal preference is still "uniform" for what we call "uniform scalar" and "multi-scalar" for what we call "non-uniform scalar" - then we could say that we're looking for an IV that has no multi-scalar users. But maybe I just need to get used to the current state. :-)

4565

Hmpf. I think I still prefer handling this at the top - duplicating this check into each case seems silly.

Thanks for the quick feedback, Michael! I'll start addressing your comments by first pulling out the changes to collectLoopUniforms in a separate review, and adding some specific tests for it.

lib/Transforms/Vectorize/LoopVectorize.cpp
2278

I agree the terminology is a bit confusing. At the very least, I'll add a more detailed comment here.

4565

Agreed.

mssimpso updated this revision to Diff 70437.Sep 6 2016, 11:26 AM
mssimpso updated this object.
mssimpso edited edge metadata.
mssimpso marked 3 inline comments as done.

I pulled out the changes to collectLoopUniforms (Patch 1) in D24271 and rebased. I also addressed Michael's comments on this existing patch. Next, I'll pull out the changes to uniform instruction emission (Patch 2) in a separate review.

mssimpso updated this revision to Diff 70456.Sep 6 2016, 12:51 PM
mssimpso updated this object.

I pulled out the changes to uniform instruction emission (Patch 2) in D24275 and rebased.

mssimpso updated this revision to Diff 72099.Sep 21 2016, 1:17 PM

Added a simplified version of Sam's original test case.

mkuper accepted this revision.Sep 25 2016, 5:21 AM
mkuper edited edge metadata.

LGTM

This revision is now accepted and ready to land.Sep 25 2016, 5:21 AM

Thanks, Michael!

This revision was automatically updated to reflect the committed changes.