Page MenuHomePhabricator

[LV] Mark instructions with loop invariant arguments as uniform. (WIP)
Changes PlannedPublic

Authored by fhahn on Oct 10 2019, 1:41 PM.

Details

Summary

As suggested by Ayal in D59995, we can mark instructions with
loop invariant arguments as uniform. They will always produce
the same result.

Now that we can have more uniform instructions, there were some
assertions that needed relaxing a bit.

Also, there still seems to be an issue with constant folding in LV not
being able to simplify some uniform values compared to their replicated
equivalents. I still have to look into that, but I wanted to make sure
the overall approach aligns well.

The overall impact of the change is probably quite low, but at least in
the test-suite, there are around 4 benchmarks were we ended up
vectorizing a few more loops.

Currently we still miss some uniform instructions, that only have
uniform operands, but that can be addressed as follow-up.

Event Timeline

fhahn created this revision.Oct 10 2019, 1:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 10 2019, 1:41 PM
Ayal added a comment.Nov 2 2019, 4:35 PM

Thanks for coming back to look into this @fhahn !

Overall, wonder what the current forward-propagating invariance analysis is missing, and if it's better to keep it separate from the backward-propagating "DemandedLanes" analysis aka uniform-after-vectorization. More specific comments inline.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
2065

There's a distinction between (1) the truly "uniform values" of Legal->isUniform(), and (2) those of Cost->isUniformAfterVectorization() that "become" uniform. In both cases generating the single scalar value of lane zero suffices; the values of all other lanes need not be generated, because (1) they are all equal to that of lane zero, or (2) they are all dead. I.e., DemandedLanes={0} in case (2).

Instructions of case (1) should have ideally been LICM'd out, except for conditional instructions that have side-effects. Case (2) is what this assert is trying to verify - a user requesting the value of some lane>0 contradicts the liveness assumption; feeding it with the value of lane 0 may be feeding it a different, wrong value.

Note that this change to getOrCreateScalarValue() *alone* "fixes" PR40816, but there the users requesting the values of lanes>0 are essentially dead. Would be good to devise a test involving a predicated uniform-after-vectorization instruction, that is essentially live.

Would be good to come up with a better name than UniformAfterVectorization, which looks up a set named "Uniforms". Suggestions?

4703

Such instructions should be identified as Legal->isUniform(), right?

Note that in general loop invariance is stronger than uniformity, though Legal->isUniform() currently does return isLoopInvariant().

4713

While we're here, _or_null should be dropped, &I cannot possibly be null.

5485

Predicated instructions must not be uniform-after-vectorization currently, and this should better be checked earlier to avoid inserting them into Uniforms, or bailout from vectorizing, to prevent crashing as in PR40816. This assert doesn't trigger there because it's masked under UseEmulatedMaskedMemRefHack().

When predicated instructions are allowed to be uniform-after-vectorization, it would be good for VPlan to reflect it, and have the VPRegionBlock built for it generate only a single replica (per part?) - that of lane zero.

5533

Comments and code above should be updated?

fhahn planned changes to this revision.Fri, Nov 29, 6:06 AM

Thanks for taking a look Ayal! I'll update the patch once D70298 landed, to avoid causing unnecessary rebasing.