This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Changes to implement VPlan based predication for VPlan-native path.
ClosedPublic

Authored by sguggill on Oct 16 2018, 4:21 PM.

Details

Summary

Context: Patch Series #2 for outer loop vectorization support in LV using VPlan.
(RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).

Patch series #2 checks that inner loops are still trivially lock-step among all vector elements. Non-loop branches are blindly assumed as divergent.

Changes here implement VPlan based predication algorithm to compute predicates for blocks that need predication. Predicates are computed for the VPLoop region in reverse post order. A block's predicate is computed as OR of the masks of all incoming edges. The mask for an incoming edge is computed as AND of predecessor block's predicate and either predecessor's Condition bit or NOT(Condition bit) depending on whether the edge from predecessor block to the current block is true or false edge.

Diff Detail

Event Timeline

sguggill created this revision.Oct 16 2018, 4:21 PM
sguggill added inline comments.Oct 16 2018, 4:25 PM
lib/Transforms/Vectorize/VPlanPredicator.cpp
267

Predicator is currently computing the dominator information for the top region. Once we start storing dominator information in a VPRegionBlock, we can avoid this recalculation.

fhahn added a comment.Oct 16 2018, 6:45 PM

Just a few small initial comments. I'll have a closer look in the following days.

lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
26

Would it be possible to avoid adding a EnableVPlanPredication option? IIUC, we would always like to run predication, if it is required?

I think it would be good to avoid adding too many options, to ensure we get good test coverage for all parts. Or make the default true.

lib/Transforms/Vectorize/VPlan.h
1520

nit: I think LLVM style would be if (!FromLoop || !ToLoop || FromLoop != ToLoop)

Also, not braces for blocks with a single statement.

1523

i think spelling it out would be clearer, e.g. a back-edge is a branch from the loop latch to its header.

1524

outermost braces not needed?

lib/Transforms/Vectorize/VPlanPredicator.cpp
30

personally I think it would be slightly clearer to have this function return a pair<int, bool>.

Also, the only state from VPlanPredicator used seems to be VPLI. Maybe this could just be a static non-member function?

32

would unsigned make more sense for cnt here? It should always be positive, right?

75

could be

if (BP)
 return Builder.createAnd(BP, IntermediateVal);;

return IntermediateVal;

No need for a FinalVal variable.

148

no return after unreachable?

185

nit: no braces for then block.

192

nit: no braces for else block.

217

nit: no braces .

267

I think it would be worth calling this out as a FIXME comment, as this is definitely something we should address sooner rather than later.

lib/Transforms/Vectorize/VPlanValue.h
42 ↗(On Diff #169918)

I think I missed which private things are used in VPlanPredicator?

sguggill added inline comments.Oct 19 2018, 2:00 PM
lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
26

Yes, I will remove this switch once we have divergence analysis. Right now when predication is enabled, we suppress the check for trivially uniform non-backedge branches and assume they are divergent. Without this check, the CG tests added testing uniform control flow will fail. I understand the concern about adding too many options and will add a FIXME comment about the same.

sguggill marked 12 inline comments as done.Oct 19 2018, 2:49 PM
sguggill added inline comments.
lib/Transforms/Vectorize/VPlanPredicator.cpp
30

We are not using HasBE currently. I have removed it for now. I also moved this function to VPBlockUtils.

lib/Transforms/Vectorize/VPlanValue.h
42 ↗(On Diff #169918)

Yes, this is not needed right now. I have removed it. Thanks for catching this.

sguggill updated this revision to Diff 170285.Oct 19 2018, 5:34 PM
sguggill marked 2 inline comments as done.

Address Florian's review comments.

As a starting point, this sure looks good to me. Thanks!

lib/Transforms/Vectorize/LoopVectorize.cpp
7141

Do you have an RFC or review on the code generation part? It would be good to have an idea soon enough.

lib/Transforms/Vectorize/VPlanPredicator.cpp
199

Shouldn't this be more like an assert, as if you don't handle nested regions, you can't get the full predication for the dominators and codegen will fail, no?

sguggill updated this revision to Diff 177981.Dec 12 2018, 5:42 PM

Address Renato's comments.

sguggill marked 2 inline comments as done.Dec 12 2018, 5:46 PM
sguggill added inline comments.
lib/Transforms/Vectorize/LoopVectorize.cpp
7141

I have the code generation changes working in my private workspace. I need to tie up some loose ends before I start the review. These changes make use of changes in D53355 Introducing VPPHINode. I will be out the next two weeks and I will try to get the CG review started once I get back.

Right, I'm happy with the patch and will let @fhahn do the final review and approval. Thanks!

Right, I'm happy with the patch and will let @fhahn do the final review and approval. Thanks!

Thanks Renato!

fhahn added inline comments.Dec 13 2018, 2:19 PM
lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
26

Right, that makes sense. Maybe we could use bugzilla to keep track of those things? Otherwise we might forget. Could you file an issue?

493–496

IMO the comment would be easier to parse if we keep the original comment and add an additional sentence like: FIXME: Currently those checks are disabled when VPlan based predication is disabled.

lib/Transforms/Vectorize/VPlan.h
534

IIUC it is the responsibility of the caller to remove the block from a predecessor's successor list? Does the verifier check that successors/predecessors are in sync?

1517

I think LLVM's convention is to not use Foo != nullptr, but just Foo.

lib/Transforms/Vectorize/VPlanPredicator.cpp
108

nit: just return Worklist.front()?

181

not needed I think, as this is the first check in genPredicateTree too

245

Is there a reason why we treat VPLoopInfo and VPDomTree differently here; i.e. fetch one from Plan and recompute the other? In a way both are tied together (if the DT changes, LoopInfo potentially also changes), so could we treat them in the same way, either fetch them both from Plan or recompute them here?

lib/Transforms/Vectorize/VPlanPredicator.h
23

Is this needed here? I cannot see any references to stuff defined in the generic dominators.h

29

could this be an enum class? That has the advantage that the type system guarantees that its value can only have predefined values.

37

We always have a Plan here, could it be a reference?

unittests/Transforms/Vectorize/VPlanPredicatorTest.cpp
15

It would be good to have tests to cover the important cases. It looks like this test only checks the AND predicate generation. Could you also test: generation of negative predicates, generation of ORs?

sguggill updated this revision to Diff 178326.Dec 14 2018, 5:40 PM
sguggill marked 10 inline comments as done.

Thanks for the review Florian. I am addressing your comments.

lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
26

I have added issue 40034 and also added it to our meta ticket.

lib/Transforms/Vectorize/VPlan.h
534

Yes, the verifier checks that successors/predecessors are in sync.

lib/Transforms/Vectorize/VPlanPredicator.cpp
108

I typically write code this way so that it is easier to print the values during debugging. Unless you insist, I prefer the code this way.

245

As mentioned in the comments, the dominator information will be computed per region and stored in the region block so that clients that need the same can reuse it. This is not being done currently. This information in a region is expected to be kept up-to-date or computed on demand if it is no longer valid.

VPLoopInfo is for the whole VPlan and we are currently computing it and storing it after building the HCFG. Once the dominator information becomes available in a region block, we can simply fetch it from a region and use the same. Until then we need to compute the dominator information.

unittests/Transforms/Vectorize/VPlanPredicatorTest.cpp
15

I will add a test that checks other cases.

fhahn added inline comments.Dec 14 2018, 5:45 PM
lib/Transforms/Vectorize/VPlanPredicator.cpp
47

If we EdgeType only has TRUE_EDGE and FALSE_EDGE and we turn this into a switch, we will get a nice compiler warning about missing cases if we add more elements to EdgeType.

245

Ok, sounds good.

sguggill updated this revision to Diff 178332.Dec 14 2018, 6:37 PM
sguggill marked an inline comment as done.
sguggill added inline comments.
lib/Transforms/Vectorize/VPlanPredicator.cpp
47

I have modified EdgeType to only contain TRUE_EDGE and FALSE_EDGE. However, in the switch I went with a default case where we call llvm_unreachable to get a compiler runtime error if we see an edgetype that is not handled.

fhahn added inline comments.Dec 17 2018, 2:30 PM
lib/Transforms/Vectorize/VPlanPredicator.cpp
47

IMO it is better to give the compiler the chance to warn/error on this, rather than a runtime check. With the default case, the compiler cannot warn any more, because everything is handled. (There are public bots with warnings as error, which should catch it).

sguggill updated this revision to Diff 181415.Jan 11 2019, 6:56 PM

Removed default case when checking for edge type and added a test checking generation of not and or during predication.

sguggill marked an inline comment as done.Jan 11 2019, 6:57 PM
fhahn accepted this revision.Jan 12 2019, 10:50 AM

LGTM, thanks! Please wait a bit with committing, in case anybody else has additional comments.

unittests/Transforms/Vectorize/VPlanPredicatorTest.cpp
215

Please check the operands of the OR here as well.

This revision is now accepted and ready to land.Jan 12 2019, 10:50 AM
sguggill updated this revision to Diff 182616.Jan 18 2019, 2:51 PM

Added checks for the operands of OR in the predicator test.

sguggill marked an inline comment as done.Jan 18 2019, 2:52 PM

Thanks Florian. I will request Hideki Saito to commit the change.

Closed by commit rL351990 (authored by hsaito). · Explain WhyJan 23 2019, 2:43 PM
This revision was automatically updated to reflect the committed changes.