This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Fix wrong vec.phi generation
AbandonedPublic

Authored by jaykang10 on May 17 2020, 9:14 AM.

Details

Summary

https://bugs.llvm.org/show_bug.cgi?id=45958

Let's see simple IR snippet after loop vectorization with VPlanNatviePath.

vector.body:
  br label %for.body10.preheader67

for.body10.preheader67:                     ; preds =
%for.cond.cleanup972, %vector.body
  %vec.phi = phi <4 x i64> [ zeroinitializer, %for.cond.cleanup972 ],
[ %8, %vector.body ]

for.cond.cleanup972:                              ; preds = %for.body1068
  %8 = add nuw nsw <4 x i64> %vec.phi, <i64 1, i64 1, i64 1, i64 1>
  br i1 %10, label %for.cond.cleanup373, label %for.body10.preheader67

As you can see, %vec.phi has wrong incoming basic blocks.

The problem comes from "InnerLoopVectorizer::fixNonInductionPHIs()".
This function has assumption about the order of predecessors as below.

// The predecessor order is preserved and we can rely on mapping between
// scalar and vector block predecessors.
for (unsigned i = 0; i < NumIncomingValues; ++i) {

The original phi's order of predecessors can be different with the new phi's one.

Diff Detail

Event Timeline

jaykang10 created this revision.May 17 2020, 9:14 AM
Herald added a project: Restricted Project. · View Herald Transcript
jaykang10 edited the summary of this revision. (Show Details)May 17 2020, 9:16 AM
fhahn requested changes to this revision.May 17 2020, 10:14 AM
fhahn added reviewers: Ayal, gilr.

Thanks for the patch. Could you add a test case for the crash?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4072

This seems a bit more complicated than necessary I think

IIUC, if we have a map ScalarPred -> VectorPred (lets call it ScalarPred2VectorPred) we can get the blocks in the loop as follows:

BasicBlock *ScalarBB = OrigPhi->getIncomingBlock(i);
BasicBlock *VectorBB = ScalarPred2VectorPred[ScalarBB]; // (also assert that ScalarBB is in the map)

Building ScalarPred2VectorPred should be straight-forward IIUC, as we should be able to rely on the predecessor order being the same between vector and scalar versions: just set ScalarPred2VectorPred[ScalarBBPredecessors[I]] = VectorBBPredecessors[I]; You also don't have to ScalarBBPredecessors/VectorBBPredecessors directly. You can just iterate over the predecessors for both and populate the map (you might want to use llvm::zip from STLExtras.h)

This revision now requires changes to proceed.May 17 2020, 10:14 AM

Oh, and could you please add a more descriptive title?

jaykang10 updated this revision to Diff 264512.May 17 2020, 2:05 PM
jaykang10 retitled this revision from [VPlan] Fix bug45985 to [VPlan] Fix wrong vec.phi generation.
jaykang10 marked an inline comment as done.May 17 2020, 2:42 PM
jaykang10 added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4072

As you mentioned, I could create the map to keep the scalar BB and vector BB... I thought the dominate tree simpler solution to fix this bug because we need to change only this function rather than places where we need to update the new map... I would like to get other reviewers' opinion about this...

fhahn added inline comments.May 17 2020, 3:38 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4072

I think the map change would also be function local. We only need to map the scalar predecessors Here of a phi to the vector predecessors, as suggested in the comment. There should be no need to maintain it before/after this function

jaykang10 marked an inline comment as not done.May 21 2020, 12:31 PM
jaykang10 marked an inline comment as done.May 21 2020, 12:50 PM
jaykang10 added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4072

I am sorry for late. Let's see the basic blocks of the phi from attached example.

OrigPhi's BasicBlock dump

for.body10.preheader:        ; preds = %for.body4.lr.ph, %for.cond.cleanup9
  %indvars.iv60 = phi i64 [ 0, %for.body4.lr.ph ], [ %indvars.iv.next61, %for.cond.cleanup9 ]
...
In this basic block's loop
%for.body4.lr.ph is preheader
%for.cond.cleanup9 is latch

NewPhi's BasicBlock dump

for.body10.preheader1:       ; preds = %for.cond.cleanup96, %vector.body**
  %vec.phi = phi <4 x i64>
...
In this basic block's loop
%for.cond.cleanup96 is latch
%vector.body is preheader

As you can see, the order of predecessor is different between the scalar and vector basic blocks. The scalar one's predecessor order is preheader first and latch next but the vector one is opposite. I am not sure whether it works to just map the scalar basic block to vector basic block in this function or not. I guess we could map it where we create the vector phi or somewhere we have the correct information. If I missed something, please let me know.

jaykang10 marked an inline comment as done.May 21 2020, 1:17 PM
jaykang10 added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4072

Additiionally, for the vector basic block, the first predecessor is generated by creating preheader's terminator. The second one is generated by setting up latch's successor.

fhahn added inline comments.Jun 1 2020, 11:48 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4072

Sorry for the delay on this! I initially thought there was a mismatch between the predecessors and the order in the original PHI, but that's not the case. I am currently looking into how to fix the code that is supposed to ensure that the order of the predecessors in VP matches the order in the original IR.

jaykang10 abandoned this revision.Mar 6 2021, 11:00 AM
rkruppe removed a subscriber: rkruppe.Mar 7 2021, 12:41 AM