This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/lib/Transforms/Vectorize/
-
lib/
-
Transforms/
-
Vectorize/
2/6
LoopVectorize.cpp

Differential D80086

[VPlan] Fix wrong vec.phi generation
AbandonedPublic

Authored by jaykang10 on May 17 2020, 9:14 AM.

Download Raw Diff

Details

Reviewers

fhahn
hsaito
rengolin
Ayal
gilr

Summary

https://bugs.llvm.org/show_bug.cgi?id=45958

Let's see simple IR snippet after loop vectorization with VPlanNatviePath.

vector.body:
  br label %for.body10.preheader67

for.body10.preheader67:                     ; preds =
%for.cond.cleanup972, %vector.body
  %vec.phi = phi <4 x i64> [ zeroinitializer, %for.cond.cleanup972 ],
[ %8, %vector.body ]

for.cond.cleanup972:                              ; preds = %for.body1068
  %8 = add nuw nsw <4 x i64> %vec.phi, <i64 1, i64 1, i64 1, i64 1>
  br i1 %10, label %for.cond.cleanup373, label %for.body10.preheader67

As you can see, %vec.phi has wrong incoming basic blocks.

The problem comes from "InnerLoopVectorizer::fixNonInductionPHIs()".
This function has assumption about the order of predecessors as below.

// The predecessor order is preserved and we can rely on mapping between
// scalar and vector block predecessors.
for (unsigned i = 0; i < NumIncomingValues; ++i) {

The original phi's order of predecessors can be different with the new phi's one.

Diff Detail

Event Timeline

jaykang10 created this revision.May 17 2020, 9:14 AM

Herald added a reviewer: rengolin. · View Herald TranscriptMay 17 2020, 9:14 AM

Herald added a project: Restricted Project. · View Herald Transcript

Herald added subscribers: llvm-commits, vkmr, psnobl and 4 others. · View Herald Transcript

jaykang10 edited the summary of this revision. (Show Details)May 17 2020, 9:16 AM

Thanks for the patch. Could you add a test case for the crash?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4068	This seems a bit more complicated than necessary I think IIUC, if we have a map ScalarPred -> VectorPred (lets call it ScalarPred2VectorPred) we can get the blocks in the loop as follows: BasicBlock ScalarBB = OrigPhi->getIncomingBlock(i); BasicBlock VectorBB = ScalarPred2VectorPred[ScalarBB]; // (also assert that ScalarBB is in the map) Building ScalarPred2VectorPred should be straight-forward IIUC, as we should be able to rely on the predecessor order being the same between vector and scalar versions: just set `ScalarPred2VectorPred[ScalarBBPredecessors[I]] = VectorBBPredecessors[I];` You also don't have to ScalarBBPredecessors/VectorBBPredecessors directly. You can just iterate over the predecessors for both and populate the map (you might want to use llvm::zip from STLExtras.h)

This revision now requires changes to proceed.May 17 2020, 10:14 AM

Oh, and could you please add a more descriptive title?

jaykang10 updated this revision to Diff 264512.May 17 2020, 2:05 PM

jaykang10 retitled this revision from [VPlan] Fix bug45985 to [VPlan] Fix wrong vec.phi generation.

jaykang10 marked an inline comment as done.May 17 2020, 2:42 PM

jaykang10 added inline comments.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4068	As you mentioned, I could create the map to keep the scalar BB and vector BB... I thought the dominate tree simpler solution to fix this bug because we need to change only this function rather than places where we need to update the new map... I would like to get other reviewers' opinion about this...

fhahn added inline comments.May 17 2020, 3:38 PM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4068	I think the map change would also be function local. We only need to map the scalar predecessors Here of a phi to the vector predecessors, as suggested in the comment. There should be no need to maintain it before/after this function

jaykang10 marked an inline comment as not done.May 21 2020, 12:31 PM

jaykang10 marked an inline comment as done.May 21 2020, 12:50 PM

jaykang10 added inline comments.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4068	I am sorry for late. Let's see the basic blocks of the phi from attached example. OrigPhi's BasicBlock dump for.body10.preheader: ; preds = %for.body4.lr.ph, %for.cond.cleanup9 %indvars.iv60 = phi i64 [ 0, %for.body4.lr.ph ], [ %indvars.iv.next61, %for.cond.cleanup9 ] ... In this basic block's loop %for.body4.lr.ph is preheader %for.cond.cleanup9 is latch NewPhi's BasicBlock dump for.body10.preheader1: ; preds = %for.cond.cleanup96, %vector.body** %vec.phi = phi <4 x i64> ... In this basic block's loop %for.cond.cleanup96 is latch %vector.body is preheader As you can see, the order of predecessor is different between the scalar and vector basic blocks. The scalar one's predecessor order is preheader first and latch next but the vector one is opposite. I am not sure whether it works to just map the scalar basic block to vector basic block in this function or not. I guess we could map it where we create the vector phi or somewhere we have the correct information. If I missed something, please let me know.

jaykang10 updated this revision to Diff 265582.May 21 2020, 12:53 PM

jaykang10 marked an inline comment as done.May 21 2020, 1:17 PM

jaykang10 added inline comments.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4068	Additiionally, for the vector basic block, the first predecessor is generated by creating preheader's terminator. The second one is generated by setting up latch's successor.

fhahn added inline comments.Jun 1 2020, 11:48 AM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4068	Sorry for the delay on this! I initially thought there was a mismatch between the predecessors and the order in the original PHI, but that's not the case. I am currently looking into how to fix the code that is supposed to ensure that the order of the predecessors in VP matches the order in the original IR.

jaykang10 abandoned this revision.Mar 6 2021, 11:00 AM

Herald added a subscriber: tschuett. · View Herald TranscriptMar 6 2021, 11:00 AM

rkruppe removed a subscriber: rkruppe.Mar 7 2021, 12:41 AM

Revision Contents

Path

Size

llvm/

lib/

Transforms/

Vectorize/

LoopVectorize.cpp

24 lines

Diff 264500

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 4,042 Lines • ▼ Show 20 Lines	assert(ScalarBBPredecessors.size() == VectorBBPredecessors.size() &&
"Scalar and Vector BB should have the same number of predecessors");		"Scalar and Vector BB should have the same number of predecessors");

// The insertion point in Builder may be invalidated by the time we get		// The insertion point in Builder may be invalidated by the time we get
// here. Force the Builder insertion point to something valid so that we do		// here. Force the Builder insertion point to something valid so that we do
// not run into issues during insertion point restore in		// not run into issues during insertion point restore in
// getOrCreateVectorValue calls below.		// getOrCreateVectorValue calls below.
Builder.SetInsertPoint(NewPhi);		Builder.SetInsertPoint(NewPhi);

// The predecessor order is preserved and we can rely on mapping between		// The order of predecessors can be different between OrigPhi BB and NewPhi
// scalar and vector block predecessors.		// BB. Check dominance of incoming values and map incoming value and BB
		// according to the dominace.
		SmallVector<Value *, 2> NewIncomingValues;
		bool NeedSwap = false;
for (unsigned i = 0; i < NumIncomingValues; ++i) {		for (unsigned i = 0; i < NumIncomingValues; ++i) {
BasicBlock *NewPredBB = VectorBBPredecessors[i];		BasicBlock *NewPredBB = VectorBBPredecessors[i];

// When looking up the new scalar/vector values to fix up, use incoming		// When looking up the new scalar/vector values to fix up, use incoming
// values from original phi.		// values from original phi.
Value *ScIncV =		Value *ScIncV =
OrigPhi->getIncomingValueForBlock(ScalarBBPredecessors[i]);		OrigPhi->getIncomingValueForBlock(ScalarBBPredecessors[i]);

// Scalar incoming value may need a broadcast		// Scalar incoming value may need a broadcast
Value *NewIncV = getOrCreateVectorValue(ScIncV, 0);		Value *NewIncV = getOrCreateVectorValue(ScIncV, 0);

		// Check dominance of NewIncV with NewPredBB.
		if (Instruction *NewIncI = dyn_cast<Instruction>(NewIncV)) {
		fhahnUnsubmitted Not Done Reply Inline Actions This seems a bit more complicated than necessary I think IIUC, if we have a map ScalarPred -> VectorPred (lets call it ScalarPred2VectorPred) we can get the blocks in the loop as follows: BasicBlock ScalarBB = OrigPhi->getIncomingBlock(i); BasicBlock VectorBB = ScalarPred2VectorPred[ScalarBB]; // (also assert that ScalarBB is in the map) Building ScalarPred2VectorPred should be straight-forward IIUC, as we should be able to rely on the predecessor order being the same between vector and scalar versions: just set `ScalarPred2VectorPred[ScalarBBPredecessors[I]] = VectorBBPredecessors[I];` You also don't have to ScalarBBPredecessors/VectorBBPredecessors directly. You can just iterate over the predecessors for both and populate the map (you might want to use llvm::zip from STLExtras.h) fhahn: This seems a bit more complicated than necessary I think IIUC, if we have a map ScalarPred ->…
		jaykang10AuthorUnsubmitted Not Done Reply Inline Actions As you mentioned, I could create the map to keep the scalar BB and vector BB... I thought the dominate tree simpler solution to fix this bug because we need to change only this function rather than places where we need to update the new map... I would like to get other reviewers' opinion about this... jaykang10: As you mentioned, I could create the map to keep the scalar BB and vector BB... I thought the…
		fhahnUnsubmitted Not Done Reply Inline Actions I think the map change would also be function local. We only need to map the scalar predecessors Here of a phi to the vector predecessors, as suggested in the comment. There should be no need to maintain it before/after this function fhahn: I think the map change would also be function local. We only need to map the scalar…
		jaykang10AuthorUnsubmitted Done Reply Inline Actions I am sorry for late. Let's see the basic blocks of the phi from attached example. OrigPhi's BasicBlock dump for.body10.preheader: ; preds = %for.body4.lr.ph, %for.cond.cleanup9 %indvars.iv60 = phi i64 [ 0, %for.body4.lr.ph ], [ %indvars.iv.next61, %for.cond.cleanup9 ] ... In this basic block's loop %for.body4.lr.ph is preheader %for.cond.cleanup9 is latch NewPhi's BasicBlock dump for.body10.preheader1: ; preds = %for.cond.cleanup96, %vector.body %vec.phi = phi <4 x i64> ... In this basic block's loop %for.cond.cleanup96 is latch %vector.body is preheader As you can see, the order of predecessor is different between the scalar and vector basic blocks. The scalar one's predecessor order is preheader first and latch next but the vector one is opposite. I am not sure whether it works to just map the scalar basic block to vector basic block in this function or not. I guess we could map it where we create the vector phi or somewhere we have the correct information. If I missed something, please let me know. jaykang10:** I am sorry for late. Let's see the basic blocks of the phi from attached example. OrigPhi's…
		jaykang10AuthorUnsubmitted Done Reply Inline Actions Additiionally, for the vector basic block, the first predecessor is generated by creating preheader's terminator. The second one is generated by setting up latch's successor. jaykang10: Additiionally, for the vector basic block, the first predecessor is generated by creating…
		fhahnUnsubmitted Not Done Reply Inline Actions Sorry for the delay on this! I initially thought there was a mismatch between the predecessors and the order in the original PHI, but that's not the case. I am currently looking into how to fix the code that is supposed to ensure that the order of the predecessors in VP matches the order in the original IR. fhahn: Sorry for the delay on this! I initially thought there was a mismatch between the predecessors…
		if (!DT->dominates(NewIncI, NewPredBB)) {
		NeedSwap = true;
		}
		}
		NewIncomingValues.push_back(NewIncV);
		}

		if (NeedSwap) {
		std::swap(NewIncomingValues[0], NewIncomingValues[1]);
		}

		for (unsigned i = 0; i < NumIncomingValues; ++i) {
		BasicBlock *NewPredBB = VectorBBPredecessors[i];
		Value *NewIncV = NewIncomingValues[i];
NewPhi->addIncoming(NewIncV, NewPredBB);		NewPhi->addIncoming(NewIncV, NewPredBB);
}		}
}		}
}		}

void InnerLoopVectorizer::widenGEP(GetElementPtrInst *GEP, unsigned UF,		void InnerLoopVectorizer::widenGEP(GetElementPtrInst *GEP, unsigned UF,
unsigned VF, bool IsPtrLoopInvariant,		unsigned VF, bool IsPtrLoopInvariant,
SmallBitVector &IsIndexLoopInvariant) {		SmallBitVector &IsIndexLoopInvariant) {
▲ Show 20 Lines • Show All 3,994 Lines • Show Last 20 Lines

This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Fix wrong vec.phi generationAbandonedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 264500

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

[VPlan] Fix wrong vec.phi generation
AbandonedPublic