This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] don't try to query getSCEV() for incomplete PHIs
ClosedPublic

Authored by shchenz on Apr 6 2020, 8:40 AM.

Details

Summary

SCEVExpander::getAddRecExprPHILiterally might call itself recursively (when calling expandIVInc), and in the process attempts to create a SCEV for the PHINode it is constructing.

For below PHI SCEV:
{(-4 + %.ap)<nsw>,+,(4 * (1 + (-1 * %_val_n_) + %_val_lda_)),+,4}<%_loop_1_do_>

In getAddRecExprPHILiterally, it will first create an incomplete PHI without any incomming values. And then it will recurisivly undirectly call getAddRecExprPHILiterally to expand another PHI node for step SCEV (4 * (1 + (-1 * %_val_n_) + %_val_lda_)),+,4 and add the expanded PHINode to the incomplete PHI's incomming value list.

The issue happens in expansion of the step SCEV. Before do literally expansion in SCEVExpander in getAddRecExprPHILiterally, it will first try to find an existing PHI for reusing. It uses SCEV to do the equality check. So need to getSCEV() for all PHIs in the same block including the incomplete PHI we just created above.

When we call getSCEV() for an incomplete PHI, like %lsr.iv231 = phi [0 x %_elem_type_of_ap]*, this PHI node will be simplified to undef while calling getSCEV(). the pair stored in ExprToIVMap for %lsr.iv231 is like {undef, SCEVUnknown}. Even after %lsr.iv231 is populated with all its incoming values, the pair stored in ExprToIVMap for %lsr.iv231 is not changed.

So if we have more than one incomplete PHI like above simplified to undef, they are all mapped to same SCEVUnknown object. So the PHIs are all treated as same if we use ExprToIVMap to query.

This is not true, we don't even know the incoming values for the PHIs.

This patch tries to fix this to don't call getSCEV() for incomplete PHIs per @mkazantsev 's comments

Diff Detail

Event Timeline

w2yehia created this revision.Apr 6 2020, 8:40 AM
w2yehia edited the summary of this revision. (Show Details)Apr 6 2020, 8:41 AM
w2yehia edited the summary of this revision. (Show Details)
w2yehia edited the summary of this revision. (Show Details)Apr 6 2020, 9:01 AM

I prefer solution 1 if SCEV for an incomplete PHI is meaningless.

%incomplete_1 = phi i32 [0, %entry] will be simplified to undef in createNodeForPHI()->SimplifyInstruction(), and for every incomplete PHI like above will be mapped to same SCEVUnknown object in ValueExprMap and will not be updated even after the PHI is populated with all its incoming values. So SCEV returns the wrong cached SCEVUnknown value to the populated PHI values.

This is missing tests.

w2yehia updated this revision to Diff 256165.Apr 8 2020, 8:11 PM

unit test added

Any chance this can get reviewed today. Thanks.

The change in ScalarEvolution looks good to me.

Some comments:
1: You might need to modify the summary a little bit especially for the testcases part.
2: I am a little concern about the unitcase in unittests/Analysis/ScalarEvolutionTest.cpp, It hides too many details in scalar evolution and not shows the bug well. I cooked a case for your reference. Better to commit one nfc patch first with current behavior, then rebase this one, and we can get a better show about what the bug is.

+TEST_F(ScalarEvolutionsTest, SCEVIncompletePHI) {
+  Module NIM("nonintegral", Context);
+  std::string DataLayout = M.getDataLayoutStr();
+  if (!DataLayout.empty())
+    DataLayout += "-";
+  DataLayout += "ni:10";
+  NIM.setDataLayout(DataLayout);
+
+  Type *T_int64 = Type::getInt64Ty(Context);
+  Type *T_int1 = Type::getInt1Ty(Context);
+  Type *T_pint64 = T_int64->getPointerTo(10);
+
+  FunctionType *FTy =
+      FunctionType::get(Type::getVoidTy(Context), {T_pint64}, false);
+  Function *F = Function::Create(FTy, Function::ExternalLinkage, "foo", NIM);
+
+  BasicBlock *Top = BasicBlock::Create(Context, "top", F);
+  BasicBlock *LPh = BasicBlock::Create(Context, "L.ph", F);
+  BasicBlock *L = BasicBlock::Create(Context, "L", F);
+  BasicBlock *Post = BasicBlock::Create(Context, "post", F);
+
+  IRBuilder<> Builder(Top);
+  Builder.CreateBr(LPh);
+
+  Builder.SetInsertPoint(LPh);
+  Builder.CreateBr(L);
+
+  Builder.SetInsertPoint(L);
+  // Create empty PHIs
+  PHINode *Phi = Builder.CreatePHI(T_int64, 2);
+  PHINode *Phi2 = Builder.CreatePHI(T_int64, 2);
+
+  ScalarEvolution SE = buildSE(*F);
+
+  const SCEV *OldS = SE.getSCEV(Phi);
+  const SCEV *OldS2 = SE.getSCEV(Phi2);
+
+  // No simplify here, so different SCEVUnknown for unpopulated PHIs.
+  EXPECT_NE(OldS, OldS2);
+
+  auto *Add = cast<Instruction>(
+      Builder.CreateAdd(Phi, ConstantInt::get(T_int64, 1), "add"));
+  Builder.CreateCondBr(UndefValue::get(T_int1), L, Post);
+
+  // Populate PHIs with different start values.
+  Phi->addIncoming(ConstantInt::get(T_int64, 0), LPh);
+  Phi->addIncoming(Add, L);
+
+  Phi2->addIncoming(ConstantInt::get(T_int64, 100), LPh);
+  Phi2->addIncoming(Add, L);
+  Builder.SetInsertPoint(Post);
+  Builder.CreateRetVoid();
+
+  // Get SCEVUnknown for populated PHIs through getUnknown()
+  const SCEV *getUnknownNewS = SE.getUnknown(Phi);
+  const SCEV *getUnknownNewS2 = SE.getUnknown(Phi2);
+
+  EXPECT_NE(getUnknownNewS, getUnknownNewS2);
+  // Get the SCEVUnknown object in cache UniqueSCEVs.
+  EXPECT_EQ(getUnknownNewS, OldS);
+  EXPECT_EQ(getUnknownNewS2, OldS2);
+
+  // Get SCEVUnknown for populated PHIs through getSCEV()
+  const SCEV *getSCEVNewS = SE.getSCEV(Phi);
+  const SCEV *getSCEVNewS2 = SE.getSCEV(Phi2);
+  EXPECT_EQ(getSCEVNewS, getUnknownNewS);
+  EXPECT_EQ(getSCEVNewS2, getUnknownNewS2);
+}
+

Thanks Zheng !! Your testcase is indeed more focused on the exact issue in getSCEV.
I will wait to get more feedback on the correctness of the fix. Once it's decided that it's the correct fix, I will post an NFC review that adds your testcase with the "old behavior" expectations, and then this patch will amend the testcase.

gentle ping. Thanks

gentle ping. Thanks

gentle ping. Thanks

gentle ping.
Just to summarize. The testcase demonstrating the issue is added to llvm/unittests/Analysis/ScalarEvolutionTest.cpp.
The description of what the issue is, is in the "The problem" section in the description of this review.
The solution implemented in this review is to detect when an "incomplete" PHI is being createSCEV'ed and return a SCEVUnknown.
Feedback is greatly appreciated.

shchenz added a subscriber: sanjoy.May 13 2020, 7:17 PM

I think we should fix this SCEV bug as it leads to runtime functionality issue. Could @sanjoy or any experts familar with the SCEV have a look at this issue?

I think we should just never query SCEVs for incomplete Phis as this operation makes no sense. It should be a bug in SCEV expander. Do you know why it makes this query to this Phi?

Let's frame it this way: please add an XFAIL test which runs some passes and fails an assertion with the situation you are describing. I think hiding the bug by returning SCEVUnknown is just wrong. We should investigate why we are making this query instead.

shchenz commandeered this revision.Jul 22 2020, 3:19 AM
shchenz edited reviewers, added: w2yehia; removed: shchenz.
mkazantsev requested changes to this revision.Jul 28 2020, 5:15 AM
This revision now requires changes to proceed.Jul 28 2020, 5:15 AM
shchenz updated this revision to Diff 281497.Jul 29 2020, 3:23 AM
shchenz edited the summary of this revision. (Show Details)

address @mkazantsev 's comments:
1: don't call getSCEV() for incomplete PHIs

shchenz edited the summary of this revision. (Show Details)Jul 29 2020, 3:26 AM
shchenz retitled this revision from [SCEV] CreateNodeForPhi should return SCEVUnknown for incomplete PHIs to [SCEV] don't try to query getSCEV() for incomplete PHIs.
lebedev.ri added inline comments.Jul 29 2020, 5:06 AM
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1192

I don't think this is correct fix. It is perfectly legal to have e.g. the following:

pred:
 <...>
 br label %succ
succ:
 %10 = phi i32 [ 0, %pred ], [ 0, %pred ]
shchenz added inline comments.Jul 29 2020, 5:44 AM
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1192

Here we want to identify an incomplete PHI.
The incomplete PHI is like %10 = phi i32, hasNPredecessors should be set when the PHI node is created, so it is 2.

PHINode *PN = Builder.CreatePHI(ExpandTy, std::distance(HPB, HPE),
                                Twine(IVName) + ".iv");

But its getNumIncomingValues() is 0 because it is not populated yet.

Do you have any other ways to identify an incomplete PHI? @lebedev.ri

shchenz added inline comments.Jul 29 2020, 5:54 AM
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1192

should "predecessors number of PHI's parent is bigger than PHI's incoming value number" be a right way? @lebedev.ri

Somewhat better test (although needs further manual cleanup), that produces different results with/without the fix:

; RUN: opt %s -loop-reduce -S | FileCheck %s

target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%0 = type <{ float }>

define void @foo([0 x %0]* %arg) {
bb:
  %i = getelementptr inbounds [0 x %0], [0 x %0]* %arg, i64 0, i64 -1
  %i1 = bitcast %0* %i to i8*
  %i2 = getelementptr i8, i8* %i1, i64 4
  br label %bb3

bb3:                                              ; preds = %bb18, %bb
  %i4 = phi i64 [ %i20, %bb18 ], [ 0, %bb ]
  %i5 = phi i64 [ %i21, %bb18 ], [ 1, %bb ]
  br label %bb6

bb6:                                              ; preds = %bb3
  br i1 undef, label %bb7, label %bb8

bb7:                                              ; preds = %bb17, %bb6
  br label %bb22

bb8:                                              ; preds = %bb6
  br label %bb9

bb9:                                              ; preds = %bb9, %bb8
  %i10 = phi i64 [ 0, %bb8 ], [ %i16, %bb9 ]
  %i11 = add i64 %i10, %i4
  %i12 = shl i64 %i11, 2
  %i13 = getelementptr i8, i8* %i2, i64 %i12
  %i14 = bitcast i8* %i13 to float*
  %i15 = bitcast float* %i14 to <4 x float>*
  store <4 x float> undef, <4 x float>* %i15, align 4
  %i16 = add i64 %i10, 32
  br i1 true, label %bb17, label %bb9

bb17:                                             ; preds = %bb9
  br i1 undef, label %bb18, label %bb7

bb18:                                             ; preds = %bb17
  %i19 = add i64 undef, %i4
  %i20 = add i64 %i19, %i5
  %i21 = add nuw nsw i64 %i5, 1
  br label %bb3

bb22:                                             ; preds = %bb22, %bb7
  %i23 = phi i64 [ %i26, %bb22 ], [ undef, %bb7 ]
  %i24 = add nsw i64 %i23, %i4
  %i25 = getelementptr %0, %0* %i, i64 %i24, i32 0
  store float undef, float* %i25, align 4
  %i26 = add nuw nsw i64 %i23, 1
  br label %bb22
}

I would think getSCEV() should be returning nullptr if it is asked about an instruction that is currently being created, but i don't know.

shchenz updated this revision to Diff 281576.Jul 29 2020, 7:28 AM

address @lebedev.ri comments:
1: use right way to identify an incomplete phi node

Somewhat better test (although needs further manual cleanup), that produces different results with/without the fix:

; RUN: opt %s -loop-reduce -S | FileCheck %s

target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%0 = type <{ float }>

define void @foo([0 x %0]* %arg) {
bb:
  %i = getelementptr inbounds [0 x %0], [0 x %0]* %arg, i64 0, i64 -1
  %i1 = bitcast %0* %i to i8*
  %i2 = getelementptr i8, i8* %i1, i64 4
  br label %bb3

bb3:                                              ; preds = %bb18, %bb
  %i4 = phi i64 [ %i20, %bb18 ], [ 0, %bb ]
  %i5 = phi i64 [ %i21, %bb18 ], [ 1, %bb ]
  br label %bb6

bb6:                                              ; preds = %bb3
  br i1 undef, label %bb7, label %bb8

bb7:                                              ; preds = %bb17, %bb6
  br label %bb22

bb8:                                              ; preds = %bb6
  br label %bb9

bb9:                                              ; preds = %bb9, %bb8
  %i10 = phi i64 [ 0, %bb8 ], [ %i16, %bb9 ]
  %i11 = add i64 %i10, %i4
  %i12 = shl i64 %i11, 2
  %i13 = getelementptr i8, i8* %i2, i64 %i12
  %i14 = bitcast i8* %i13 to float*
  %i15 = bitcast float* %i14 to <4 x float>*
  store <4 x float> undef, <4 x float>* %i15, align 4
  %i16 = add i64 %i10, 32
  br i1 true, label %bb17, label %bb9

bb17:                                             ; preds = %bb9
  br i1 undef, label %bb18, label %bb7

bb18:                                             ; preds = %bb17
  %i19 = add i64 undef, %i4
  %i20 = add i64 %i19, %i5
  %i21 = add nuw nsw i64 %i5, 1
  br label %bb3

bb22:                                             ; preds = %bb22, %bb7
  %i23 = phi i64 [ %i26, %bb22 ], [ undef, %bb7 ]
  %i24 = add nsw i64 %i23, %i4
  %i25 = getelementptr %0, %0* %i, i64 %i24, i32 0
  store float undef, float* %i25, align 4
  %i26 = add nuw nsw i64 %i23, 1
  br label %bb22
}

I would think getSCEV() should be returning nullptr if it is asked about an instruction that is currently being created, but i don't know.

Thanks very much for the case. I will update it later. @lebedev.ri

For the solution, the first try for this issue is to return SCEVUnknown for an incomplete PHI before this PHI get simplified. So different incomplete PHI get different SCEVUnknown objects. I think this is a little like your propose about returning nullptr, right?

I think there are several issues for this case:
1: SCEVExpander should not call getSCEV for incomplete PHIs. We can never reuse an incomplete PHI. This is our current fix.
2: if the instructions are still being built, getSCEV() should not simplify them and then return a SCEVUnknown object. Different incomplete instructions may have the same simplifying result and then will get same SCEV expression. But after finishing building the instructions, the instructions could be different.
3: When we create a SCEVUnknown object for same instruction, should we update the ExprToIVMap cache? The first time we create SCEVUnknown (say ObjectA) for an incomplete instruction(say InstructionA), we cache them with {ObjectA, InstructionA}. After finishing building the instructions, we may direct call getUnknown(PHI) for InstructionA, this time SCEV will create a new SCEVUnknown object, (say ObjectB) for the same instruction, but we will not update the ExprToIVMap cache. This should be another issue. If you call getUnknown many times, you may get different unknown objects, but in the cache ExprToIVMap, it is always the legacy one.

Currently we choose the first way to solve this issue per @mkazantsev comments and I think it makes more sense.
But I think issue 2 and issue 3 should still be potential bugs inside scev. But now we don't have cases to expose them.

shchenz edited the summary of this revision. (Show Details)Jul 29 2020, 7:50 AM
mkazantsev added inline comments.Jul 30 2020, 1:20 AM
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1192

If we catch phi in mid-construction (when we populated some of inputs but not all of them), it may have more predecessors than inputs (and still be incomplete). I think PN.isComplete() is the correct fix. Roman, why do you think querying SCEV in your example is OK?

shchenz updated this revision to Diff 281831.Jul 30 2020, 1:46 AM

recook a better case from @lebedev.ri

shchenz updated this revision to Diff 281875.Jul 30 2020, 4:38 AM

fix code format

mkazantsev accepted this revision.Jul 31 2020, 3:01 AM

LGTM. Please wait for Roman if he has any objections.

This revision is now accepted and ready to land.Jul 31 2020, 3:01 AM
lebedev.ri accepted this revision.Jul 31 2020, 4:06 AM

I still think this should be solved on a different level, but LG i guess.

llvm/include/llvm/IR/Instructions.h
2749

Can any code be cleaned up to use this?

llvm/lib/IR/Instructions.cpp
163–174 ↗(On Diff #281875)
shchenz updated this revision to Diff 282374.Jul 31 2020, 11:35 PM

refactor according to @lebedev.ri comments

llvm/include/llvm/IR/Instructions.h
2749

Had a look at the code for getBasicBlockIndex, did not find a suitable user.

llvm/lib/IR/Instructions.cpp
163–174 ↗(On Diff #281875)

Like this refactor

This revision was landed with ongoing or failed builds.Jul 31 2020, 11:50 PM
This revision was automatically updated to reflect the committed changes.