Page MenuHomePhabricator

[SCEV] Guard movement of insertion point for loop-invariants (take 2)
ClosedPublic

Authored by wristow on Jan 29 2019, 5:12 PM.

Details

Summary

This reinstates r347934, along with a tweak to address a problem with PHI node ordering that that commit created (or exposed). That commit was reverted at r348426, due to the PHI node issue.

Original commit message:

r320789 suppressed moving the insertion point of SCEV expressions with
dev/rem operations to the loop header in non-loop-invariant situations.
This, and similar, hoisting is also unsafe in the loop-invariant case,
since there may be a guard against a zero denominator. This is an
adjustment to the fix of r320789 to suppress the movement even in the
loop-invariant case.

This fixes PR30806.

Here is a reduced test-case that demonstrates the PHI node issue:

extern void foo(char *, unsigned , unsigned *);
extern void bar(int *, long);
extern char *processBuf(char *);

extern unsigned theSize;

void foo(char *buf, unsigned denominator, unsigned *flag) {
  int incr = (int) (theSize / denominator);
  int inx = 0;
  while (*flag) {
    int itmp = inx + incr;
    int i = (int) theSize;
    bar(&i, (long) itmp);
    buf = processBuf(buf);
    inx = itmp;
  }
}

When compiling this test-case at -O2 using r347934, an instruction is placed between PHI nodes during Induction Variable Simplification, and that causes an asssertion to fail later on:

$ clang -c -O2 tst.c
  ...
llvm/include/llvm/IR/Instructions.h:2714: llvm::Value* llvm::PHINode::getOperand(unsigned int) const: Assertion `i_nocapture < OperandTraits<PHINode>::operands(this) && "getOperand() out of range!"' failed.
  ...
$

The test-case "pr30806-phi-scev.ll" in this proposed patch shows the issue when running opt with -indvars:

$ opt -S -indvars -o x.ll pr30806-phi-scev.ll
PHI nodes not grouped at top of basic block!
  %buf.addr.07 = phi i8* [ %buf, %while.body.lr.ph ], [ %call, %while.body ]
label %while.body
in function foo
LLVM ERROR: Broken function found, compilation aborted!
$

To be clear, this PHI node problem doesn't happen with current ToT, because r347934 was reverted at r348426 (although the integer-div-by-0 problem of PR30806 does still happen with ToT). The patch proposed here is the original work of r347934, with an adjustment to the insertion-point, by checking whether the insertion-point is at a PHI node. Specifically, it adds the following two lines of code (along with an additional test-case) to the commit of r347934:

if (isa<PHINode>(*InsertPt))
  InsertPt = &*InsertPt->getParent()->getFirstInsertionPt();

Bluntly, I feel this is safe, but I wonder whether it is "brute-force". That is, when the InsertPt is a PHINode, it's not legal -- and calling getFirstInsertPt() as shown above will fix it. But I wonder whether the underlying bug is that the InsertPt should be guaranteed to never be a PHINode at that point in the code?

To put it another way, this problem didn't manifest prior to r347934 (or from a historical perspective, the same issue appeared when a similar SCEV fix was made at r289412, but which was reverted shortly thereafter at r289453). Suppressing the hoisting of some expressions (which is what r347934 did) seems like it should be safe. So it feels like there is a deeper issue in that suppressing the hoisting (because SafeToHoist returned false) surprisingly resulted in the placement of a non-PHI-node instruction between two PHI-nodes. I thought if I could come up with a test-case that has this PHI-node issue irrespective of whether SafeToHoist returned true or false (that is, even before r347934 was made), I might have better luck analyzing it and finding a deeper root cause. But I was unable to find a test-case that triggered this when the SafeToHoist returned true. From that perspective, I considered only adding the two new lines that move the InsertPt when SafeToHoist was false, but if the addition of the above two lines is a "proper solution", it seems like it's perfectly reasonable to do that irrespective of what SafeToHoist returns.

Diff Detail

Repository
rL LLVM

Event Timeline

wristow created this revision.Jan 29 2019, 5:12 PM
wristow added a subscriber: dlj.Jan 29 2019, 5:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 7 2019, 7:03 PM

I did not look into this too deeply, but the problem you're describing sounds like it can be solved by setting InsertPt to getFirstNonPHI at some point.

I did not look into this too deeply, but the problem you're describing sounds like it can be solved by setting InsertPt to getFirstNonPHI at some point.

Thanks for the comment @mkazantsev. I think the two lines I added (relative to the commit of r347934, which exposed this PHINode issue):

if (isa<PHINode>(*InsertPt))
  InsertPt = &*InsertPt->getParent()->getFirstInsertionPt();

are doing essentially that (with a slight tweak). Specifically, that call to getFirstInsertionPt() does call getFirstNonPHI(), and then also checks to see whether that gets to a LandingPad instruction (and if so, skips over the LandingPad, too). Since the other adjustments to InsertPt in this code also use this "wrapper" of getFirstInsertionPt() rather than calling getFirstNonPHI() directly, I felt safer using that approach.

Apologies for the late review Warren.

I agree that it is weird that the insert point is a PHI node. Have you traced through the code to see how we're ending up in that situation? Is IndVarSimplify choosing this insertion point?

Apologies for the late review Warren.

I agree that it is weird that the insert point is a PHI node. Have you traced through the code to see how we're ending up in that situation? Is IndVarSimplify choosing this insertion point?

Thanks for the comment, Sanjoy.

Loosely, yes, IndVarSimplify is choosing this insertion point. To elaborate, it selects that insertion point in WidenIV::createWideIV(), in the following code:

// The rewriter provides a value for the desired IV expression. This may
// either find an existing phi or materialize a new one. Either way, we
// expect a well-formed cyclic phi-with-increments. i.e. any operand not part
// of the phi-SCC dominates the loop entry.
Instruction *InsertPt = &L->getHeader()->front();
WidePhi = cast<PHINode>(Rewriter.expandCodeFor(AddRec, WideType, InsertPt));

where the header Basic Block starts out with 2 PHI nodes.

(Side note: The AddRec here is:

{0,+,(sext i32 (%0 /u %denominator) to i64)}<nsw><%while.body>

More on this point below.)

To me, it doesn't feel like changing the above code to use getFirstInsertionPt() on the parent BB of InsertPt is a good approach. I just experimented with that now, and I found it solved the first instance of the insertion point being a PHI node, but as the above AddRec was recursively processed, it eventually called SCEVExpander::getAddRecExprPHILiterally(), which has the following code:

// Expand the step somewhere that dominates the loop header.
Value *StepV = expandCodeFor(Step, IntTy, &L->getHeader()->front());

That &L->getHeader()->front() arg to expandCodeFor() is passed to setInsertPoint(), and this again causes the same trouble. If I change this to also get the getFirstInsertionPt() on the parent BB of &L->getHeader()->front(), then my test-case passes.

So tweaking these two points (WidenIV::createWideIV() and SCEVExpander::getAddRecExprPHILiterally()) is another possible solution to investigate. But (a) I don't feel like these changes are safe (but I'm not experienced in this area), and (b) I wonder whether there are other places that would need similar changes but just aren't triggered by my test-case.

One further observation about the AddRec:

{0,+,(sext i32 (%0 /u %denominator) to i64)}<nsw><%while.body>

All that is really being attempted here is to hoist the sign-extension of the quotient out of a loop. The actual division is before the loop in the input .ll file (and it was before the loop in the original C++ code). In other words, we aren't trying to hoist an unsafe division out of the loop because the division never was even in the loop. But SafeToHoist() isn't aware of that -- it "sees" the /u in the expression, and decides it shouldn't hoist it. If instead, we could somehow know that the only expression that we're considering hoisting is the sext (and then allow that expression to be hoisted), then all would be fine (FTR, the sext eventually does get hoisted by other optimizations). I say this with the caveat that I don't deeply understand this SCEV mechanism.

sanjoy accepted this revision.Mar 16 2019, 2:00 PM

Ok, looks like we're somewhat sloppy with our choice of insertion points here (and it should not be incumbent on you to fix this). Can you please add a comment that mentions that this is due to indvarsimplify?

This revision is now accepted and ready to land.Mar 16 2019, 2:00 PM

Thanks for the review @sanjoy . I've updated it with a comment about IndVarSimplify, and will commit.

This revision was automatically updated to reflect the committed changes.