This is an archive of the discontinued LLVM Phabricator instance.

[indvars] Use fact loop must exit to canonicalize to unsigned conditions
ClosedPublic

Authored by reames on Oct 14 2021, 1:53 PM.

Details

Summary

The logic in this patch is that if we find a comparison which would be unsigned except for when the loop is infinite, and we can prove that an infinite loop must be ill defined, we can still make the predicate unsigned.

The eventual goal (combined with a follow on patch) is to use the fact the loop exits to remove the zext (see tests) entirely.

A couple of points worth noting:

  • We loose the ability to prove the loop unreachable by committing to the must exit interpretation. If instead, we later proved that rhs was definitely outside the range required for finiteness, we could have killed the loop entirely. (We don't currently implement this transform, but could in theory, do so.)
  • simplifyAndExtend has a very limited list of users it walks. In particular, in the examples is stops at the zext and never visits the icmp. (Because we can't fold the zext to an addrec yet in SCEV.) Being willing to visit when we haven't simplified regresses multiple tests (seemingly because of less optimal results when computing trip counts).
  • D109457 is geared at the same basic problem, but handles a non-fully overlapping set of cases.

Here's an Alive2 proof of a related property:

declare void @llvm.assume(i1)

define i1 @src(i32 noundef %x, i16 noundef %y) {

%and = and i32 %x, 65532
%add = add i32 %and, -1
%assume = icmp ne i32 %add, -1
call void @llvm.assume(i1 %assume)
%zext = zext i16 %y to i32
%res = icmp sgt i32 %add, %zext
ret i1 %res

}

define i1 @tgt(i32 noundef %x, i16 noundef %y) {

%and = and i32 %x, 65532
%add = add i32 %and, -1
%assume = icmp ne i32 %add, -1
call void @llvm.assume(i1 %assume)
%zext = zext i16 %y to i32
%res = icmp ugt i32 %add, %zext
ret i1 %res

}

(In this patch, the must exit property provides the assumption in the alive2 example. You can also vary the predicate type to see it holds for other signed predicates)

Diff Detail

Event Timeline

reames created this revision.Oct 14 2021, 1:53 PM
reames requested review of this revision.Oct 14 2021, 1:53 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 14 2021, 1:53 PM
reames edited the summary of this revision. (Show Details)Oct 14 2021, 1:57 PM
reames added a subscriber: tgt.
nikic resigned from this revision.Oct 14 2021, 2:23 PM

I don't consider mustprogress-based transforms worthwhile...

reames added a subscriber: nikic.Oct 14 2021, 2:25 PM

I don't consider mustprogress-based transforms worthwhile...

Why?

mkazantsev added inline comments.Oct 15 2021, 12:10 AM
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1424

Looks like a great place to use pattern matching.

llvm/test/Transforms/IndVarSimplify/finite-exit-comparisons.ll
367

Why does it need mustprogress notion to do this at all? Zext is non-negative, 254 is non-negative, we can always do that regardless of mustprogress. We have this logic in eliminateIVComparison:

} else if (ICmpInst::isSigned(OriginalPred) &&
           SE->isKnownNonNegative(S) && SE->isKnownNonNegative(X)) {
  // If we were unable to make anything above, all we can is to canonicalize
  // the comparison hoping that it will open the doors for other
  // optimizations. If we find out that we compare two non-negative values,
  // we turn the instruction's predicate to its unsigned version. Note that
  // we cannot rely on Pred here unless we check if we have swapped it.
  assert(ICmp->getPredicate() == OriginalPred && "Predicate changed?");
  LLVM_DEBUG(dbgs() << "INDVARS: Turn to unsigned comparison: " << *ICmp
                    << '\n');
  ICmp->setPredicate(ICmpInst::getUnsignedPredicate(OriginalPred));

Why it didn't work?

Some time ago i addempted to do this in CVP via constant range reasoning:
D90924 (ignore the SCEV part)
Would that be worthwhile? Should i pick that back up?

reames added inline comments.Oct 15 2021, 9:03 AM
llvm/test/Transforms/IndVarSimplify/finite-exit-comparisons.ll
367

See the comment in code about why this can't be done in simplifyAndExtend. Short version: we never visit the icmp due to the presence of the zext, and if we do, we get overall worse results due to poisoning of SCEV caches.

I tried that approach first, and gave up after getting tangled up in SCEV dance around partially constructed SCEVs and trip count logic.

reames added inline comments.Oct 15 2021, 9:36 AM
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1424

Its not. :) I need to mutate the condition of the ICmp below, and the pattern matching code is just more complicated than what is here. I tried it; it's ugly. :)

llvm/test/Transforms/IndVarSimplify/finite-exit-comparisons.ll
367

I'd always planned to handle this in a follow up, but since you asked: D111896.

(We still have to work around the fact simplifyAndExtend never visits the icmp and avoid poisoning SCEV with sub-optimal results when we do.)

Some time ago i addempted to do this in CVP via constant range reasoning:
D90924 (ignore the SCEV part)
Would that be worthwhile? Should i pick that back up?

Yes, I think a CVP based form would be useful. We have the range information, we might as well use it.

If you want to implement the CVP part, I'd be happy to review. Alternatively, I can implement, and you can review. Your preference.

Some time ago i addempted to do this in CVP via constant range reasoning:
D90924 (ignore the SCEV part)
Would that be worthwhile? Should i pick that back up?

Yes, I think a CVP based form would be useful. We have the range information, we might as well use it.

If you want to implement the CVP part, I'd be happy to review. Alternatively, I can implement, and you can review. Your preference.

(CVP isn't the hard part, actually modelling it in ConstantRange is.)
I guess i just need to rebase the patch then.

Some time ago i addempted to do this in CVP via constant range reasoning:
D90924 (ignore the SCEV part)
Would that be worthwhile? Should i pick that back up?

Yes, I think a CVP based form would be useful. We have the range information, we might as well use it.

If you want to implement the CVP part, I'd be happy to review. Alternatively, I can implement, and you can review. Your preference.

(CVP isn't the hard part, actually modelling it in ConstantRange is.)
I guess i just need to rebase the patch then.

Why? The "both are positive" check should be pretty trivial given the two constant ranges for the operands? Are you looking to go much beyond that? If so, maybe split the changes?

mkazantsev added inline comments.Oct 17 2021, 10:59 PM
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1442

isSigned && isRelational?

reames updated this revision to Diff 380485.Oct 18 2021, 11:15 AM

Address review comments

mkazantsev added inline comments.Oct 19 2021, 8:19 AM
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1873

I have a deja vu. :) We haven't folded any block here, right?

reames updated this revision to Diff 380757.Oct 19 2021, 12:39 PM

Rebase over landed changes.

reames added inline comments.Oct 19 2021, 12:44 PM
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1873

Unfortunately, I don't think so. The problem is that this change could actually be changing the trip count. If we'd otherwise figured out this exit was never taken (and thus the loop was full out UB), this converts the loop into a finite one, changing the trip count in the process. This is legal (since the original cached value implied UB), but I don't think we can safely leave SCEV in a stale state.

We might be able to argue that we never cache the UB fact, but that makes me nervous since we've been generally teaching SCEV to exploit exactly that type of fact and even if we don't today, we might reasonable do so in the near future.

While writing this, I did have an idea on how to do cheaper invalidation here, but it requires the use list mechanism you were proposing for SCEV, and lazily invalidating trip counts. If you don't mind, I'll defer that to later.

reames updated this revision to Diff 380763.Oct 19 2021, 1:02 PM

Minor invalidation improvement. Only use the heavy weight hammer in the case which actually needs it.

mkazantsev accepted this revision.Oct 22 2021, 12:27 AM

LGTM, but I still think that "folded" is not a proper term here.

llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1873

I don't mind that you forget the loop. My comment was about statement that we've folded something, when we didn't. Maybe I just misunderstand what "folded" meant, but it was about changing some condition to true/false, which is not the case here.

This revision is now accepted and ready to land.Oct 22 2021, 12:27 AM
This revision was landed with ongoing or failed builds.Oct 22 2021, 10:32 AM
This revision was automatically updated to reflect the committed changes.
reames added inline comments.Oct 22 2021, 10:40 AM
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1873

Oh! That's completely not what I thought you meant.

I actually agree. The phrasing "folded exit block" is odd. However, it's verbatim the comment used elsewhere in the file for the same case, and I figured consistency was better than perfection.

It's also worth noting this is majorly conservative if the outer loop *doesn't* actually share an exit block. We really should be walking out only a far as actually needed. But since we've got one major rewrite of SCEV invalidation in flight, I didn't want to start another at the same time.

spatel added a subscriber: spatel.Oct 25 2021, 6:23 AM

This patch causes an infinite loop in a compiled program:
https://llvm.org/PR52276

Here's an IR reduction of that example that shows what appears to be a miscompile with "opt -indvars" :

define i32 @PR52276(i32 %a, i32* %p) {
entry:
  %cmp = icmp sgt i32 %a, 0
  %conv = zext i1 %cmp to i32
  %neg = xor i32 %a, -1
  %cmp1 = icmp slt i32 %conv, %neg ; !!! This compare changes to ult
  br i1 %cmp1, label %ph, label %exit

ph:
  %b = load i32, i32* %p, align 4
  br label %loop

loop:
  %inc1 = phi i32 [ %b, %ph ], [ %inc, %loop ]
  %inc = add nsw i32 %inc1, 1
  br i1 %cmp1, label %loop, label %crit_edge, !llvm.loop !0

crit_edge:
  %inc2 = phi i32 [ %inc, %loop ]
  store i32 %inc2, i32* %p, align 4
  br label %exit

exit:
  ret i32 0
}

!0 = distinct !{!0, !1}
!1 = !{!"llvm.loop.mustprogress"}

This patch causes an infinite loop in a compiled program:
https://llvm.org/PR52276

Confirmed. The problem is a missing one-use check. The reasoning on the exit condition is valid, but the application of that to a use outside the loop is not. I'm going to revert and then reapply with the fix to make the history a bit easier to follow.

This patch causes an infinite loop in a compiled program:
https://llvm.org/PR52276

Confirmed. The problem is a missing one-use check. The reasoning on the exit condition is valid, but the application of that to a use outside the loop is not. I'm going to revert and then reapply with the fix to make the history a bit easier to follow.

Should be fixed in f82cf618.

FYI, the core must-exit logic from this patch was reverted in d4708fa4. See my response to my own revert on llvm-commits for explanation of why, and why I don't plan to reintroduce a fixed version.