This is an archive of the discontinued LLVM Phabricator instance.

Differential D111836

[indvars] Use fact loop must exit to canonicalize to unsigned conditions
ClosedPublic

Authored by reames on Oct 14 2021, 1:53 PM.

Details

Reviewers

mkazantsev
fhahn
efriedma
nikic

Commits

rG412eb07edd4e: [indvars] Use fact loop must exit to canonicalize to unsigned conditions

Summary

The logic in this patch is that if we find a comparison which would be unsigned except when the loop is infinite, and we can prove that an infinite loop must be ill-defined, we can still make the predicate unsigned.

The eventual goal (combined with a follow-on patch) is to use the fact that the loop exits to remove the zext (see tests) entirely.

A couple of points worth noting:

We lose the ability to prove the loop unreachable by committing to the must-exit interpretation. If instead we later proved that the RHS was definitely outside the range required for finiteness, we could have killed the loop entirely. (We don't currently implement this transform, but could, in theory, do so.)
simplifyAndExtend has a very limited list of users it walks. In particular, in the examples it stops at the zext and never visits the icmp (because we can't fold the zext to an addrec yet in SCEV). Being willing to visit when we haven't simplified regresses multiple tests (seemingly because of less optimal results when computing trip counts).
D109457 is geared at the same basic problem, but handles a non-fully overlapping set of cases.

Here's an Alive2 proof of a related property:

declare void @llvm.assume(i1)

define i1 @src(i32 noundef %x, i16 noundef %y) {

%and = and i32 %x, 65532
%add = add i32 %and, -1
%assume = icmp ne i32 %add, -1
call void @llvm.assume(i1 %assume)
%zext = zext i16 %y to i32
%res = icmp sgt i32 %add, %zext
ret i1 %res

}

define i1 @tgt(i32 noundef %x, i16 noundef %y) {

%and = and i32 %x, 65532
%add = add i32 %and, -1
%assume = icmp ne i32 %add, -1
call void @llvm.assume(i1 %assume)
%zext = zext i16 %y to i32
%res = icmp ugt i32 %add, %zext
ret i1 %res

}

(In this patch, the must-exit property provides the assumption in the Alive2 example. You can also vary the predicate type to see that it holds for other signed predicates.)
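The related property can also be spot-checked outside Alive2. The sketch below is a plain C++ model of the two functions above and is not part of the patch; `srcTgtAgree` and `countSampledMismatches` are hypothetical names. The sweep samples `%y` rather than covering it exhaustively, so it is a sanity check under those assumptions, not a proof.

```cpp
#include <cassert>
#include <cstdint>

// Models @src and @tgt above for one input pair. Returns true when the
// llvm.assume precondition fails (such inputs are out of scope) or when
// `icmp sgt` and `icmp ugt` produce the same result.
bool srcTgtAgree(uint32_t x, uint16_t y) {
  const uint32_t masked = x & 65532u;  // %and = and i32 %x, 65532
  const uint32_t add = masked - 1u;    // %add = add i32 %and, -1 (wraps)
  if (add == 0xFFFFFFFFu)              // assume(%add != -1) does not hold
    return true;
  // Under the assume, %add is at most 65531, so it fits in int32_t.
  assert(add <= 65531u);
  const uint32_t zext = y;             // %zext = zext i16 %y to i32
  const bool sgt = static_cast<int32_t>(add) > static_cast<int32_t>(zext);
  const bool ugt = add > zext;
  return sgt == ugt;
}

// Sweeps every value of the low 16 bits of %x (the mask discards the rest)
// against 256 evenly spaced values of %y, including 0 and 65535.
uint64_t countSampledMismatches() {
  uint64_t mismatches = 0;
  for (uint32_t x = 0; x <= 0xFFFFu; ++x)
    for (uint32_t y = 0; y <= 0xFFFFu; y += 257u)
      if (!srcTgtAgree(x, static_cast<uint16_t>(y)))
        ++mismatches;
  return mismatches;
}
```

Both compare operands are non-negative as i32 values once the assume holds, which is why the signed and unsigned results coincide in this model.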

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

reames created this revision. Oct 14 2021, 1:53 PM

Herald added subscribers: javed.absar, bollu, hiraditya, mcrosier. Oct 14 2021, 1:53 PM

reames requested review of this revision. Oct 14 2021, 1:53 PM

Herald added a project: Restricted Project. Oct 14 2021, 1:53 PM

reames mentioned this in D109457: [SCEV] Use constant range of RHS to prove NUW on narrow IV in trip count logic. Oct 14 2021, 1:54 PM

reames edited the summary of this revision. Oct 14 2021, 1:57 PM

reames added a subscriber: tgt.

I don't consider mustprogress-based transforms worthwhile...

In D111836#3065366, @nikic wrote:

I don't consider mustprogress-based transforms worthwhile...

Why?

Harbormaster completed remote builds in B128948: Diff 379831. Oct 14 2021, 3:18 PM

mkazantsev added inline comments. Oct 15 2021, 12:10 AM

llvm/lib/Transforms/Scalar/IndVarSimplify.cpp

1424

Looks like a great place to use pattern matching.

llvm/test/Transforms/IndVarSimplify/finite-exit-comparisons.ll

367

Why does it need mustprogress notion to do this at all? Zext is non-negative, 254 is non-negative, we can always do that regardless of mustprogress. We have this logic in eliminateIVComparison:

} else if (ICmpInst::isSigned(OriginalPred) &&
           SE->isKnownNonNegative(S) && SE->isKnownNonNegative(X)) {
  // If we were unable to make anything above, all we can is to canonicalize
  // the comparison hoping that it will open the doors for other
  // optimizations. If we find out that we compare two non-negative values,
  // we turn the instruction's predicate to its unsigned version. Note that
  // we cannot rely on Pred here unless we check if we have swapped it.
  assert(ICmp->getPredicate() == OriginalPred && "Predicate changed?");
  LLVM_DEBUG(dbgs() << "INDVARS: Turn to unsigned comparison: " << *ICmp
                    << '\n');
  ICmp->setPredicate(ICmpInst::getUnsignedPredicate(OriginalPred));

Why didn't it work?
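The point about non-negative operands can be checked exhaustively at a small width. The following is a minimal sketch at 8 bits, independent of LLVM; `Rel`, `eval`, and `countDisagreements` are hypothetical names introduced only for this illustration. It counts how often a signed relational compare and the unsigned compare of the same bit patterns give different answers.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

enum class Rel { LT, LE, GT, GE };

// Evaluates one relational predicate on two values of type T.
template <typename T> bool eval(Rel r, T a, T b) {
  switch (r) {
  case Rel::LT: return a < b;
  case Rel::LE: return a <= b;
  case Rel::GT: return a > b;
  case Rel::GE: return a >= b;
  }
  return false;
}

// Counts (predicate, a, b) triples over [lo, hi] x [lo, hi] where the signed
// 8-bit comparison and the unsigned comparison of the same bit patterns differ.
int countDisagreements(int lo, int hi) {
  assert(-128 <= lo && lo <= hi && hi <= 127);
  int count = 0;
  for (Rel r : {Rel::LT, Rel::LE, Rel::GT, Rel::GE})
    for (int a = lo; a <= hi; ++a)
      for (int b = lo; b <= hi; ++b) {
        const uint8_t ua = static_cast<uint8_t>(a); // same bit pattern
        const uint8_t ub = static_cast<uint8_t>(b);
        if (eval<int>(r, a, b) != eval<unsigned>(r, ua, ub))
          ++count;
      }
  return count;
}
```

Restricting both operands to `[0, 127]` yields no disagreements, while the full signed range does, which matches the `isKnownNonNegative` precondition in the quoted code.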

Some time ago I attempted to do this in CVP via constant range reasoning:
D90924 (ignore the SCEV part)
Would that be worthwhile? Should I pick that back up?

reames added inline comments. Oct 15 2021, 9:03 AM

llvm/test/Transforms/IndVarSimplify/finite-exit-comparisons.ll
367	See the comment in code about why this can't be done in simplifyAndExtend. Short version: we never visit the icmp due to the presence of the zext, and if we do, we get overall worse results due to poisoning of SCEV caches. I tried that approach first, and gave up after getting tangled up in SCEV dance around partially constructed SCEVs and trip count logic.

reames mentioned this in D111896: [indvars] Canonicalize exit conditions to unsigned using range info. Oct 15 2021, 9:29 AM

reames added inline comments. Oct 15 2021, 9:36 AM

llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1424	It's not. :) I need to mutate the condition of the ICmp below, and the pattern matching code is just more complicated than what is here. I tried it; it's ugly. :)
llvm/test/Transforms/IndVarSimplify/finite-exit-comparisons.ll
367	I'd always planned to handle this in a follow up, but since you asked: D111896. (We still have to work around the fact simplifyAndExtend never visits the icmp and avoid poisoning SCEV with sub-optimal results when we do.)

In D111836#3066152, @lebedev.ri wrote:

Some time ago I attempted to do this in CVP via constant range reasoning:
D90924 (ignore the SCEV part)
Would that be worthwhile? Should I pick that back up?

Yes, I think a CVP based form would be useful. We have the range information, we might as well use it.

If you want to implement the CVP part, I'd be happy to review. Alternatively, I can implement, and you can review. Your preference.

In D111836#3067101, @reames wrote:

In D111836#3066152, @lebedev.ri wrote:

Some time ago I attempted to do this in CVP via constant range reasoning:
D90924 (ignore the SCEV part)
Would that be worthwhile? Should I pick that back up?

Yes, I think a CVP based form would be useful. We have the range information, we might as well use it.

If you want to implement the CVP part, I'd be happy to review. Alternatively, I can implement, and you can review. Your preference.

(CVP isn't the hard part; actually modelling it in ConstantRange is.)
I guess I just need to rebase the patch then.

In D111836#3067107, @lebedev.ri wrote:

In D111836#3067101, @reames wrote:

In D111836#3066152, @lebedev.ri wrote:

Some time ago I attempted to do this in CVP via constant range reasoning:
D90924 (ignore the SCEV part)
Would that be worthwhile? Should I pick that back up?

Yes, I think a CVP based form would be useful. We have the range information, we might as well use it.

If you want to implement the CVP part, I'd be happy to review. Alternatively, I can implement, and you can review. Your preference.

(CVP isn't the hard part; actually modelling it in ConstantRange is.)
I guess I just need to rebase the patch then.

Why? The "both are positive" check should be pretty trivial given the two constant ranges for the operands? Are you looking to go much beyond that? If so, maybe split the changes?
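To make the "both are positive" check concrete, here is a rough sketch over two operand ranges. It assumes a simplified closed, non-wrapping unsigned interval; `URange`, `isKnownNonNegative`, and `canUseUnsignedPredicate` are hypothetical names for this illustration and are not LLVM's `ConstantRange` API, which can also represent wrapped ranges.

```cpp
#include <cassert>
#include <cstdint>

// Closed, non-wrapping unsigned interval [lo, hi]. This is a simplification:
// LLVM's ConstantRange can also represent wrapped ranges.
struct URange {
  uint64_t lo;
  uint64_t hi;
};

// True when every value in r has a clear sign bit at the given bit width.
bool isKnownNonNegative(URange r, unsigned bits) {
  assert(bits >= 1 && bits <= 64 && r.lo <= r.hi);
  const uint64_t signedMax = (uint64_t{1} << (bits - 1)) - 1;
  return r.hi <= signedMax;
}

// A signed relational compare can be rewritten to its unsigned form when
// neither operand can have its sign bit set.
bool canUseUnsignedPredicate(URange lhs, URange rhs, unsigned bits) {
  return isKnownNonNegative(lhs, bits) && isKnownNonNegative(rhs, bits);
}
```

For example, a zero-extended i8 compared against the constant 254 at i16 corresponds to `canUseUnsignedPredicate({0, 255}, {254, 254}, 16)`, which this model accepts; an unconstrained i16 right-hand side, `{0, 65535}`, is rejected.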

mkazantsev added inline comments. Oct 17 2021, 10:59 PM

llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1442	`isSigned && isRelational`?

Address review comments

Harbormaster completed remote builds in B129404: Diff 380485. Oct 18 2021, 12:27 PM

mkazantsev added inline comments. Oct 19 2021, 8:19 AM

llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1873	I have a deja vu. :) We haven't folded any block here, right?

Rebase over landed changes.

reames added inline comments. Oct 19 2021, 12:44 PM

llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1873	Unfortunately, I don't think so. The problem is that this change could actually be changing the trip count. If we'd otherwise figured out this exit was never taken (and thus the loop was full-on UB), this converts the loop into a finite one, changing the trip count in the process. This is legal (since the original cached value implied UB), but I don't think we can safely leave SCEV in a stale state. We might be able to argue that we never cache the UB fact, but that makes me nervous since we've been generally teaching SCEV to exploit exactly that type of fact, and even if we don't today, we might reasonably do so in the near future. While writing this, I did have an idea on how to do cheaper invalidation here, but it requires the use list mechanism you were proposing for SCEV, and lazily invalidating trip counts. If you don't mind, I'll defer that to later.

Minor invalidation improvement. Only use the heavy weight hammer in the case which actually needs it.

Harbormaster completed remote builds in B129599: Diff 380763. Oct 19 2021, 1:40 PM

reames added a child revision: D112262: [indvars] Rotate zext though icmp to reduce loop varying computation. Oct 21 2021, 12:33 PM

LGTM, but I still think that "folded" is not a proper term here.

llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1873	I don't mind that you forget the loop. My comment was about the statement that we've folded something, when we didn't. Maybe I just misunderstand what "folded" meant, but I took it to be about changing some condition to true/false, which is not the case here.

This revision is now accepted and ready to land. Oct 22 2021, 12:27 AM

This revision was landed with ongoing or failed builds. Oct 22 2021, 10:32 AM

Closed by commit rG412eb07edd4e: [indvars] Use fact loop must exit to canonicalize to unsigned conditions (authored by reames).

This revision was automatically updated to reflect the committed changes.

reames added a commit: rG412eb07edd4e: [indvars] Use fact loop must exit to canonicalize to unsigned conditions.

reames added inline comments. Oct 22 2021, 10:40 AM

llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1873	Oh! That's completely not what I thought you meant. I actually agree. The phrasing "folded exit block" is odd. However, it's verbatim the comment used elsewhere in the file for the same case, and I figured consistency was better than perfection. It's also worth noting this is majorly conservative if the outer loop doesn't actually share an exit block. We really should be walking out only as far as actually needed. But since we've got one major rewrite of SCEV invalidation in flight, I didn't want to start another at the same time.

This patch causes an infinite loop in a compiled program:
https://llvm.org/PR52276

Here's an IR reduction of that example that shows what appears to be a miscompile with "opt -indvars" :

define i32 @PR52276(i32 %a, i32* %p) {
entry:
  %cmp = icmp sgt i32 %a, 0
  %conv = zext i1 %cmp to i32
  %neg = xor i32 %a, -1
  %cmp1 = icmp slt i32 %conv, %neg ; !!! This compare changes to ult
  br i1 %cmp1, label %ph, label %exit

ph:
  %b = load i32, i32* %p, align 4
  br label %loop

loop:
  %inc1 = phi i32 [ %b, %ph ], [ %inc, %loop ]
  %inc = add nsw i32 %inc1, 1
  br i1 %cmp1, label %loop, label %crit_edge, !llvm.loop !0

crit_edge:
  %inc2 = phi i32 [ %inc, %loop ]
  store i32 %inc2, i32* %p, align 4
  br label %exit

exit:
  ret i32 0
}

!0 = distinct !{!0, !1}
!1 = !{!"llvm.loop.mustprogress"}
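The effect of flipping the marked compare can be seen by evaluating it for a few values of `%a`. The sketch below models only `%cmp1` from the reduced IR in plain C++; `GuardResult`, `evalGuard`, and `checkExamples` are hypothetical names, and the model says nothing about the rest of the pass. In the IR, `%cmp1` feeds both the entry branch and the loop back-edge and never changes inside the loop, so an input where the unsigned form is true but the signed form is false would enter a loop that the original program skips, which is consistent with the reported hang.

```cpp
#include <cassert>
#include <cstdint>

struct GuardResult {
  bool slt; // result with the original signed predicate
  bool ult; // result after the predicate is flipped to unsigned
};

// Evaluates %cmp1 from the reduced IR for a given %a under both predicates.
GuardResult evalGuard(int32_t a) {
  const uint32_t conv = (a > 0) ? 1u : 0u; // zext i1 (icmp sgt i32 %a, 0)
  const int32_t negSigned = ~a;            // xor i32 %a, -1
  const uint32_t negUnsigned = static_cast<uint32_t>(negSigned);
  return {static_cast<int32_t>(conv) < negSigned, conv < negUnsigned};
}

// A few sample inputs; each expectation follows from the arithmetic above.
void checkExamples() {
  // %a = 5: %conv = 1, %neg = -6. Signed: 1 < -6 is false.
  // Unsigned: 1 < 0xFFFFFFFA is true, so the two predicates disagree.
  assert(!evalGuard(5).slt && evalGuard(5).ult);
  // %a = -5: %conv = 0, %neg = 4. Both predicates are true.
  assert(evalGuard(-5).slt && evalGuard(-5).ult);
  // %a = -1: %conv = 0, %neg = 0. Both predicates are false.
  assert(!evalGuard(-1).slt && !evalGuard(-1).ult);
}
```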

In D111836#3084204, @spatel wrote:

This patch causes an infinite loop in a compiled program:
https://llvm.org/PR52276

Confirmed. The problem is a missing one-use check. The reasoning on the exit condition is valid, but the application of that to a use outside the loop is not. I'm going to revert and then reapply with the fix to make the history a bit easier to follow.

In D111836#3084709, @reames wrote:

In D111836#3084204, @spatel wrote:

This patch causes an infinite loop in a compiled program:
https://llvm.org/PR52276

Confirmed. The problem is a missing one-use check. The reasoning on the exit condition is valid, but the application of that to a use outside the loop is not. I'm going to revert and then reapply with the fix to make the history a bit easier to follow.

Should be fixed in f82cf618.

lebedev.ri mentioned this in D90924: [ConstantRange] Sign-flipping of signedness-invariant comparisons. Oct 30 2021, 6:07 AM

lebedev.ri mentioned this in D112895: [CVP] Canonicalize signed relational comparisons of scalar integers to unsigned comparison predicates. Oct 31 2021, 1:27 PM

lebedev.ri mentioned this in rGb554e41e2d15: [CVP] Canonicalize signed relational comparisons of scalar integers to unsigned…. Nov 1 2021, 2:16 AM

FYI, the core must-exit logic from this patch was reverted in d4708fa4. See my response to my own revert on llvm-commits for explanation of why, and why I don't plan to reintroduce a fixed version.

Revision Contents

Path    Size
llvm/include/llvm/Analysis/ScalarEvolution.h    20 lines
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp    45 lines
llvm/test/Transforms/IndVarSimplify/finite-exit-comparisons.ll    8 lines

Diff 381596

llvm/include/llvm/Analysis/ScalarEvolution.h

[... 1,156 lines not shown; context: public: ...]
   /// Update no-wrap flags of an AddRec. This may drop the cached info about
   /// this AddRec (such as range info) in case if new flags may potentially
   /// sharpen it.
   void setNoWrapFlags(SCEVAddRecExpr *AddRec, SCEV::NoWrapFlags Flags);

   /// Try to apply information from loop guards for \p L to \p Expr.
   const SCEV *applyLoopGuards(const SCEV *Expr, const Loop *L);

+  /// Return true if the loop has no abnormal exits. That is, if the loop
+  /// is not infinite, it must exit through an explicit edge in the CFG.
+  /// (As opposed to either a) throwing out of the function or b) entering a
+  /// well defined infinite loop in some callee.)
+  bool loopHasNoAbnormalExits(const Loop *L) {
+    return getLoopProperties(L).HasNoAbnormalExits;
+  }
+
+  /// Return true if this loop is finite by assumption. That is,
+  /// to be infinite, it must also be undefined.
+  bool loopIsFiniteByAssumption(const Loop *L);
+
 private:
   /// A CallbackVH to arrange for ScalarEvolution to be notified whenever a
   /// Value is deleted.
   class SCEVCallbackVH final : public CallbackVH {
     ScalarEvolution *SE;

     void deleted() override;
     void allUsesReplacedWith(Value *New) override;
[... 304 lines not shown; context: private: ...]

   /// Return a \c LoopProperties instance for \p L, creating one if necessary.
   LoopProperties getLoopProperties(const Loop *L);

   bool loopHasNoSideEffects(const Loop *L) {
     return getLoopProperties(L).HasNoSideEffects;
   }

-  bool loopHasNoAbnormalExits(const Loop *L) {
-    return getLoopProperties(L).HasNoAbnormalExits;
-  }
-
-  /// Return true if this loop is finite by assumption. That is,
-  /// to be infinite, it must also be undefined.
-  bool loopIsFiniteByAssumption(const Loop *L);
-
   /// Compute a LoopDisposition value.
   LoopDisposition computeLoopDisposition(const SCEV *S, const Loop *L);

   /// Memoized computeBlockDisposition results.
   DenseMap<
       const SCEV *,
       SmallVector<PointerIntPair<const BasicBlock *, 2, BlockDisposition>, 2>>
       BlockDispositions;
[... 702 lines not shown ...]

llvm/lib/Transforms/Scalar/IndVarSimplify.cpp

[... 1,415 lines not shown; context: bool IndVarSimplify::canonicalizeExitCondition(Loop *L) { ...]
   // never reaches the icmp since the zext doesn't fold to an AddRec unless
   // it already has flags. The alternative to this would be to extending the
   // set of "interesting" IV users to include the icmp, but doing that
   // regresses results in practice by querying SCEVs before trip counts which
   // rely on them which results in SCEV caching sub-optimal answers. The
   // concern about caching sub-optimal results is why we only query SCEVs of
   // the loop invariant RHS here.
   SmallVector<BasicBlock*, 16> ExitingBlocks;
   L->getExitingBlocks(ExitingBlocks);
   bool Changed = false;
   for (auto *ExitingBB : ExitingBlocks) {
     auto *BI = dyn_cast<BranchInst>(ExitingBB->getTerminator());
     if (!BI)
       continue;
     assert(BI->isConditional() && "exit branch must be conditional");

     auto *ICmp = dyn_cast<ICmpInst>(BI->getCondition());
     if (!ICmp)
       continue;

     auto *LHS = ICmp->getOperand(0);
     auto *RHS = ICmp->getOperand(1);
-    // Avoid computing SCEVs in the loop to avoid poisoning cache with
-    // sub-optimal results.
+    // For the range reasoning, avoid computing SCEVs in the loop to avoid
+    // poisoning cache with sub-optimal results. For the must-execute case,
+    // this is a neccessary precondition for correctness.
     if (!L->isLoopInvariant(RHS))
       continue;

     // Match (icmp signed-cond zext, RHS)
     Value *LHSOp = nullptr;
     if (!match(LHS, m_ZExt(m_Value(LHSOp))) || !ICmp->isSigned())
       continue;

     const DataLayout &DL = ExitingBB->getModule()->getDataLayout();
     const unsigned InnerBitWidth = DL.getTypeSizeInBits(LHSOp->getType());
     const unsigned OuterBitWidth = DL.getTypeSizeInBits(RHS->getType());
     auto FullCR = ConstantRange::getFull(InnerBitWidth);
     FullCR = FullCR.zeroExtend(OuterBitWidth);
-    if (!FullCR.contains(SE->getUnsignedRange(SE->getSCEV(RHS))))
+    if (FullCR.contains(SE->getUnsignedRange(SE->getSCEV(RHS)))) {
+      // We have now matched icmp signed-cond zext(X), zext(Y'), and can thus
+      // replace the signed condition with the unsigned version.
+      ICmp->setPredicate(ICmp->getUnsignedPredicate());
+      Changed = true;
+      // Note: No SCEV invalidation needed. We've changed the predicate, but
+      // have not changed exit counts, or the values produced by the compare.
       continue;
+    }
+
+    // If we have a loop which would be undefined if infinite, and it has at
+    // most one possible dynamic exit, then we can conclude that exit must
+    // be taken. If that exit must be taken, and we know the LHS can only
+    // take values in the positive domain, then we can conclude RHS must
+    // also be in that same range, and replace a signed compare with an
+    // unsigned one.
+    // If the exit might not be taken in a well defined program.
+    if (ExitingBlocks.size() == 1 && SE->loopHasNoAbnormalExits(L) &&
+        SE->loopIsFiniteByAssumption(L)) {
       // We have now matched icmp signed-cond zext(X), zext(Y'), and can thus
       // replace the signed condition with the unsigned version.
       ICmp->setPredicate(ICmp->getUnsignedPredicate());
       Changed = true;
+
+      // Given we've changed exit counts, notify SCEV.
+      // Some nested loops may share same folded exit basic block,
+      // thus we need to notify top most loop.
+      SE->forgetTopmostLoop(L);
+      continue;
+    }
   }
   return Changed;
 }

 bool IndVarSimplify::optimizeLoopExits(Loop *L, SCEVExpander &Rewriter) {
   SmallVector<BasicBlock*, 16> ExitingBlocks;
   L->getExitingBlocks(ExitingBlocks);

[... 369 lines not shown; context: if (int Rewrites = rewriteLoopExitValues(L, LI, TLI, SE, TTI, Rewriter, DT, ...]
       NumReplaced += Rewrites;
       Changed = true;
     }
   }

   // Eliminate redundant IV cycles.
   NumElimIV += Rewriter.replaceCongruentIVs(L, DT, DeadInsts);

-  if (canonicalizeExitCondition(L))
-    // We've changed the predicate, but have not changed exit counts, or the
-    // values which can flow through any SCEV. i.e, no invalidation needed.
-    Changed = true;
+  // Try to convert exit conditions to unsigned
+  // Note: Handles invalidation internally if needed.
+  Changed |= canonicalizeExitCondition(L);

   // Try to eliminate loop exits based on analyzeable exit counts
   if (optimizeLoopExits(L, Rewriter)) {
     Changed = true;
     // Given we've changed exit counts, notify SCEV
     // Some nested loops may share same folded exit basic block,
     // thus we need to notify top most loop.
     SE->forgetTopmostLoop(L);
   }

[... 194 lines not shown ...]

llvm/test/Transforms/IndVarSimplify/finite-exit-comparisons.ll

[... 99 lines not shown ...]
 define void @slt_non_constant_rhs(i16 %n) mustprogress {
 ; CHECK-LABEL: @slt_non_constant_rhs(
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
 ; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]
 ; CHECK-NEXT: [[IV_NEXT]] = add i8 [[IV]], 1
 ; CHECK-NEXT: [[ZEXT:%.*]] = zext i8 [[IV_NEXT]] to i16
-; CHECK-NEXT: [[CMP:%.*]] = icmp slt i16 [[ZEXT]], [[N:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp ult i16 [[ZEXT]], [[N:%.*]]
 ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK: for.end:
 ; CHECK-NEXT: ret void
 ;
 entry:
   br label %for.body

 for.body: ; preds = %entry, %for.body
[... 242 lines not shown ...]
 define void @sgt_constant_rhs(i16 %n.raw, i8 %start) mustprogress {
 ; CHECK-LABEL: @sgt_constant_rhs(
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
 ; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ [[START:%.*]], [[ENTRY:%.*]] ]
 ; CHECK-NEXT: [[IV_NEXT]] = add i8 [[IV]], 1
 ; CHECK-NEXT: [[ZEXT:%.*]] = zext i8 [[IV_NEXT]] to i16
 ; CHECK-NEXT: [[CMP:%.*]] = icmp ugt i16 [[ZEXT]], 254
 ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK: for.end:
 ; CHECK-NEXT: ret void
 ;
 entry:
   br label %for.body

 for.body: ; preds = %entry, %for.body
[... 10 lines not shown ...]
 define void @sgt_non_constant_rhs(i16 %n) mustprogress {
 ; CHECK-LABEL: @sgt_non_constant_rhs(
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
 ; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]
 ; CHECK-NEXT: [[IV_NEXT]] = add i8 [[IV]], 1
 ; CHECK-NEXT: [[ZEXT:%.*]] = zext i8 [[IV_NEXT]] to i16
-; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i16 [[ZEXT]], [[N:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp ugt i16 [[ZEXT]], [[N:%.*]]
 ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK: for.end:
 ; CHECK-NEXT: ret void
 ;
 entry:
   br label %for.body

 for.body: ; preds = %entry, %for.body
[... 37 lines not shown ...]
 define void @sle_non_constant_rhs(i16 %n) mustprogress {
 ; CHECK-LABEL: @sle_non_constant_rhs(
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
 ; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]
 ; CHECK-NEXT: [[IV_NEXT]] = add i8 [[IV]], 1
 ; CHECK-NEXT: [[ZEXT:%.*]] = zext i8 [[IV_NEXT]] to i16
-; CHECK-NEXT: [[CMP:%.*]] = icmp sle i16 [[ZEXT]], [[N:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp ule i16 [[ZEXT]], [[N:%.*]]
 ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK: for.end:
 ; CHECK-NEXT: ret void
 ;
 entry:
   br label %for.body

 for.body: ; preds = %entry, %for.body
[... 37 lines not shown ...]
 define void @sge_non_constant_rhs(i16 %n) mustprogress {
 ; CHECK-LABEL: @sge_non_constant_rhs(
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
 ; CHECK-NEXT: [[IV:%.*]] = phi i8 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]
 ; CHECK-NEXT: [[IV_NEXT]] = add i8 [[IV]], 1
 ; CHECK-NEXT: [[ZEXT:%.*]] = zext i8 [[IV_NEXT]] to i16
-; CHECK-NEXT: [[CMP:%.*]] = icmp sge i16 [[ZEXT]], [[N:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp uge i16 [[ZEXT]], [[N:%.*]]
 ; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK: for.end:
 ; CHECK-NEXT: ret void
 ;
 entry:
   br label %for.body

 for.body: ; preds = %entry, %for.body
[... 225 lines not shown ...]
