This is an archive of the discontinued LLVM Phabricator instance.

[indvars] Canonicalize exit conditions to unsigned using range info
ClosedPublic

Authored by reames on Oct 15 2021, 9:29 AM.

Details

Summary

This is a companion patch to D111836. They share a bunch of common code, and whichever one lands second will be rebased over the other.

This patch duplicates a bit of logic we apply to comparisons encountered during the IV users walk to conditions which feed exit conditions. Why? simplifyAndExtend has a very limited list of users it walks. In particular, in the examples is stops at the zext and never visits the icmp. (Because we can't fold the zext to an addrec yet in SCEV.) Being willing to visit when we haven't simplified regresses multiple tests (seemingly because of less optimal results when computing trip counts).

Note that this can be trivially extended to multiple exiting blocks. I'm leaving that to a future patch (solely to cut down on the number of versions of the same code in review at once.)

Diff Detail

Event Timeline

reames created this revision.Oct 15 2021, 9:29 AM
reames requested review of this revision.Oct 15 2021, 9:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 15 2021, 9:29 AM
nikic added a subscriber: nikic.Oct 15 2021, 10:58 AM

My main question here would be why we need this in IndVars. This doesn't really seem to be related to IV or loop based reasoning, but is a generic range based transform. Even InstCombine can handle this (based on known bits): https://godbolt.org/z/drGfM4K9o And as Roman mentioned on the other review, CVP could do this based on ranges as well.

Now, if this case simply fell out of existing code (i.e. if the "visiting more users" approach worked) I'd understand, but it's not super clear to me why we need to go out of the way to handle this here. Is this addressing some kind of phase ordering issue?

My main question here would be why we need this in IndVars. This doesn't really seem to be related to IV or loop based reasoning, but is a generic range based transform. Even InstCombine can handle this (based on known bits): https://godbolt.org/z/drGfM4K9o And as Roman mentioned on the other review, CVP could do this based on ranges as well.

Now, if this case simply fell out of existing code (i.e. if the "visiting more users" approach worked) I'd understand, but it's not super clear to me why we need to go out of the way to handle this here. Is this addressing some kind of phase ordering issue?

Well, the motivation for doing this here is that the companion patch *is* inherently loop based. It feels silly to leave cases on the floor just because we might get them elsewhere when we already have a need for a loop based transform in exactly this place.

mkazantsev added inline comments.Oct 18 2021, 4:20 AM
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1428

Loop exiting branch cannot be unconditional.

1445

isRelational && isSigned ?

1848

Do we really need this? We don't fold any blocks here, just replacing signed predicate with unsigned. How can this be a problem for caches?

reames updated this revision to Diff 380472.Oct 18 2021, 10:48 AM

address review comments

reames added inline comments.Oct 18 2021, 10:50 AM
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1848

We don't, though for somewhat subtle reasons. In theory, switching the predicate could let us compute a trip count, but since we haven't *changed* the trip count of the loop, at worst we've left SCEV in an imprecise state.

I removed the invalidation for now, we may want to add it back if that imprecision turns out to be a problem later.

mkazantsev accepted this revision.Oct 18 2021, 9:48 PM

Looks fine now, thanks! Suggestion for follow-up in comments.

llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1441

Can we end up with non-canonicalized comparison here? I.e. LHS is invariant and RHS is not. If so, we can swap them. OK if done in follow-up.

This revision is now accepted and ready to land.Oct 18 2021, 9:48 PM

I went ahead and pushed a follow on to this without review. 0836a105 extends this code to handle multiple exiting blocks. The change is trivial; as noted in the review description here, it was left out only to make keeping a couple of reviews in sync reasonable.

llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
1441

I think we can. To be honest, I'd rather handle that with a general canonicalization towards invariant on rhs, but that'll require a bit more thought. I won't get to this immediately, but I wrote it down to come back to in a few days.

Now, if this case simply fell out of existing code (i.e. if the "visiting more users" approach worked) I'd understand, but it's not super clear to me why we need to go out of the way to handle this here. Is this addressing some kind of phase ordering issue?

I went back and took another look at the simplifyAndExtend option. I think I managed to get that working; we'll see if the compile time is reasonable. See https://reviews.llvm.org/D112170.

Assuming that works, we'll have some code to cleanup in Scalar/IndVarSimplify.cpp. Some of this transform will disappear, but so will a good amount of the existing code in optimizeLoopExits.