This is an archive of the discontinued LLVM Phabricator instance.

[indvars] Visit all IV users in simplifyAndExtend
AbandonedPublic

Authored by reames on Oct 20 2021, 12:32 PM.

Details

Summary

The basic idea of this patch is simple - we remove the restriction on only visiting users of addrecs from simplifyAndExtend, and instead allow users of any scevable instruction derived from an IV. Doing this exposes a couple of subtleties (explained below).

As seen in the test diffs, this does help eliminate some code which wasn't previously handled, but the primary value is making some of the code run in Scalar/IndVarSimplify.cpp after simplifyAndEtend effectively dead. (To be removed in separate patches with their own review.)

The first subtlety here is that additional simplification can actually lead to inferior results - due to missing invalidation. The case that happens is we fold an icmp in simplifyAndExtend, but leave the SCEV tripcount for that loop as an could-not-compute. Following code, specifically predicateLoops, then can't run. Without this change, we'd instead hit optimizeLoopExits which would invalidate, and thus loop predication would run.

To address this, we must invalidate when we simplify a loop exit condition. We could just always invalidate, but that could require O(LoopSize) rebuilds of SCEV for the loop. From prior work on strengthenNoWrapFlags, we know that is sometimes prohibitive. This patch only invalidates on loop exit condition changes, so the worse case should be O(LoopExits) rebuilds. In theory, you could have an adversarial loop when number exits ~= loop size, but in practice, the number of loop exits should be much smaller.

@nikic Could you confirm this doesn't have an excessive compile time?

The second subtly shows up in the loop-unroll test change. In this case, unrolling produces *worse* codegen for an up to date SCEV trip count. We have a dead exit, which was proven untaken and folded. With stale info, unroll recognizes that the cached trip count (10) is dead, and folds the branch. With up to date info (could-not-compute, e.g. never taken), unroll does not look at the actual CFG and leaves the branch unfolded. This is in my view an uninteresting quirk of the unroll implementation, and not something worth fixing.

Diff Detail

Event Timeline

reames created this revision.Oct 20 2021, 12:32 PM
reames requested review of this revision.Oct 20 2021, 12:32 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 20 2021, 12:32 PM
reames planned changes to this revision.Oct 21 2021, 9:51 AM

The impact is pretty substantial.

Crud. That's definitely too high for the minor code reduction benefit.

I've got one more idea on how to do coarser invalidation once after simplifyAndExtend, let me iterate on this and get a new version up. If that doesn't work, we'll have to simply drop the idea entirely.

reames abandoned this revision.Feb 12 2022, 8:17 AM