This is an archive of the discontinued LLVM Phabricator instance.

[LoopPredication] Optimize two exits case.
AbandonedPublic

Authored by ebrevnov on Jan 30 2020, 5:29 AM.

Details

Summary

Current implementation of loop exits optimization in LoopPredication uses minimum over all analyzable exits. For two exits case we don't actually need to take a minimum and can compare exit counts directly.

Legality: Currently for two exits case we generate if (P1) deopt, where P1:=(ec1 <= min(ec1, ec2)). Since "min(ec1, ec2) <= ec2" always holds thus if(P2) deopt, where P2:=(ec1 <= ec2) always holds if original condition holds. In other words transformed condition will always deopt if original deopted.

Profitability: If ec1<=ec2 then both P1 and P2 are true. If ec1 > ec2, then both P1 and P2 are false. That means that these two forms are identical.

Event Timeline

ebrevnov created this revision.Jan 30 2020, 5:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 30 2020, 5:29 AM
ebrevnov updated this revision to Diff 241969.Feb 2 2020, 10:04 PM

Another update

ebrevnov updated this revision to Diff 241970.Feb 2 2020, 10:08 PM

Yet another update

Harbormaster completed remote builds in B45548: Diff 241970.
ebrevnov retitled this revision from [WIP][LoopPredication] Optimize two exits case. to [LoopPredication] Optimize two exits case..Feb 2 2020, 10:37 PM
ebrevnov edited the summary of this revision. (Show Details)
reames requested changes to this revision.Feb 25 2020, 10:04 AM

If I'm reading the description and code correctly, you're basically trying to avoid generating the construct "a == umin(a, b)" right? If so, what's the problem with generating that? I would expect other transform passes (such as instcombine) to very happily rip that apart into the component pieces. In fact, I see in instcombine the transform "foldICmpWithMinMax" which appears to do exactly that. Why isn't that sufficient?

In fact, I took the predicate-exits.ll test file, ran it through loop-pred and then instcombine, and I see exactly the simplification expected. This seems to result in the same form as your change, so why do we need the complexity here?

This revision now requires changes to proceed.Feb 25 2020, 10:04 AM
ebrevnov abandoned this revision.Sep 30 2020, 9:29 PM

It turned out that in the targeted case there is the following expression min(a,b) = min(min(a,c), b) where rhs min is reassociated. Due to this foldICmpWithMinMax can't handle it. I've uploaded a series of 3 patched for NaryReassociate (https://reviews.llvm.org/D88285, https://reviews.llvm.org/D88286, https://reviews.llvm.org/D88287) which converts that to min(a,b) = min(min(a,b), c). With that in place foldICmpWithMinMax does its job.

Abandoned.