This is an archive of the discontinued LLVM Phabricator instance.

[LoopPred/WC] Use a dominating widenable condition to remove analyze loop exits
ClosedPublic

Authored by reames on Nov 4 2019, 3:34 PM.

Details

Summary

This implements a version of the predicateLoopExits transform from IndVarSimplify extended to exploit widenable conditions - and thus be much wider in scope of legality. The code structure ends up being almost entirely different, so I chose to duplicate this into the LoopPredication pass instead of trying to reuse the code in the IndVars.

The core notions of the transform are as follows:

  1. If we have a widenable condition which controls entry into the loop, we're allowed to widen it arbitrarily. Given that, it's simply a *profitability* question as to what conditions to fold into the widenable branch.
  2. To avoid pass ordering issues, we want to avoid widening cases that would otherwise be dischargeable. Or, widen in a form which can still be discharged. Thus, we phrase the transform as selecting one analyzeable exit from the set of analyzeable exits to keep. This avoids creating pass ordering complexities.
  3. Since none of the above proves that we actually exit through our analyzeable exits - we might exit through something else entirely - we limit ourselves to cases where a) the latch is analyzeable and b) the latch is predicted taken, and c) the exit being removed is statically cold.

Diff Detail

Event Timeline

reames created this revision.Nov 4 2019, 3:34 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 4 2019, 3:34 PM
reames updated this revision to Diff 227794.Nov 4 2019, 3:39 PM

Make the code a bit more explicit about checking property 3 in the review description.

simoll added a subscriber: simoll.Nov 5 2019, 12:25 AM

In general the code looks fine to me. What is not clear how it is intended to work on real case scenarios. On the one hand we rely on WC having explicit form in the IR (ExitBB->getTerminatingDeoptimizeCall()) on the other hand we skip optimizing exits over WC since they are not analyzable. Are you going to introduce that support in the next patch or I'm missing something here?

llvm/lib/Transforms/Scalar/LoopPredication.cpp
1025

is closer is->is closer in

1083

Ill formatted comment...not trivial to understand either...I would suggest to remove it....or improve...

llvm/test/Transforms/LoopPredication/predicate-exits.ll
685

I believe you meant this is something to handle in future, right?

fedor.sergeev added inline comments.Nov 12 2019, 5:45 AM
llvm/lib/Transforms/Scalar/LoopPredication.cpp
965

typo - WideNable

1127–1128

I have not been following this 'freeze' story, but freeze instruction is already in:

commit 58acbce3def63a207b8f5a69318a99666a4aac53
Author: aqjune <aqjune@gmail.com>
Date:   Tue Nov 5 15:53:22 2019 +0900
    [IR] Add Freeze instruction

Perhaps you want to rephrase this TODO ...

1200–1206

IMO it would look more natural would you remove this if() check, leaving two guard loops on the same level as predicateLoopExits call.
There should be no extra efficiency gained from this (superflous) check.

BTW, please run clang-format...I noticed some trailing spaces and miss-formatting

ebrevnov added inline comments.Nov 13 2019, 4:29 AM
llvm/lib/Transforms/Scalar/LoopPredication.cpp
1108

Would it be better to check profitability in a more general way using BPI or similar?

fedor.sergeev added inline comments.Nov 13 2019, 5:44 AM
llvm/lib/Transforms/Scalar/LoopPredication.cpp
1108

Since widenable branch jumping to deoptimize is the main intended usecase for widenable branch itself
(we even show it in LangRef as a prime example of widenable.condition usage)
it seems to be a very good special case to start with.

BPI is tricky to use (and tricky to preserve), so it can not be the main source of truth for profitability purposes.

ebrevnov added inline comments.Nov 13 2019, 9:34 AM
llvm/lib/Transforms/Scalar/LoopPredication.cpp
1108

First I don't see anything tricky in using BPI. It is widely used across optimizer and if BPI is "incorrect" that may affect lot of places. Moreover LoopPredication has explicit dependence on BPI and won't work as expected with "broken" BPI anway. Also note that current implementation doesn't handle widenable branches because such branches are not analyzable.

fedor.sergeev added inline comments.Nov 13 2019, 1:39 PM
llvm/lib/Transforms/Scalar/LoopPredication.cpp
1108

LoopPredication has explicit dependence on BPI and won't work as expected with "broken" BPI anway.

I'm not talking about "broken BPI".
In new PassManager BPI is not always preserved through the loop passes, so it can be just "absent".

And yes, current profitability checks do depend on BPI, yet LoopPredication will just consider loop as unconditionally profitable in absence of BPI.

reames marked 3 inline comments as done.Nov 14 2019, 11:24 AM

Patch refresh coming to incorporate comments.

In general the code looks fine to me. What is not clear how it is intended to work on real case scenarios. On the one hand we rely on WC having explicit form in the IR (ExitBB->getTerminatingDeoptimizeCall()) on the other hand we skip optimizing exits over WC since they are not analyzable. Are you going to introduce that support in the next patch or I'm missing something here?

I'm confused by your confusion.

This patch widens checks in the loop (not involving widenable conditions) into a check before the loop (involving a widenable condition). It does not widen widenable conditions *within* a loop.

We will hoist and unswitch branches within a loop based on widenable conditions. As such, it seems reasonable to believe widenable conditions can be usually found outside of loops.

llvm/lib/Transforms/Scalar/LoopPredication.cpp
1108

I would strongly prefer not to introduce BPI here. Having an deoptimize call is a very strong signal of cold code. BPI doesn't really have a corresponding notion of "almost definitely cold code".

(I'm not opposed to trying to generalize, I just don't want to do it now. There are other parts I consider more important.)

1127–1128

It hadn't at the point I posted the patch. I will add support when I refresh patch.

1200–1206

Hm, you're right. Will do.

reames updated this revision to Diff 229380.Nov 14 2019, 12:33 PM

Address review comments.

This revision is now accepted and ready to land.Nov 18 2019, 9:04 AM
This revision was automatically updated to reflect the committed changes.