Page MenuHomePhabricator

[Loop Peeling] Fix the handling of branch weights of peeled off branches
ClosedPublic

Authored by skatkov on Jul 4 2019, 9:53 PM.

Details

Summary

Current algorithm to update branch weights of latch block and its copies is
based on the assumption that number of peeling iterations is approximately equal
to trip count.

However it is not correct. According to profitability check in one case we can decide to peel
in case it helps to reduce the number of phi nodes. In this case the number of peeled iteration
can be less then estimated trip count.

This patch introduces another way to set the branch weights to peeled of branches.
Let F is a weight of the edge from latch to header.
Let E is a weight of the edge from latch to exit.
F/(F+E) is a probability to go to loop and E/(F+E) is a probability to go to exit.
Then, Estimated TripCount = F / E.
For I-th (counting from 0) peeled off iteration we set the the weights for
the peeled latch as (TC - I, 1). It gives us reasonable distribution,
The probability to go to exit 1/(TC-I) increases. At the same time
the estimated trip count of remaining loop reduces by I.

As a result after peeling off N iteration the weights will be
(F - N * E, E) and trip count of loop becomes
F / E - N or TC - N.

The idea is taken from the review of the patch https://reviews.llvm.org/D63918
proposed by Philip.

Diff Detail

Repository
rL LLVM

Event Timeline

skatkov created this revision.Jul 4 2019, 9:53 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 4 2019, 9:53 PM
skatkov updated this revision to Diff 210275.Jul 17 2019, 2:32 AM
skatkov retitled this revision from [Loop Peeling] Another way to handle branch weights of peeled off branches to [Loop Peeling] Fix the handling of branch weights of peeled off branches.
skatkov edited the summary of this revision. (Show Details)
reames requested changes to this revision.Jul 17 2019, 8:58 AM
reames added inline comments.
llvm/lib/Transforms/Utils/LoopUnrollPeel.cpp
367 ↗(On Diff #210275)

Wording here is confusing. I can infer the scheme from the rest of the comment, but I don't know what "dynamic iteration" means here.

I think you mean that F/(F+E) is the frequency with which we take the backedge, and that E/(F+E) is the frequency with which we exit the loop.

419 ↗(On Diff #210275)

branch_weights are *frequencies* not *# of times executed*.

i.e. 1/2 is the same as 500/1000 and should always produce the same results.

The code appears to do so, but the comments are slightly misleading.

426 ↗(On Diff #210275)

Style: Use early return. The existing code gets this wrong, so I'd encourage an NFC which changes that, and then a rebase. It somewhat matters here as the case where FallThroughWeight is 0 is interesting, and deserves a comment and/or an assert.

llvm/test/Transforms/LoopUnroll/peel-loop-pgo-deopt.ll
76 ↗(On Diff #210275)

If I'm reading this test correctly, the result is not making sense.

We have a loop (for.body) with a side exit branch frequency of 100% (%17) and a latch frequency of ~3 (%16).

If we peel this twice, we should still have a very rarely taken side exit, but the frequencies appear to be saving that's more common in the peeled copy? (The test naming is hard to follow, so I might just be reading the test wrong.)

This revision now requires changes to proceed.Jul 17 2019, 8:58 AM
skatkov planned changes to this revision.Jul 17 2019, 9:46 PM
skatkov marked an inline comment as done.

Hi Philip, Thank you for review. I'll update a patch.

llvm/test/Transforms/LoopUnroll/peel-loop-pgo-deopt.ll
76 ↗(On Diff #210275)

All this weights are about latch. Weight of side-exit remains untouched, so should be the same (1,0).

I'll update the test to make it visible.

skatkov updated this revision to Diff 210518.Jul 18 2019, 3:12 AM
skatkov edited the summary of this revision. (Show Details)

Please, take a look.

reames accepted this revision.Jul 19 2019, 10:40 AM

LGTM. Nice job on clarifying the test.

This revision is now accepted and ready to land.Jul 19 2019, 10:40 AM
This revision was automatically updated to reflect the committed changes.