This is an archive of the discontinued LLVM Phabricator instance.

[LoopPeeling] Better handling of branch weights for small values
AbandonedPublic

Authored by skatkov on Jun 28 2019, 12:05 AM.

Details

Summary

If the original exit weight for latch to exit is small the new
weights for copies of latch will be non-informative like
[0,1] if original exit weight is 1.

This CL tries to scale original exit weight to get reasonable values.

Diff Detail

Event Timeline

skatkov created this revision.Jun 28 2019, 12:05 AM
reames requested changes to this revision.Jun 28 2019, 3:12 PM
reames added inline comments.
lib/Transforms/Utils/LoopUnrollPeel.cpp
392

Are you based on a stale version? I don't have this function in this file on ToT.

402

This comment is concerning. This appears to be assigning meaning to the numeric value of the branch weights. The only meaning to the branch weights is in their *ratio*.

It looks like you're trying to extend a broken scheme, rather than fixing it. Please don't. :)

An alternative here would be to compute a header weight based on 1 + the estimated trip count, then decrement by 1 on each iteration. I think this would both be simpler, more stable, and likely get the same result on all of your existing test cases.

This revision now requires changes to proceed.Jun 28 2019, 3:12 PM
skatkov marked 2 inline comments as done.Jul 1 2019, 1:17 AM
skatkov added inline comments.
lib/Transforms/Utils/LoopUnrollPeel.cpp
392
402

:) It is a tough topic :)

Let me try first to explain how I understand the current scheme works.

There is a latch. Say A is a weight to go to header and B is a weight to go to exit.
So A/(A+B) is a probability to go to the loop header.
The current scheme looks at it in a different angle. If during execution of the code we come to latch A+B times then A times we will go to the header and B times we go to the exit. It is evident that number of time we come to pre-header should be also B.
So it is similar to gathering a profile. So weight is a number of times we take the corresponding edge.

After the peeling we want to set the branch weights on latch and copies of the latch such as if we come to the first copy of the header B times then

  1. Number of times we come to header is B (sum of all weights of edges coming to exit is B)
  2. Number of times we take all copies of loop body is A
  3. Each individual pair of weights for copy of latch is some reasonable distribution so that the probability to go to exit on the next iteration growth and almost 1 on the last iteration.

See comment for updateBranchWeights function where some invariants are described.

I do not say it is best idea but in this patch I tried to avoid modification it but a bit adjust to behaves as it was expected when the values for weights are small.

Now about your proposal.
As I understand you propose to set branch weights for i-th peeling iteration as (TC + 1 - i, 1), am I correct?
I'm fine with that but probably it is better to get feedback from the author of current approach.

According to git history it seems it is Michael Kuperstein, Michael do you see any problems with this modification/simplification?

@mukper, Can you please comment on my last comment?

skatkov marked an inline comment as done.Jul 4 2019, 9:54 PM
skatkov added inline comments.
lib/Transforms/Utils/LoopUnrollPeel.cpp
402

The patch implemented this idea is uploaded as https://reviews.llvm.org/D64235