This is an archive of the discontinued LLVM Phabricator instance.

[LoopUtils] Better accuracy for getLoopEstimatedTripCount.
ClosedPublic

Authored by ebrevnov on Dec 30 2019, 12:44 AM.

Details

Summary

Current implementation of getLoopEstimatedTripCount returns 1 iteration less than it should. The reason is that in bottom tested loop first iteration is executed before first back branch is taken. For example for loop with !{!"branch_weights", i32 1 taken, i32 1 exit} metadata getLoopEstimatedTripCount gives 1 while actual number of iterations is 2.

Diff Detail

Event Timeline

ebrevnov created this revision.Dec 30 2019, 12:44 AM
Ayal added inline comments.Dec 30 2019, 4:27 AM
llvm/lib/Transforms/Utils/LoopUtils.cpp
732

Would be better to separate the functional (+1) change from the variable name changes and divideNearest(). The latter is an NFC which is independent of and hides the former.

If the functional change goes first, it can e.g. simply do

+  uint64_t EstimatedBackedgeTakenCount;
  if (LatchBR->getSuccessor(0) == L->getHeader())
-    return (TrueVal + (FalseVal / 2)) / FalseVal;
+    EstimatedBackedgeTakenCount = (TrueVal + (FalseVal / 2)) / FalseVal;
  else
-    return (FalseVal + (TrueVal / 2)) / TrueVal;
+    EstimatedBackedgeTakenCount = (FalseVal + (TrueVal / 2)) / TrueVal;
+  return EstimatedBackedgeTakenCount + 1;

Can alternatively renamed original function to getEstimatedBackedgeTakenCount(), and introduced a new getLoopEstimatedTripCount() which returns the former plus one, if it's useful(?)

llvm/unittests/Transforms/Utils/CMakeLists.txt
18 ↗(On Diff #235561)

already there?

llvm/unittests/Transforms/Utils/LoopUtilsTest.cpp
1 ↗(On Diff #235561)

"This file was added." - this file already exists, right?
What is this trying to check on top of the above fixed lit tests?

Ayal added a comment.Dec 30 2019, 4:31 AM

Just pointing out for the record that this is extracted from D67905

ebrevnov updated this revision to Diff 235679.Dec 30 2019, 11:43 PM

Factor out NFC stuff + rebase

ebrevnov marked an inline comment as done.Jan 8 2020, 9:03 PM
Ayal added inline comments.Jan 9 2020, 12:49 AM
llvm/lib/Transforms/Utils/LoopUtils.cpp
726–727

If !LatchCycleWeight (if !BackedgeTakenWeight) it should return 1 instead of 0. I.e., should always be equal to "getEstimateBackedgeTakenCount()" plus 1.

If !LatchExitWeight, perhaps better return None instead of 0? This implies the loop was never entered when gathering the profile, or loop was entered once and never exited(?); in either case estimating the trip count to be zero seems wrong.

ebrevnov marked 2 inline comments as done.Jan 9 2020, 2:15 AM
ebrevnov added inline comments.
llvm/lib/Transforms/Utils/LoopUtils.cpp
726–727

Completely agree.

ebrevnov updated this revision to Diff 236990.Jan 9 2020, 2:21 AM
ebrevnov marked an inline comment as done.

Fixed raised issue + Rebase.

Ayal added inline comments.Jan 15 2020, 3:05 PM
llvm/lib/Transforms/Utils/LoopUtils.cpp
731

Maybe clearer to emphasize the +1 effect, noting that Trip count is one plus the estimated backedge taken count, and doing

uint64_t BackedgeTakenCount = llvm::divideNearest(BackedgeTakenWeight, LatchExitWeight);
return BackedgeTakenCount + 1;

(cf. InnerLoopVectorizer::getOrCreateTripCount(Loop *L))

llvm/test/Transforms/LoopUnroll/peel-loop-conditions-pgo-1.ll
14

It is indeed confusing at first sight, to see that the total number of estimated trip count is 2+5, given branch_weights of 6 and 1 (and thus estimated backedge taken count of 6/1). Perhaps worth clarifying in the above comment about "remained iterations". Should -unroll-peel-max-count=7 be updated?

Note that loop peeling already updates the estimated trip count (from 7 to 5 above), by calling Transforms/Utils/LoopUnrollPeel.cpp's fixupBranchWeights(); potential reuse with D67905's setLoopEstimatedTripCount().

llvm/unittests/Transforms/Utils/LoopUtilsTest.cpp
18 ↗(On Diff #236990)

These should all be in lex order.

1 ↗(On Diff #235561)

What is this trying to check on top of the above fixed lit tests?

This change seems mostly unrelated to the +1 accuracy fix, right? Suggest to commit it first, testing the original (backedge taken) value, and then modify it along with the +1 patch, similar to above fixes of lit tests.

ebrevnov marked 5 inline comments as done.Jan 15 2020, 8:06 PM
ebrevnov added inline comments.
llvm/lib/Transforms/Utils/LoopUtils.cpp
731

Sure, works for me. In fact, this +1 approach was used in original patch :-)

llvm/test/Transforms/LoopUnroll/peel-loop-conditions-pgo-1.ll
14

Should -unroll-peel-max-count=7 be updated?

Not sure I understand why it should be updated. Loop has 7 estimated iterations and we aim at peeling 2 iterations. Thus -unroll-peel-max-count=7 seems reasonable to me.

14

It is indeed confusing at first sight, to see that the total number of estimated trip count is 2+5, given branch_weights of 6 and 1 (and thus estimated backedge taken count of 6/1).

Even if I clarify here it won't help a lot since there are bunch of other similar places. Probably it's better to add explanation to getLoopEstimatedTripCount?

llvm/unittests/Transforms/Utils/LoopUtilsTest.cpp
18 ↗(On Diff #236990)

Will fix

1 ↗(On Diff #235561)

What is this trying to check on top of the above fixed lit tests?

I believe you requested to add such test at some time. Anyway it's better to have such test than not :-). Will extract this change to a separate review.

ebrevnov updated this revision to Diff 238454.Jan 16 2020, 4:03 AM

Moved regression test to a separate review.

Ayal added inline comments.Jan 16 2020, 8:36 AM
llvm/lib/Transforms/Utils/LoopUtils.cpp
729–734

"Trip count" >> "Estimated backedge taken count"

734

Suffice to say that Estimated trip count is one plus estimated backedge taken count.

llvm/test/Transforms/LoopUnroll/peel-loop-conditions-pgo-1.ll
14

OK; just noting that original test estimated the trip count to be 6, and used -unroll-peel-max-count=7, so seemed reasonable to bump this threshold (to 8) given that the estimation bumped (to 7).

14

OK; just suggesting to extend the existing "remained iterations" comment above, clarifying how many they are.

llvm/unittests/Transforms/Utils/LoopUtilsTest.cpp
1 ↗(On Diff #235561)

Sorry, my original request was just to make sure the +1 change is covered by tests, as opposed to NFC patches; the above lit tests seem fine.

ebrevnov updated this revision to Diff 238693.Jan 16 2020, 9:33 PM

Fixed comments.

Ayal accepted this revision.Jan 16 2020, 11:43 PM

This look good to me, with a last minor nit, thanks!

llvm/lib/Transforms/Utils/LoopUtils.cpp
730

"the the edge exiting weight" >> "the weight of the edge exiting the loop"

This revision is now accepted and ready to land.Jan 16 2020, 11:43 PM
This revision was automatically updated to reflect the committed changes.