This is an archive of the discontinued LLVM Phabricator instance.

LoopRotate: Add code to update branch weights
ClosedPublic

Authored by MatzeB on Aug 8 2023, 5:23 PM.

Details

Summary

While loop rotation updates the branch_weights correctly for the most part there were some branches missing runtime unroll. This adds code to the loop rotation transformation to ensure that the computed block execution counts for the loop bodies are the same before and after the transformation.

Invariants and heuristics are explained in the comment in the new updateBranchWeights() function.

Diff Detail

Event Timeline

MatzeB created this revision.Aug 8 2023, 5:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 8 2023, 5:23 PM
MatzeB requested review of this revision.Aug 8 2023, 5:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 8 2023, 5:23 PM
MatzeB added inline comments.Aug 8 2023, 5:25 PM
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
274–285

I have an alternative version of this, but wasn't sure how people feel about using non-ASCII characters:

paulkirth added inline comments.Aug 10 2023, 1:20 PM
llvm/include/llvm/IR/ProfDataUtils.h
67–71 ↗(On Diff #548405)

Would you mind putting the changes to ProfDataUtils in a separate patch? I think these are good additions to the library, so I think it's better if they go in their own commit. Since it just renames an internal interface(practically speaking), it can probably be marked NFC/NFCI.

paulkirth added inline comments.Aug 10 2023, 1:44 PM
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
294–309

This could probably be a helper function. It would also allow for expressive names for the Weight[0]/Weight[1] values, which will probably make it easier to follow.

Alternatively, you could group all of these counts into a struct. I'm not convinced that is worth it in this case, but it may make the code simpler to reason about and keep all the logic regarding the heuristic together.

This does come down to personal coding style, so don't consider these suggestions blocking.

MatzeB added inline comments.Aug 14 2023, 4:56 PM
llvm/include/llvm/IR/ProfDataUtils.h
67–71 ↗(On Diff #548405)

Extracted as D157937 now.

MatzeB updated this revision to Diff 550144.Aug 14 2023, 4:56 PM
MatzeB marked an inline comment as not done.
MatzeB added inline comments.
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
294–309

What's the point though? The only other thing here is the construction of two MDNodes. Feels needlessly complicated to introduce a struct and separate function just for that...

MatzeB updated this revision to Diff 550145.Aug 14 2023, 5:01 PM
MatzeB marked an inline comment as not done.
MatzeB added inline comments.
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
294–309

I did introduce two local variables now, that give a name to Weights[0] and Weights[1], hope that is enough to resolve this.

MatzeB updated this revision to Diff 550146.Aug 14 2023, 5:04 PM

I think the heuristics are better than doing nothing in theory. Though one other thing to note is that for AutoFDO, if the profiled build also has the loop rotated, the incoming counts won't be right to begin with -- if the two branches after rotate are executed A times and B times, the original branch should be executed A+B times, but AutoFDO will annotate it with max(A,B) times for the original branch.

You may want to double check that for loop rotate happening in pass1 profiling build, the heuristic also generates reasonable counts comparing to ground truth.

llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
292

did you mean y1 == y - y0 # how many iterations loop body was executed?

293

Yeah, this is what i meant by saying "loop rotate count is unfixable"..

305

Heuristic is somewhat arbitrary anyways but I would still keep ExitCount0 as non-zero, so it's a "live" path to be conservative. Also BFI doesn't like zero edge counts is another reason..

IIRC, mcf in spec has a function sort_basket which is known to be problematic for maintaining loop rotate counts, if you want to play with it a bit, especially the guessing part of the heuristic.

paulkirth added inline comments.Aug 15 2023, 7:26 AM
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
294–309

Sorry, I could have worded that better as a "just a suggestion you can feel free to ignore". I tried to convey that at the end, but clearly missed my mark.

I was fine with the previous version, but those two thoughts came to mind, so I mentioned them. So nothing to resolve :)

What's the point though? The only other thing here is the construction of two MDNodes. Feels needlessly complicated to introduce a struct and separate function just for that...

Fair points. My thoughts were more about making this function easier to understand, but you're right that it probably isn't necessary and isn't that hard to follow right now.

Please feel free to ignore those suggestions, or restore the previous version if you'd rather.

MatzeB marked 2 inline comments as done.Aug 15 2023, 10:41 AM
MatzeB added inline comments.
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
305

I had that in earlier versions, but my experience while writing the tests anyway was that BFI handles zero-count-edges fine and the results just seem more natural and easier to reason about this way. While assigning non-zero immediately felt somewhat murky in what sort of "epsilon" should be used at all...

MatzeB updated this revision to Diff 550392.Aug 15 2023, 10:41 AM
wenlei added inline comments.Aug 15 2023, 12:07 PM
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
305

Problem with BFI often gets exposed with larger CFG that has implicit constraints, so small test case doesn't always show. Is there any specific problem you ran into with having ExitCount0 as 1? 1 should be a reasonable guess too. There are many places where we try to avoid zero when uncertain, which is sort of an unspoken convention.

MatzeB added inline comments.Aug 15 2023, 1:04 PM
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
305

The question when setting this to "1" is how you set the other case of the condbr. We can't just leave the other side as-is or risk ending up with ratios like 1:2 where the "1" clearly isn't a small number anymore.

But for putting something else here, do I go for 1:20, 1:100, 1:1000? I don't think I have a good feeling/guidance for that...

MatzeB updated this revision to Diff 550923.Aug 16 2023, 4:03 PM

Change code to avoid zero branch-weights.

MatzeB marked an inline comment as done.Aug 16 2023, 4:04 PM
MatzeB added inline comments.
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
305

I now added some scaling logic and the code attempts to hit a 1:128 or higher ratio for the "nearly never taken" edge.

MatzeB updated this revision to Diff 550925.Aug 16 2023, 4:06 PM
MatzeB marked an inline comment as done.
MatzeB updated this revision to Diff 551654.Aug 18 2023, 2:59 PM
wenlei added inline comments.Aug 22 2023, 3:56 PM
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
311

nit: follow convention for variable naming.

312–313

Could OrigLoopBackedgeCount + OrigLoopExitCount already overflow and defeat the check?

Maybe spell it explicitly with OrigLoopBackedgeCount >= high_bit || OrigLoopExitCount >= high_bit?

MatzeB added inline comments.Aug 23 2023, 11:25 AM
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
305

FWIW: Looked around and started a discussion on "likely"/"unlikely" branch weights in D158642 where I encountered the same problem.

MatzeB updated this revision to Diff 552834.Aug 23 2023, 11:58 AM
  • Fixed overflow test when scaling up weights.
  • Still need a better way than hardcoding 127 as "likely" branch weight; hoping that people can give me some hint in the discussion I started in https://reviews.llvm.org/D158642
MatzeB updated this revision to Diff 552835.Aug 23 2023, 11:58 AM
MatzeB added a reviewer: mtrofin.
MatzeB marked an inline comment as done.
MatzeB marked an inline comment as done.
MatzeB edited the summary of this revision. (Show Details)Aug 23 2023, 12:00 PM
MatzeB edited the summary of this revision. (Show Details)
wenlei accepted this revision.Aug 23 2023, 10:27 PM

I agree that hard coded probability is not great. But I also don't think there is a one-size-fit-all answer here. EH path can be unlikely, z-trip loop can be unlikely, [[unlikely]] hint can be unlikely, but they don't necessarily share the same probability.

I think something like InlineConstants, a global ProbabilityConstants may be an alternative.

But we can defer that part, and the change as is lgtm.

This revision is now accepted and ready to land.Aug 23 2023, 10:27 PM
mtrofin added a reviewer: xur.Aug 24 2023, 9:59 AM
mtrofin added inline comments.Aug 24 2023, 10:07 AM
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
297

nit: can you init ExitWeight0 at declaration (to avoid potential joys due to uninitialization later). In fact, setting it to 0 and flipping the conditional below and not having an else would do it.

702

const bool?

MatzeB marked 2 inline comments as done.Aug 24 2023, 4:49 PM
MatzeB added inline comments.
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
297

This is deliberately left uninitialized! The idea is that every branch in the following ifs will set a value for it. Leaving the variable uninitialized here, means we will get a compiler warning if we forget to set the value on any branch below which I think is more valuable than setting some kind of safe default here but not getting a warning anymore.

702

Sure, will add. (though FWIW const could be applied to the majority of the variables and parameters in all of LLVM. I just have a feeling the existing code usually limits const usage to pointer and reference targets...)

mtrofin added inline comments.Aug 24 2023, 4:56 PM
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
253

nit: HasConditionalPreHeader.

297

Is it safe to rely on warnings though - is there at least a bot that treats uninitialized values as errors? But that aside, looks like 0 is the value when ConditionalPreHeader anyway, meaning 0 wouldn't be some safe default, rather, it's exactly what you'd want (assuming the if condition were swapped)

MatzeB marked 2 inline comments as done.Aug 25 2023, 12:02 PM
MatzeB added inline comments.
llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
297

Not sure if we have a -Werror bot, but the uninitialized variable warnings has always been part of -Wall. Also LLVM is warning free for most people (certainly is for me), new warning attract attention immediately. I have seen multiple instances in the past where someone added a new case to a complicated if-cascade but forgot to set a variables where the default was not suitable...

Probably not worth getting hung up on this particular one, so I'll change the default to = 0 and drop the ConditionalPreHeader case. But for the record: I disagree and certainly will not change other places in LLVM where I used similar patterns :)

MatzeB updated this revision to Diff 553603.Aug 25 2023, 1:40 PM
MatzeB updated this revision to Diff 553615.Aug 25 2023, 2:04 PM

lgtm. May be worth waiting for @xur 's feedback for these patches (he's out until Monday I think), he's spent quite some time in PGO land.

mtrofin accepted this revision.Aug 25 2023, 2:19 PM
MatzeB updated this revision to Diff 555529.Sep 1 2023, 4:11 PM

Factor out weights constant to match the style used in D158642

MatzeB marked an inline comment as done.Sep 1 2023, 4:12 PM

It seems there is no further feedback. Landing this now.

This revision was automatically updated to reflect the committed changes.