This is an archive of the discontinued LLVM Phabricator instance.

[LoopPredication] Calculate profitability without BPI
ClosedPublic

Authored by anna on Oct 12 2021, 12:13 PM.

Details

Summary

Using BPI within loop predication is non-trivial because BPI is only
preserved lossily in loop pass manager (one fix exposed by lossy
preservation is up for review at D111448). However, since loop
predication is only used in downstream pipelines, it is hard to keep BPI
from breaking for incomplete state with upstream changes in BPI.
Also, correctly preserving BPI for all loop passes is a non-trivial
undertaking (D110438 does this lossily), while the benefit of using it
in loop predication isn't clear.

In this patch, we rely on profile metadata to get almost similar benefit as
BPI, without actually using the complete heuristics provided by BPI.
This avoids the compile time explosion we tried to fix with D110438 and
also avoids fragile bugs because BPI can be lossy in loop passes
(D111448).

Diff Detail

Event Timeline

anna created this revision.Oct 12 2021, 12:13 PM
anna requested review of this revision.Oct 12 2021, 12:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 12 2021, 12:13 PM
anna updated this revision to Diff 380020.Oct 15 2021, 8:39 AM

minor changes in naming, clang format. NFC w.r.t. previous version.

Looks reasonable, assuming you're seeing similar results to using BPI. I'm not very familiar with BPI itself, so if others reviewers are, I'll let them chime in.

Does it make sense to make isValidProfileData and ComputeBranchProbability static utilities in BranchProbability? (nit: consistent case-ing on the names)

ebrevnov added inline comments.
llvm/lib/Transforms/Scalar/LoopPredication.cpp
971

This looks like a natural generalization of "bool Instruction::extractProfMetadata(uint64_t &TrueVal, uint64_t &FalseVal) const". Why don't we extend existing implementation of "extractProfMetadata" to handle more than 2 operands and overloaded existing API with "bool Instruction::extractProfMetadata(SmallVectorImpl<uint64_t> &) const"?

PS: Looks tempting to place total weight as a first element in the resulting vector. WDYT?

977

Looks like we already check that there is a "valid" profile at line 968. Why do it again?

anna added inline comments.Oct 19 2021, 5:43 AM
llvm/lib/Transforms/Scalar/LoopPredication.cpp
971

The SmallVector won't buy us what we need. After retrieving those individual profile data, we still need to calculate the Numerator by checking if (Term->getSuccessor(i) == ExitBlock), i.e. what we do at line 983.

Also, btw, we have Instruction::extractProfTotalWeight, which computes what I have in Denominator, but we anyway need to compute the Numerator so not sure if it's worth it.

977

Because the first check is for Latch terminators, while this one is for all branches exiting the loop. See usage of ComputeBranchProbability below.

anna added a comment.Oct 19 2021, 5:46 AM

Looks reasonable, assuming you're seeing similar results to using BPI. I'm not very familiar with BPI itself, so if others reviewers are, I'll let them chime in.

Does it make sense to make isValidProfileData and ComputeBranchProbability static utilities in BranchProbability? (nit: consistent case-ing on the names)

I think that's a good idea. Will do as a follow-up after landing patch.

Will fix the naming.

anna updated this revision to Diff 380667.Oct 19 2021, 6:04 AM

addressed review comment about naming. Updated a comment in the testcase to parse better.

apilipenko accepted this revision.Oct 19 2021, 9:51 AM
This revision is now accepted and ready to land.Oct 19 2021, 9:51 AM
This revision was landed with ongoing or failed builds.Oct 19 2021, 11:25 AM
This revision was automatically updated to reflect the committed changes.
ebrevnov added inline comments.Oct 19 2021, 10:30 PM
llvm/lib/Transforms/Scalar/LoopPredication.cpp
971

This looks like a natural generalization of "bool Instruction::extractProfMetadata(uint64_t &TrueVal, uint64_t &FalseVal) const". Why don't we extend existing implementation of "extractProfMetadata" to handle more than 2 operands and overloaded existing API with "bool Instruction::extractProfMetadata(SmallVectorImpl<uint64_t> &) const"?

PS: Looks tempting to place total weight as a first element in the resulting vector. WDYT?

971

The SmallVector won't buy us what we need. After retrieving those individual profile data, we still need to calculate the Numerator by checking if (Term->getSuccessor(i) == ExitBlock), i.e. what we do at line 983.

Sure, we will still need to iterate other the vector to extract required weight. But all the code responsible for parsing "branch_weight" metadata will move to a single place. If representation of "branch_weight" metadata is changed one day there will be a single place to adjust.

Also, btw, we have Instruction::extractProfTotalWeight, which computes what I have in Denominator, but we anyway need to compute the Numerator so not sure if it's worth it.

I don't see reasons to call Instruction::extractProfTotalWeight in this case either. My proposal was to include total weight as the first element of the vector populated by "bool Instruction::extractProfMetadata(SmallVectorImpl<uint64_t> &) const". The motivation is simple, the weight itself is mostly useless because it is relative to total weight. Thus we always need total weight as well.

977

Makes sense. Overlooked second call.