Page MenuHomePhabricator

[BPI] Keep BPI available in loop passes through LoopStandardAnalysisResults
ClosedPublic

Authored by anna on Sep 24 2021, 11:59 AM.

Details

Summary
This is analogous to D86156 (which preserves "lossy" BFI in loop
passes). Lossy means that the analysis preserved may not be up to date
with regards to new blocks that are added in loop passes, but BPI will
not contain stale pointers to basic blocks that are deleted by the loop
passes.

This is achieved through BasicBlockCallbackVH in BPI, which calls
eraseBlock that updates the data structures in BPI whenever a basic
block is deleted.

This patch does not have any changes in the upstream pipeline, since
none of the loop passes in the pipeline use BPI currently.
However, since BPI wasn't previously preserved in loop passes, the loop
predication pass was invoking BPI *on the entire
function* every time it ran in an LPM.  This caused massive compile time
in our downstream LPM invocation which contained loop predication.

See updated test with an invocation of a loop-pipeline containing loop
predication and -debug-pass turned ON.

Diff Detail

Unit TestsFailed

TimeTest
100 msx64 debian > ORC-x86_64-linux.TestCases/Linux/x86-64::trivial-cxa-atexit.S
Script: -- : 'RUN: at line 3'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -m64 -c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/orc/X86_64LinuxConfig/TestCases/Linux/x86-64/Output/trivial-cxa-atexit.S.tmp /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/orc/TestCases/Linux/x86-64/trivial-cxa-atexit.S
100 msx64 debian > ORC-x86_64-linux.TestCases/Linux/x86-64::trivial-static-initializer.S
Script: -- : 'RUN: at line 7'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -m64 -c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/orc/X86_64LinuxConfig/TestCases/Linux/x86-64/Output/trivial-static-initializer.S.tmp /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/orc/TestCases/Linux/x86-64/trivial-static-initializer.S
100 msx64 debian > ORC-x86_64-linux.TestCases/Linux/x86-64::trivial-tls.S
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -m64 -c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/orc/X86_64LinuxConfig/TestCases/Linux/x86-64/Output/trivial-tls.S.tmp /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/orc/TestCases/Linux/x86-64/trivial-tls.S

Event Timeline

anna created this revision.Sep 24 2021, 11:59 AM
anna requested review of this revision.Sep 24 2021, 11:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 24 2021, 11:59 AM

LGTM pending the added test case demonstrating that BPI is correctly preserved across loop passes. Also from D86156 we found there was a significant compiler speed up using LazyBlockFrequencyInfo for the passes that needed it. This may also help for LoopPredication's usage of BPI.

asbirlea accepted this revision.Sep 27 2021, 9:07 AM

Changes LGTM. Please add the testcase mentioned in the description before pushing.

This revision is now accepted and ready to land.Sep 27 2021, 9:07 AM
anna added a comment.Sep 27 2021, 1:14 PM

Thank you for the reviews. I will precommit the test case and update this diff. The point to note is that since Loop predication didn't preserve BPI, we were recalculating it every time for the NPM. We are primarily interested in the NPM, so I'll be adding the testcase for this along with -debug-only=branch-prob, which shows that before this change, we calculate BPI every time we call Loop predication.

anna updated this revision to Diff 375566.Sep 28 2021, 7:15 AM

updated precommited testcase and BPI invocation removed from LoopPredication.

anna added a comment.Sep 28 2021, 7:19 AM

The removed code in LoopPredication (which removes BPI calculation per Loop predication invocation) shows the testcase difference. I was hoping to keep the loop predication change separate to save some downstream merge headaches, but couldn't show a testcase for the standalone original change in this review.

modimo accepted this revision.Sep 28 2021, 12:03 PM

The removed code in LoopPredication (which removes BPI calculation per Loop predication invocation) shows the testcase difference. I was hoping to keep the loop predication change separate to save some downstream merge headaches, but couldn't show a testcase for the standalone original change in this review.

I think it makes sense as part of this change given LoopStandardAnalysisResults now stores BPI, so calculating it separately in LoopPredication is a waste.

Make sure to update the description before pushing, thanks!

anna edited the summary of this revision. (Show Details)Sep 28 2021, 1:00 PM
anna updated this revision to Diff 375678.Sep 28 2021, 1:03 PM

We need to explicitly check for loop-predication in PassBuilder, so that UseBranchProbabilityInfo is set true.
(Without this change profitability.ll test will fail since LPM invocation by default has UseBranchProbabilityInfo=false).

This is the only change in the diff and analogous to the UseBFI = any_pass = LICM check just above that.

anna added inline comments.Sep 28 2021, 1:04 PM
llvm/lib/Passes/PassBuilder.cpp
1305 ↗(On Diff #375678)

Just for ease of review compared to previous LGTM'ed version: this is the (only) update done vs previous diff.

asbirlea added inline comments.Sep 28 2021, 3:47 PM
llvm/test/Transforms/LoopPredication/preserve-bpi.ll
1 ↗(On Diff #375678)

Does this still need the requires before loop-mssa?

anna added inline comments.Sep 30 2021, 5:56 AM
llvm/test/Transforms/LoopPredication/preserve-bpi.ll
1 ↗(On Diff #375678)

We do not. Thanks. I'll update and land the patch.

This revision was landed with ongoing or failed builds.Sep 30 2021, 7:27 AM
This revision was automatically updated to reflect the committed changes.