This is analogous to D86156 (which preserves "lossy" BFI in loop passes). Lossy means that the analysis preserved may not be up to date with regards to new blocks that are added in loop passes, but BPI will not contain stale pointers to basic blocks that are deleted by the loop passes. This is achieved through BasicBlockCallbackVH in BPI, which calls eraseBlock that updates the data structures in BPI whenever a basic block is deleted. This patch does not have any changes in the upstream pipeline, since none of the loop passes in the pipeline use BPI currently. However, since BPI wasn't previously preserved in loop passes, the loop predication pass was invoking BPI *on the entire function* every time it ran in an LPM. This caused massive compile time in our downstream LPM invocation which contained loop predication. See updated test with an invocation of a loop-pipeline containing loop predication and -debug-pass turned ON.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
LGTM pending the added test case demonstrating that BPI is correctly preserved across loop passes. Also from D86156 we found there was a significant compiler speed up using LazyBlockFrequencyInfo for the passes that needed it. This may also help for LoopPredication's usage of BPI.
Thank you for the reviews. I will precommit the test case and update this diff. The point to note is that since Loop predication didn't preserve BPI, we were recalculating it every time for the NPM. We are primarily interested in the NPM, so I'll be adding the testcase for this along with -debug-only=branch-prob, which shows that before this change, we calculate BPI every time we call Loop predication.
The removed code in LoopPredication (which removes BPI calculation per Loop predication invocation) shows the testcase difference. I was hoping to keep the loop predication change separate to save some downstream merge headaches, but couldn't show a testcase for the standalone original change in this review.
I think it makes sense as part of this change given LoopStandardAnalysisResults now stores BPI, so calculating it separately in LoopPredication is a waste.
Make sure to update the description before pushing, thanks!
We need to explicitly check for loop-predication in PassBuilder, so that UseBranchProbabilityInfo is set true.
(Without this change profitability.ll test will fail since LPM invocation by default has UseBranchProbabilityInfo=false).
This is the only change in the diff and analogous to the UseBFI = any_pass = LICM check just above that.
llvm/lib/Passes/PassBuilder.cpp | ||
---|---|---|
1305 | Just for ease of review compared to previous LGTM'ed version: this is the (only) update done vs previous diff. |
llvm/test/Transforms/LoopPredication/preserve-bpi.ll | ||
---|---|---|
1 | Does this still need the requires before loop-mssa? |
llvm/test/Transforms/LoopPredication/preserve-bpi.ll | ||
---|---|---|
1 | We do not. Thanks. I'll update and land the patch. |
Just for ease of review compared to previous LGTM'ed version: this is the (only) update done vs previous diff.