If a PHI node out of extracted region has multiple incoming values from it, split this PHI on two parts. First PHI has incomings only from region and extracts with it (they are placed to the separate basic block that added to the list of outlined), and incoming values in original PHI are replaced by first PHI. Similar solution is already used in CodeExtractor for PHIs in entry block (severSplitPHINodes method). It covers PR39433 bug.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Thank you (very much!) for working on this.
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
619 ↗ | (On Diff #175751) | Nit: please capitalize sentences in comments. Also, maybe "incoming values from the outlining region" would be clearer? |
625 ↗ | (On Diff #175751) | Why should PHIs with exactly one predecessor from the outlining region be skipped? What if there's a second exit block with another, different predecessor from the outlining region? Out1 -> Out2 \ \ Exit1 Exit2 |
1301 ↗ | (On Diff #175751) | Why delete this assertion? ISTM that's it's still useful to have around. |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
635 ↗ | (On Diff #175751) | Why not "for (BasicBlock &PredBB : predecessors(ExitBB))"? |
650 ↗ | (On Diff #175751) | When you replace all uses of ExitBB with NewBB for every terminator in a predecessor of ExitBB, every new PHI inserted into NewBB must have *some* incoming value (just undef?) from *each* predecessor of NewBB. Otherwise, the verifier complains (taken from a stage2 build with hot/cold splitting enabled): PHINode should have one entry for each predecessor of its parent basic block! %conv97.lcssa106.ce = phi i64 [ 1, %if.then31.1 ], [ 1, %if.then31.1 ], [ 2, %if.then31.2 ], [ 2, %if.then31.2 ] fatal error: error in backend: verification of newFunction failed! |
@kachkov98 thank you for helping to solve this bug. I agree with @vsk 's review comments and I'm most concerned about skipping PHIs with only one predecessor. I think it would be good to add code coverage testing in Utils/CodeExtractorTest.cpp explicitly testing your assumptions of the new code in severSplitPHINodesOfExits()
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
619 ↗ | (On Diff #175751) | +1 Also please end sentences with a period . |
625 ↗ | (On Diff #175751) | Agreed, I think this needs a little more thought. At the very least it needs a comment justifying why these PHIs should not be processed. |
Fix comments style
Return assertion back
Verify function consistency before outlining
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
625 ↗ | (On Diff #175751) | My intention here was to not create unnecessary output parameters when this single value is constant (if we split it to separate phi, findInputsOutputs() detects that PHI value is used outside the region and creates new argument). If this value is not constant, it will be stored right after its definition. In this example one value will be stored in one argument somewhere in out1, another value will be stored in another argument in out2, than they both are reloaded in codeRepl block, but PHI in exit1 (or exit2) uses only one value from region, so it should be determenistic. I completely agree that this assumption requires additional test, I'll try to provide it soon. |
635 ↗ | (On Diff #175751) | The reason is that iterators become invalidated when terminator instuction in predecessor is updated |
650 ↗ | (On Diff #175751) | This error message is quite ambigious) I think it means here that PHI shouldn't have duplicated incoming blocks (for example, 2 %if.then31.1 blocks are not correct - how to choose value in that case?). Nevertheless, severSplitPHINodesOfExits just copies these incoming blocks from original PHI, and it means that this PHI was incorrect *before* the extraction. I can confirm that already have this failure while compiling test-suite with HotColdSplitting and PGO, and checking that function is consistent before any extraction resolved it. This check can be done in CodeExtractor (lines 1191-1195), but root cause is that some pass creates this PHI before. |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
1192 ↗ | (On Diff #175893) | This should be wrapped in LLVM_DEBUG(), as we don't want to pay the compile-time cost for this in Release builds There are build bots which test llvm using -verify-each (inserting the verifier after each pass), so (at least theoretically) we should safely be able to assume the input IR is correct. |
635 ↗ | (On Diff #175751) | Actually, as-written, there is an iterator invalidation bug here. This fixes it: SmallVector<BasicBlock *, 4> Preds(pred_begin(ExitBB), pred_end(ExitBB)); for (BasicBlock *PredBB : Preds) { ... } This handles the case where one (or more) of the predecessor blocks is terminated by a switch, which can have multiple uses of ExitBB. The current version of the code walks the wrong user list when this happens. |
650 ↗ | (On Diff #175751) | The verifier allows duplicate incoming blocks (as long as they provide a unique incoming value). This isn't a bug in another pass -- I hit the same verification failure with your most recent patch, which verifies oldFunction before doing any work. The suggested iterator invalidation fix in my comment above addresses the problem. |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
625 ↗ | (On Diff #175751) | I think I follow your explanation. Here's a test: entry: br i1 undef, label %extract-me-1, label %exit-1 ; Extracted blocks extract-me-1: br i1 undef, label %exit-1, label %extract-me-2 extract-me-2: br label %exit-2 ; Blocks that are not extracted exit-1: %p1 = phi [ i8 0, entry ], [ i8 1, extract-me-1 ] br label %exit-2 exit-2: %p2 = phi [ i8 2, entry ], [ i8 1, exit-1 ] ret void I suppose there is no need to split %p1 or %p2, because there's only one possible incoming value from the codeRepl block to either exit-1 or exit-2. |
650 ↗ | (On Diff #175751) | By the way, the in-tree version of the hot/cold splitting pass has several known issues. If you're interested in that pass, I have some patches up to address them: https://reviews.llvm.org/D53887, https://reviews.llvm.org/D54189, https://reviews.llvm.org/D54244. I'd certainly appreciate your feedback! |
Fix issue when terminator instruction in predecessor is switch
Add unit test for PHIs with one incoming value from region
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
635 ↗ | (On Diff #175751) | Seems that it's fixed now, thank you for explaining the problem! |
650 ↗ | (On Diff #175751) | Currently I'm facing the problem that hotcoldsplitting adds to ColdRegion one basic block 2 times (first time when visiting predecessors of SinkBB, second time when visitting successors) - it causes CodeExtractor check fail. Not sure that these patches don't solve the problem, but if it is interesting, I can provide test case for this (what is the best way to do it?) |
Thank you, LGTM.
I tested this patch by:
- Building LNT+externals with hot/cold splitting enabled. I forced outlining to occur whenever a block has more than 1 predecessor, so long as it wouldn't result in the entire function being outlined. This resulted in 48,441 cold functions being extracted/outlined. All output validation tests still passed.
- Running check-llvm in a stage2 build with hot/cold splitting enabled in the same way described above.
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
650 ↗ | (On Diff #175751) | D54189 fixes that bug, but I'm not sure whether it still applies cleanly. I'll update it soon. |