- As LCSSA is turned on just before isel, it may create PHI of the flow, which is consumed by pseudo structurized CFG instructions. When that PHIs are eliminated in O0, COPY may be placed wrongly as the these pseudo structurized CFG instructions are considering prologue of MBB.
- Run extra unreachable-mbb-elimination at the end of isel to clean up PHIs.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
LCSSA created PHI of flow used by cf.end. These PHIs (due to LCSSA) are unnecessary. unreachable-mbb-elimination could remove them.
I'm still confused. When I try your testcase, I see an earlier iterator crash in SILowerControlFlow
llvm/test/CodeGen/AMDGPU/lcssa-optnone.ll | ||
---|---|---|
1–3 ↗ | (On Diff #208446) | that test case crash with the fix. I added CHECK-LABEL to check codegen could pass. Checking the code generated seems unnecessary to me. |
SILowerControlFlow runs after PHIElim, which converts PHI to COPY and could generate code like this in the last MBB
END_CF %v0, ....
%v0 = COPY %vX
the use (END_CF) is before the def (COPY).
the iterator in SILowereControlFlow starts from that def (COPY) to its use (END_CF). But, as they misplaced, it reaches the end of MBB first and triggers the assertion of illegal iterator access.
I think you should probably fix SILowerControlFlow
llvm/test/CodeGen/AMDGPU/lcssa-optnone.ll | ||
---|---|---|
1–3 ↗ | (On Diff #208446) | Tests with no checks are bad form. I think any tests related to control flow handling should probably just be generated |
the misplacement of def/use is not the fault of SILowerControlFlow but PHIEliminate. However, even PHIEliminate should not be blamed as cf.end is part of MBB prologue. cf.end should not take any input from PHI.
In O2 compilation, there won't be such issue as, in optimized RA, livevariable is used to eliminate PHI better. LiveVariables depends on unreachable-mbb-elimin, which remove the PHI to cf.end.
This isn't a property the pass should rely on. SILowerControlFlow should defend against unexpected iteration order and unreachable blocks
cf.end and related MBB prologue instructions should not take PHIs as input. Before LCSSA is turned on, we won't have that IR. After LCSSA, the only possible PHI input is the one from non-latch loop single exit. That PHI is unnecessary and should be removed early to assure that the later passes could relies on the fact they cf.end and etc. won't take PHI as input.
So the MIR is valid before phi elimination, and then becomes invalid after. Removing unreachable blocks earlier to avoid this happening is a workaround. It seems to me like we need to fix PHI elimination or avoid relying on the prolog instructions
my understanding is that MBB prologue instructions are designed NOT to take PHI as input. Like these pseudo instructions for structurized CFG, by design, they don't take any PHI as input. As an alternative approach (not workaround), we could ensure these instructions not taking PHI as input before PHIElim instead of teaching PHIElim to under them, more or less very target-specific stuff.
No matter what, running UnreachableMachineBlockElim is not a fix for this.
The verifier should probably check this. We definitely should not hack around this by deleting unreachable blocks that happens to run into this
But, the verifier so far doesn't change that. For your concerns, I'd like to take the following approach:
- Add target checker to assert those instructions with direct PHI input.
- Add a special pass after isel to canonicalize these pseudo CFG instructions by removing PHI inputs.
Does that sound to be acceptable?
I think PHIElimination::LowerPHINode needs to be taught to look for prolog instructions
llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | ||
---|---|---|
885 ↗ | (On Diff #208460) | Can you add a fixme that this is a workaround and should be removed |
Given that D65141 may avoid the source of this problem, I think we can go with this as a quick workaround for now. Once LCSSA is no longer required, we can add a verifier check that a phi is not used as the intrinsic source
Can we revert this now that D67101 has landed? I have tried locally reverting your change to addInstSelector and your lcssa-optnone.ll test case still passes.