In GlobalISel we may in some unfortunate circumstances generate PHIs with operands that are on separate banks. If-conversion doesn't currently check for that case and ends up generating a CSEL on AArch64 with incorrect register operands.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
What is the restriction on the register classes of PHI operands, if they don't have to have to be same register class as the result? I don't see any relevant documentation, or checks in MachineVerifier::checkPHIOps().
llvm/lib/CodeGen/EarlyIfConversion.cpp | ||
---|---|---|
522 | Not sure why we want to avoid a cross-class copy here; won't we generate a cross-class copy anyway in PHIElimination? |
My initial reaction was that the PHI was invalid, but according to @qcolombet it's ok to have different register banks on the operands. The problem here is that ifcvt tries to generate a target instruction, AArch64::CSELXr in place of the PHI with incompatible regclasses. If we leave it to PHI elimination it works fine.
llvm/lib/CodeGen/EarlyIfConversion.cpp | ||
---|---|---|
522 | The comment isn't correct, it's from an earlier version of the fix. What we actually want to avoid is a set of disjoint regclasses for the operands and destination register. |
My initial reaction was that the PHI was invalid, but according to @qcolombet it's ok to have different register banks on the operands
Okay... but the operand register class can't actually be *anything*, can it? We have restrictions on COPY instructions.
The problem here is that ifcvt tries to generate a target instruction, AArch64::CSELXr in place of the PHI with incompatible regclasses. If we leave it to PHI elimination it works fine.
I assume the reason PHI elimination works is that it inserts a cross-class COPY. We can do the same here. The question would just be whether it's profitable; I guess we have to consider the cost of speculating a copy.
Those restrictions are target specific though aren’t they? I think it’s the responsibility of ISel to generate valid copies and similar should apply to PHIs.
The problem here is that ifcvt tries to generate a target instruction, AArch64::CSELXr in place of the PHI with incompatible regclasses. If we leave it to PHI elimination it works fine.
I assume the reason PHI elimination works is that it inserts a cross-class COPY. We can do the same here. The question would just be whether it's profitable; I guess we have to consider the cost of speculating a copy.
Yeah, we’d definitely need copy cost heuristics. FPR <-> GPR copies can be a 5 cycles on some AArch64 uarchs.
Okay.
I think this check should be the responsibility of specific targets, in canInsertSelect(). The target-specific methods already check similar conditions (whether the "true" and "false" registers have a common subclass).
Not sure why we want to avoid a cross-class copy here; won't we generate a cross-class copy anyway in PHIElimination?