[SimplifyCFG] do not sink intrinsics even with non-constant operands The test #12 in sink-common-code.ll fails not because @llvm.ctlz/cttz are intrinsics but because the last operand is constant, and thus canReplaceOperandWithVariable() will return false and the instructions won't be sunk. Presumably, we want to avoid sinking all intrinsics, not by luck like that. However, since "[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd", the pass is more aggressive and it sunks intrinsics where it should not. This is because the one PHI node limitation has been removed, but usually intrinsics have many operands. The bug being described was just hidden. Currently the pass doesn't allow to sink intrinsics (like other instructions) with constant operands and that makes sense, but it's not enough. Instead we should not sink any intrinsics even if one of the operand is non-constant. Some of them are specials, such as @llvm.frameaddress or @llvm.SI.image.sample.v2i32 (AMDGPU), and it's complicated to know if they can be sunk or not. This patch just adds an early exit in this situation. This fixes minor rendering issues with "Crusader Kings II" and recently released "Dirt Rally" (using the AMDGPU backend). Some intrinsics were wrongly removed. As a side effect, this has the advantage to reduce GPR pressure a little bit with AMDGPU. This only regresses no-md-sink.ll, but that makes sense if we avoid to sink all intrinsics. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=32001 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
It would nice if this patch (or a similar one) could be in the final LLVM 4.0 release because I think it's important for us in order to fix these two games, at least.
If this change is not the right one, maybe we can just re-introduce the one PHI node limitation and fix it up later?
For AMD folks, I see some good changes with my shader-dv private repo, here's the full report https://hastebin.com/xeloteleja.pas.
Thanks for your feedback.