The issue affects targets supporting fast-lzcnt such as btver2.
This removes extraneous zext/trunc node insertions to fix the infinite loop.
This fixes Issue https://github.com/llvm/llvm-project/issues/54694
Details
Diff Detail
Unit Tests
Event Timeline
Do you know which commit introduced the problem, or which transforms conflict?
This looks like a pretty heavy hammer.
Thank you Roman for reviewing!
The regression appeared after https://github.com/llvm/llvm-project/commit/8a156d1c2795189389fadbf33702384f522f2506 (thank you @wristow for finding it)
As @RKSimon found out, the IR does not get generated anymore from the original C++ test case but the issue is still present.
I am hopeful we can reintroduce the optimization (at a later codegen stage maybe?) but I would like to do some performance analysis first to see if it is still a win.
That change might have exposed the bug from source/IR, but the backend loop has existed since at least the 10.0 release based on a godbolt check:
https://godbolt.org/z/Tfnaevboz
Yes, I agree fixing would be best, unfortunately I can’t see an obvious way to fix it inside DAG combiner nor if one the transformation involved is doing something wrong.
My understanding of the issue is that some optimizations disagree on which transformation is the best, causing a ping pong effect. I thought we could eventually fix it by moving the optimisation to the instruction selection stage, any other ideas?
@RKSimon suggested the transformation might be caused by unneeded trunc/zext operations.
It is causing the DAG combiner to needlessly sink them under the OR operations so I have removed them where possible and this fixes the infinite loop.
The instructions generated end up being reordered is some cases but I think it is equivalent.
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
47561–47562 | Is Ret ever null at this point? |
Is Ret ever null at this point?