Look for PHI/Select in the same BB of the form
bb:
  %p = phi [false, %bb1], [true, %bb2], [false, %bb3], [true, %bb4], ...
  %s = select %p, trueval, falseval
And expand the select into a branch structure. This later enables jump threading over bb.
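To make the pattern concrete, here is a minimal sketch in plain C++ (C++17) of why constant phi inputs matter: on every edge that feeds a constant into the phi, the result of the select is already known. This is a toy model, not LLVM code. `Incoming`, `hasConstantIncoming` and `resolveSelectPerEdge` are hypothetical names introduced for illustration, and the "at least one constant" check is an assumption about the intended matching rule.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one incoming edge of the i1 phi %p (not an LLVM type).
struct Incoming {
    std::string pred;              // name of the predecessor block
    std::optional<bool> constant;  // engaged when the incoming value is true/false
};

// Assumption: the pattern is worth unfolding as soon as one incoming
// value of the phi is a constant.
bool hasConstantIncoming(const std::vector<Incoming>& phi) {
    for (const Incoming& in : phi)
        if (in.constant.has_value())
            return true;
    return false;
}

// For every predecessor that feeds a constant into the phi, the select's
// result is known statically; the other edges still need the runtime value.
std::vector<std::pair<std::string, std::string>>
resolveSelectPerEdge(const std::vector<Incoming>& phi,
                     const std::string& trueVal,
                     const std::string& falseVal) {
    std::vector<std::pair<std::string, std::string>> out;
    for (const Incoming& in : phi) {
        if (in.constant.has_value())
            out.emplace_back(in.pred, *in.constant ? trueVal : falseVal);
        else
            out.emplace_back(in.pred, "select at run time");
    }
    return out;
}
```

Edges that resolve statically are the ones a later threading step could route straight to the code that uses the chosen value, without passing through bb.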
The motivating example is as follows:
class H
{
public:
int a; int b; int c;
};
inline int operator < (H& A, H& B)
{
return (A.a < B.a) ? 1 : (A.a > B.a) ? 0 : (A.b < B.b) ? 1 : (A.b > B.b) ? 0 : A.c < B.c;
}
int s(H *h, int j)
{
if (h[j] < h[j+1]) j+=2;
return j;
}
The code generated for AArch64 has the shape shown below. The x86 code has a similar problem.
if (h0.a < h1.a) goto bb0
if (h0.a > h1.a) goto bb1
if (h0.b < h1.b) goto bb2
if (h0.b > h1.b) goto bb3
flag = (h0.c < h1.c)
goto if_end
bb0: flag = 1; goto if_end
bb1: flag = 0; goto if_end
bb2: flag = 1; goto if_end
bb3: flag = 0; goto if_end
if_end: j = select flag, j+2, j
The trivial basic blocks bb0 through bb3 can cause a performance penalty if the comparison (h[j] < h[j+1]) is in a hot loop. The IR for if_end looks like this:
%cond = phi i1 [ false, … ], [ true, … ], [ false, … ], [ true, … ], [ %flag, … ]
%add = add i32 %j, 2
%j.add = select i1 %cond, i32 %j, i32 %add
Every bool constant in the phi becomes a trivial basic block in the assembly. The select is generated by SpeculativelyExecuteBB() in SimplifyCFG before function inlining, and it prevents jump threading from optimizing the CFG any further.
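As a source-level illustration of the intended effect, the sketch below puts the flag-then-select shape next to a hand-threaded version in which each comparison branches directly to its final result. It is only an analogy for the control flow one would hope to get once the select is unfolded and threaded, not output of the pass. `s_select` and `s_threaded` are hypothetical names, and `H` is redeclared as a plain struct to keep the example self-contained.

```cpp
struct H {
    int a;
    int b;
    int c;
};

// Flag-then-select shape: every comparison outcome is first funneled into
// `flag`, and a single select at the end picks j + 2 or j.
int s_select(const H* h, int j) {
    const H& A = h[j];
    const H& B = h[j + 1];
    int flag = (A.a < B.a) ? 1
             : (A.a > B.a) ? 0
             : (A.b < B.b) ? 1
             : (A.b > B.b) ? 0
             : (A.c < B.c);
    return flag ? j + 2 : j;
}

// Hand-threaded shape: each comparison whose outcome fixes the flag jumps
// straight to the final value, so no intermediate "flag = constant" blocks
// are needed.
int s_threaded(const H* h, int j) {
    const H& A = h[j];
    const H& B = h[j + 1];
    if (A.a < B.a) return j + 2;
    if (A.a > B.a) return j;
    if (A.b < B.b) return j + 2;
    if (A.b > B.b) return j;
    return (A.c < B.c) ? j + 2 : j;
}
```

Both functions compute the same result. The second form simply has no join point at which a constant flag must be materialized.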
Maybe add to this comment to indicate that you're just looking for at least one constant.