Hi all,
This patch improves the logic implemented in CodeGenPrepare (committed at revision 224899 - see review D6728) that teaches the backend when it is profitable to speculate calls to cttz/ctlz.
The original algorithm conservatively avoided speculating more than one instruction from a basic block in a control flow graph modelling an if-statement. In particular, the only allowed instruction (excluding the terminator) was a call to cttz/ctlz.
However, there are cases where we could be less conservative and still be able to speculate a call to cttz/ctlz.
Example:
/code
define i64 @test(i32 %x) {
entry:
  %tobool = icmp eq i32 %x, 0
  br i1 %tobool, label %cond.end, label %cond.true

cond.true:                                        ; preds = %entry
  %0 = tail call i32 @llvm.cttz.i32(i32 %x, i1 true)
  %phitmp2 = zext i32 %0 to i64
  br label %cond.end

cond.end:                                         ; preds = %entry, %cond.true
  %cond = phi i64 [ %phitmp2, %cond.true ], [ 32, %entry ]
  ret i64 %cond
}
/code
The cttz from basic block %cond.true could be safely speculated if we know that the 'zext' is "free" for the target. The same reasoning applies to the case where the value produced by the cttz/ctlz is truncated rather than zero extended, and the extra truncate instruction is known to be "free" for the target.
The 'zext' from the example above would be "free" on an x86-64 target. So, if BMI is available on the target, then the entire code from function @test could be safely expanded into a single 'tzcntl' followed by a return statement.
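For illustration, here is a hand-written sketch of what function @test would be expected to look like once the cttz is speculated. It is not output captured from the patch. It assumes the call is hoisted into the entry block with its 'is_zero_undef' flag cleared, so that a zero input yields the bit width (32), which matches the constant on the %entry edge of the phi:
/code
define i64 @test(i32 %x) {
entry:
  ; is_zero_undef is false here, so the call is defined for %x == 0
  ; and returns 32, the value the phi used to select for that case.
  %0 = tail call i32 @llvm.cttz.i32(i32 %x, i1 false)
  %phitmp2 = zext i32 %0 to i64
  ret i64 %phitmp2
}

declare i32 @llvm.cttz.i32(i32, i1)
/code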
With this patch, CodeGenPrepare now tries to speculate a cttz/ctlz if the result is zero-extended or truncated in the same basic block, and the zext/trunc instruction is free for the target.
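The truncate variant of the same pattern could look like the sketch below. The function name @test_trunc and the value names are hypothetical and are not taken from the patch or its tests. Here the cttz operates on an i64, its result is truncated to i32, and the phi uses the i64 bit width (64) on the %entry edge:
/code
define i32 @test_trunc(i64 %x) {
entry:
  %tobool = icmp eq i64 %x, 0
  br i1 %tobool, label %cond.end, label %cond.true

cond.true:                                        ; preds = %entry
  %0 = tail call i64 @llvm.cttz.i64(i64 %x, i1 true)
  %conv = trunc i64 %0 to i32
  br label %cond.end

cond.end:                                         ; preds = %entry, %cond.true
  %cond = phi i32 [ %conv, %cond.true ], [ 64, %entry ]
  ret i32 %cond
}

declare i64 @llvm.cttz.i64(i64, i1)
/code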
This improves the code generated for all the new test cases added by this patch in 'CodeGen/X86/cttz-ctlz.ll'.
Please let me know if ok to submit.
Thanks,
Andrea