This is an archive of the discontinued LLVM Phabricator instance.

[CodeGenPrepare] Improved logic to speculate calls to cttz/ctlz.
ClosedPublic

Authored by andreadb on Jan 6 2015, 5:31 AM.

Details

Summary

Hi all,

This patch improves the logic implemented in CodeGenPrepare (committed at revision 224899 - see review D6728) that teaches the backend when it is profitable to speculate calls to cttz/ctlz.

The original algorithm conservatively avoided speculating more than one instruction from a basic block in a control flow graph modelling an if-statement. In particular, the only allowed instruction (excluding the terminator) was a call to cttz/ctlz.
However, there are cases where we could be less conservative and still be able to speculate a call to cttz/ctlz.

Example:
/code
define i64 @test(i32 %x) {
entry:
  %tobool = icmp eq i32 %x, 0
  br i1 %tobool, label %cond.end, label %cond.true

cond.true:                                        ; preds = %entry
  %0 = tail call i32 @llvm.cttz.i32(i32 %x, i1 true)
  %phitmp2 = zext i32 %0 to i64
  br label %cond.end

cond.end:                                         ; preds = %entry, %cond.true
  %cond = phi i64 [ %phitmp2, %cond.true ], [ 32, %entry ]
  ret i64 %cond
}

; Declaration of the intrinsic, needed to compile the example standalone.
declare i32 @llvm.cttz.i32(i32, i1)
/code

The cttz from basic block %cond.true could be safely speculated if we know that the 'zext' is "free" for the target. The same reasoning applies to the case where the value produced by the cttz/ctlz is truncated rather than zero extended, and the extra truncate instruction is known to be "free" for the target.

The 'zext' from the example above would be "free" on an x86-64 target. So, if BMI is available on the target, then the entire code from function @test could be safely expanded into a single 'tzcntl' followed by a return statement.
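To illustrate why the speculation is safe, here is a minimal C++ sketch (not part of the patch) that models the two forms of @test. The function names are made up for illustration. The sketch assumes the speculated count is defined for a zero input and yields the bit width (32), which is what 'tzcnt' produces and what the phi in the example selects for the zero case.

```cpp
#include <cassert>
#include <cstdint>

// Models cttz with "is_zero_undef = true": callers must not pass zero.
uint32_t cttz32_zero_undef(uint32_t x) {
  assert(x != 0 && "result is undefined for zero in this model");
  uint32_t n = 0;
  while ((x & 1u) == 0) {
    x >>= 1;
    ++n;
  }
  return n;
}

// Models a count that is defined for zero and returns the bit width,
// as the BMI 'tzcnt' instruction does.
uint32_t cttz32_defined_at_zero(uint32_t x) {
  return x == 0 ? 32u : cttz32_zero_undef(x);
}

// Branchy form, mirroring @test: the cttz is guarded by the x == 0 check.
uint64_t test_branchy(uint32_t x) {
  if (x == 0)
    return 32;
  return static_cast<uint64_t>(cttz32_zero_undef(x)); // the 'zext'
}

// Speculated form: no branch, the count runs unconditionally.
uint64_t test_speculated(uint32_t x) {
  return static_cast<uint64_t>(cttz32_defined_at_zero(x)); // the 'zext'
}

// Convenience check that both forms agree on a given input.
bool forms_agree(uint32_t x) {
  return test_branchy(x) == test_speculated(x);
}
```

Both forms return the same value for every input, including zero, so the branch carries no extra information once the count is defined at zero.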

With this patch, CodeGenPrepare now tries to speculate a cttz/ctlz if the result is zero extended or truncated in the same basic block, and the zext/trunc instruction is free for the target.
This fixes (i.e. improves) all the new test cases added by this patch in 'CodeGen/X86/cttz-ctlz.ll'.
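The block shape accepted by the relaxed check can be sketched roughly as follows. This is a simplified, hypothetical model and not the code from the patch: the types `Op` and `Inst` and the functions `canSpeculateBlock` and `exampleBlock` are invented for illustration and do not correspond to LLVM classes. The other conditions the real code has to verify are not modelled here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction kinds (not LLVM's real classes).
enum class Op { Cttz, Ctlz, ZExt, Trunc, Br, Other };

struct Inst {
  Op op;
  int operand;        // index of the instruction feeding this one, -1 if none
  bool freeForTarget; // only meaningful for ZExt/Trunc
};

bool isCountZeros(Op op) { return op == Op::Cttz || op == Op::Ctlz; }

bool isFreeCast(const Inst &inst) {
  return (inst.op == Op::ZExt || inst.op == Op::Trunc) && inst.freeForTarget;
}

// Accepts a block of the form:
//   cttz/ctlz ; [free zext/trunc of that result] ; terminator
bool canSpeculateBlock(const std::vector<Inst> &bb) {
  if (bb.size() < 2 || bb.size() > 3)
    return false;
  if (!isCountZeros(bb.front().op) || bb.back().op != Op::Br)
    return false;
  if (bb.size() == 3) {
    const Inst &cast = bb[1];
    // The cast must be free and must consume the count-zeros result.
    if (!isFreeCast(cast) || cast.operand != 0)
      return false;
  }
  return true;
}

// Builds a block shaped like %cond.true from the example, with a
// configurable middle instruction.
std::vector<Inst> exampleBlock(Op middle, bool middleIsFree) {
  return {{Op::Cttz, -1, false}, {middle, 0, middleIsFree}, {Op::Br, -1, false}};
}
```

For instance, `canSpeculateBlock(exampleBlock(Op::ZExt, true))` holds, while the same block with a non-free zext or with an unrelated middle instruction is rejected, which matches the original, more conservative behaviour for those cases.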

Please let me know if ok to submit.

Thanks,
Andrea

Diff Detail

Repository
rL LLVM

Event Timeline

andreadb updated this revision to Diff 17829. Jan 6 2015, 5:31 AM
andreadb retitled this revision to [CodeGenPrepare] Improved logic to speculate calls to cttz/ctlz.
andreadb updated this object.
andreadb edited the test plan for this revision.
andreadb added reviewers: hfinkel, qcolombet, RKSimon.
andreadb added subscribers: Unknown Object (MLST), test.
andreadb updated this revision to Diff 17830. Jan 6 2015, 7:08 AM

Here is an updated patch.
I improved a couple of comments and added more conservative checks.

hfinkel accepted this revision. Jan 6 2015, 9:24 AM
hfinkel edited edge metadata.

LGTM.

On PPC64, zext is not generally free, but is free after ctlz. We can deal with that later, however.

This revision is now accepted and ready to land. Jan 6 2015, 9:24 AM
This revision was automatically updated to reflect the committed changes.

Thanks for the quick review, Hal!
Committed revision 225274.

-Andrea