This is an archive of the discontinued LLVM Phabricator instance.

[CodeGenPrepare] Teach when it is profitable to speculate calls to @llvm.cttz/ctlz.
ClosedPublic

Authored by andreadb on Dec 18 2014, 12:42 PM.

Details

Summary

Following up on the suggestions from D6679, here is a new patch to optimize cttz/ctlz in CodeGenPrepare.

If we know that the control flow is modelling an if-statement where the only instruction in 'then' basic block (excluding the terminator) is a call to cttz/ctlz, CodeGenPrepare can try to speculate the cttz/ctlz call and simplify the control flow graph.

Example:
;;
entry:

%cmp = icmp eq i64 %Val, 0
br i1 %cmp, label %end.bb, label %then.bb

then.bb:

%c = tail call i64 @llvm.cttz.i64(i64 %val, i1 true)
br label %EndBB

end.bb:

%cond = phi i64 [ %c, %then.bb ], [ 64, %entry]
...

;;

The call to @llvm.cttz.i64 could be speculated. This would allow to get rid of 'then.bb' and merge %entry.bb with %end.bb.

The constraints are:
a) The 'then' basic block is taken only if the input operand to the cttz/ctlz is different than zero;
b) The phi node propagates the size-of (in bits) of the value %val in input to the cttz/ctlz if %val is zero.
c) The target says that it is "cheap" to speculate cttz/ctlz.

If all these constraints are met, CodeGenPrepare can hoist the call to cttz/ctlz from the 'then' basic block into the 'entry' basic block. The new cttz/ctlz instruction will also have the 'undef on zero' flag set to 'false'.

I added two new hooks in TargetLowering.h to let targets customize the behavior (i.e. decide whether it is cheap or not to speculate calls to cttz/ctlz). The two new methods are 'isCheapToSpeculateCtlz' and 'isCheapToSpeculateCttz'.
By default, both methods return 'false'. Which means, CodeGenPrepare doesn't try to speculate calls to cttz/ctlz unless the target says that it is profitable to do it.

On X86, method 'isCheapToSpeculateCtlz' returns true only if the target has LZCNT. Method 'isCheapToSpeculateCttz' only returns true if the target has BMI.
This may change in future. For now, I avoided to enable the transformation for all x86-64 targets with feature CMOV because I am not 100% it is always a win to speculate bsf/bsr. So, I left a couple of TODO comments in the code.

Please let me know what you think.

Thanks!
Andrea

Diff Detail

Repository
rL LLVM

Event Timeline

andreadb updated this revision to Diff 17464.Dec 18 2014, 12:42 PM
andreadb retitled this revision from to [CodeGenPrepare] Teach when it is profitable to speculate calls to @llvm.cttz/ctlz..
andreadb updated this object.
andreadb edited the test plan for this revision. (Show Details)
andreadb added reviewers: qcolombet, hfinkel, majnemer.
andreadb added a subscriber: Unknown Object (MLST).
andreadb updated this revision to Diff 17465.Dec 18 2014, 1:00 PM

Uploaded a new version of the patch. Previous patch had a wrong test in it. Sorry for the confusion.

andreadb updated this revision to Diff 17486.Dec 19 2014, 2:54 AM

Patch updated.
Added missing check for TLI.

hfinkel accepted this revision.Dec 23 2014, 5:04 PM
hfinkel edited edge metadata.

If your transformations fires, you need to make sure that ModifiedDT is set to true so that the DT will be recalculated. With that fixed, LGTM.

This revision is now accepted and ready to land.Dec 23 2014, 5:04 PM
This revision was automatically updated to reflect the committed changes.

Thanks for the review Hal!

I modified the patch so that ModifiedDT is set to true when this new transformation fires.
I also removed the TODO comments in X86ISelLowering.cpp as suggested by Chandler.
Committed revision 224899.

-Andrea