Page MenuHomePhabricator

[InstCombine] Teach how to fold a select into a cttz/ctlz with the 'is_zero_undef' flag cleared.
ClosedPublic

Authored by andreadb on Jan 9 2015, 4:05 AM.

Details

Summary

Hi all,

This patch teaches the Instruction Combiner how to fold a cttz/ctlz followed by a icmp plus select into a single cttz/ctlz with flag 'is_zero_undef' cleared.

Example:

%a = tail call i32 @llvm.cttz.i32(i32 %x, i1 true)
%b = icmp ne i32 %x, 0
%c = select i1 %b, i32 %a, i32 32

In this example, the condition value used by the select instruction compares value %x for equality against zero.
Value %x is also used by the call to the llvm intrinsic cttz. So, the select instruction would propagate the sizeof in bits of %x (i.e. 32) if %x is zero.

The entire cttz+icmp+select sequence can be safely folded into:

%c = tail call i32 @llvm.cttz.i32(i32 %x, i1 false)

Added test InstCombine/select-cmp-cttz-ctlz.ll.

Please let me know if ok to submit.

Thanks,
Andrea

Diff Detail

Event Timeline

andreadb updated this revision to Diff 17922.Jan 9 2015, 4:05 AM
andreadb retitled this revision from to [InstCombine] Teach how to fold a select into a cttz/ctlz with the 'is_zero_undef' flag cleared..
andreadb updated this object.
andreadb edited the test plan for this revision. (Show Details)
andreadb added reviewers: hfinkel, majnemer, RKSimon.
andreadb added a subscriber: Unknown Object (MLST).
majnemer added inline comments.Jan 9 2015, 9:56 AM
lib/Transforms/InstCombine/InstCombineSelect.cpp
465–468

@llvm.cttz.* and @llvm.ctlz.* both accept vector types as arguments. You may want to switch from m_ConstantInt to m_APInt because it will match against a splatted ConstantVector as well.

You could go even further and use m_SpecificInt(II->getType()->getScalarSizeInBits()) to save you a check later on.

478–483

You could use m_Specific(CmpLHS) instead of m_Value(V) and V != CmpLHS.

492–498

I wonder if it might be nicer to clone II and then fixup the clone.
Something like:

NewI = II->clone();
NewI->setOperand(1, Constant::getNullValue(II->getArgOperand(1)->getType()));

This formulation also has the advantage of working with vector types.

pete added a subscriber: pete.Jan 9 2015, 10:05 AM

Hi Andrea

Is this intended to replace the work in CodeGenPrepare:r225274?

If you match this to a single intrinsic call in instcombine then it would be relatively simple for SimplifyCFG to then speculate it with existing code. Then we can remove the code from CGP?

Thanks,
Pete

Hi Pete,

In D6891#106740, @pete wrote:

Hi Andrea

Is this intended to replace the work in CodeGenPrepare:r225274?

No, It applies to a different scenario where the cttz/ctlz is always evaluated.

Something like:

unsigned int foo(unsigned int x) {
  unsigned int count = __builtin_ctz(x);
  return x ? count : 32;
}

Where the count trailing zeroes is always evaluated before reaching the conditional statement (which is then converted into a select).

The logic added in CodeGenPrepare would work in a different scenario (see below):

unsigned int bar(unsigned int x) {
  return x ? __builtin_ctz(x) : 32;
}

In this case, the builtin call is not always executed since it is not dominating the control flow (it is in the 'then' part). Depending on the target, it may or may not be beneficial to speculate that builtin call.

If you match this to a single intrinsic call in instcombine then it would be relatively simple for SimplifyCFG to then speculate it with existing code. Then we can remove the code from CGP?

The problem with implementing that logic into SimplifyCFG is that we need to query the target to check if calls to cttz/ctlz are cheap to speculate. Therefore, in code review D6679 it was suggested by the reviewers to move that logic into CodeGenPrepare.

I hope this clears up any misunderstanding.

-Andrea

Thanks,
Pete

pete added a comment.Jan 9 2015, 11:23 AM

Hi Andrea

andreadb updated this revision to Diff 17944.Jan 9 2015, 1:26 PM

Hi David,

thanks for the review.
I uploaded a new version of the patch that should address all your comments.

Please let me know what you think.

Thanks again for your time.
-Andrea

andreadb updated this revision to Diff 18014.Jan 12 2015, 4:43 AM

Sorry, I just realized that I uploaded a wrong version of the patch.
This is the correct version of the patch.
Again, sorry for the confusion.

-Andrea

RKSimon edited edge metadata.Jan 22 2015, 10:26 AM

Thanks Andrea, I'd prefer David to give the final approval but this patch works fine with my local tests.

hfinkel accepted this revision.Jan 27 2015, 7:03 AM
hfinkel edited edge metadata.

LGTM.

This revision is now accepted and ready to land.Jan 27 2015, 7:03 AM
andreadb closed this revision.Jan 27 2015, 8:01 AM

Thanks Hal,

Committed revision 227197.