This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Non-canonical clamp-like pattern handling
ClosedPublic

Authored by lebedev.ri on Aug 5 2019, 11:34 AM.

Details

Summary

Given a pattern like:

%old_cmp1 = icmp slt i32 %x, C2
%old_replacement = select i1 %old_cmp1, i32 %target_low, i32 %target_high
%old_x_offseted = add i32 %x, C1
%old_cmp0 = icmp ult i32 %old_x_offseted, C0
%r = select i1 %old_cmp0, i32 %x, i32 %old_replacement

it can be rewritten as more canonical pattern:

%new_cmp1 = icmp slt i32 %x, -C1
%new_cmp2 = icmp sge i32 %x, C0-C1
%new_clamped_low = select i1 %new_cmp1, i32 %target_low, i32 %x
%r = select i1 %new_cmp2, i32 %target_high, i32 %new_clamped_low

Iff -C1 s<= C2 s<= C0-C1
Also, ULT predicate can also be UGE; or UGT iff C0 != -1 (+invert result)
Also, SLT predicate can also be SGE; or SGT iff C2 != INT_MAX (+invert result)

If C1 == 0, then all 3 instructions must be one-use; else at most either %old_cmp1 or %old_x_offseted can have extra uses.

NOTE: if we could reuse %old_cmp1 as one of the comparisons we'll have to build, this could be less limiting.

So there are two icmp's, each one with 3 predicate variants, so there are 9 fold variants:

This fold was brought up in https://reviews.llvm.org/D65148#1603922 by @dmgreen, and is needed to unblock that patch.
This patch requires D65530.

Diff Detail

Repository
rL LLVM

Event Timeline

lebedev.ri created this revision.Aug 5 2019, 11:34 AM

Check that constants are integral up-front, don't try to deal with pointers/constant expressions.

spatel added inline comments.Aug 12 2019, 1:40 PM
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
1127 ↗(On Diff #213444)

rewriten -> rewritten

1133–1134 ↗(On Diff #213444)

'Also/also' repeated in each statement; can remove 1 or the other.

1169–1170 ↗(On Diff #213444)

There is no test with UGE predicate? Does this require a non-splat vector constant?

1180 ↗(On Diff #213444)

shold -> should

1206 ↗(On Diff #213444)

Remove ':/' - doesn't add value; might cause confusion for some readers.

1230–1231 ↗(On Diff #213444)

There is no test with SGE predicate? Does this require a non-splat vector constant?

Thanks for taking a look.

I'm not sure how to test uge/sge predicates - i can't come up with any pattern that still contains them after -instcombine.
https://godbolt.org/z/7EAcfy
I'm not sure if that is guaranteed (and thus i should just drop that handling), or i'm simply failing at test-case production.

Thanks for taking a look.

I'm not sure how to test uge/sge predicates - i can't come up with any pattern that still contains them after -instcombine.
https://godbolt.org/z/7EAcfy
I'm not sure if that is guaranteed (and thus i should just drop that handling), or i'm simply failing at test-case production.

Non-splat vector constant + multiple-uses? :)
https://godbolt.org/z/6vuO14

Thanks for taking a look.

I'm not sure how to test uge/sge predicates - i can't come up with any pattern that still contains them after -instcombine.
https://godbolt.org/z/7EAcfy
I'm not sure if that is guaranteed (and thus i should just drop that handling), or i'm simply failing at test-case production.

Non-splat vector constant + multiple-uses? :)
https://godbolt.org/z/6vuO14

Right, thanks!.
But that icmp *must* be single-use for this transform to happen. So sadly that does not help uge question.
That will help sge though.

lebedev.ri marked 6 inline comments as done.

Addressed review notes, dropper uge predicate handling - requires extra use,
which prevents this fold from happening, added sge predicate tests - this one can happen.

spatel accepted this revision.Aug 13 2019, 5:16 AM

LGTM

This revision is now accepted and ready to land.Aug 13 2019, 5:16 AM

LGTM

Thank you for the review!

This revision was automatically updated to reflect the committed changes.
lebedev.ri added a comment.EditedAug 13 2019, 7:21 AM

Hm, two more things for future consideration here:

  1. matchClamp() in ValueTracking.cpp requires the constant in icmp and in select to match, yet the fold to ensure that is intentionally scalar-only: https://godbolt.org/z/3lorhB vs https://godbolt.org/z/_XN2yO
  2. The pattern we produce here isn't *really* a clamp, both comparisons are of the same original value; we may want to pull the comparison for the second select through the first select, but we will only be able to do that if we know that it won't change the result of that comparison (i.e. the replacement value in innermost select needs to be select, too)

Hm, two more things for future consideration here:

  1. matchClamp() in ValueTracking.cpp requires the constant in icmp and in select to match, yet the fold to ensure that is intentionally scalar-only: https://godbolt.org/z/3lorhB vs https://godbolt.org/z/_XN2yO
  2. The pattern we produce here isn't *really* a clamp, both comparisons are of the same original value; we may want to pull the comparison for the second select through the first select, but we will only be able to do that if we know that it won't change the result of that comparison (i.e. the replacement value in innermost select needs to be select, too)

Err, the first issue is actually caused by the second one.