This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Expand usub_sat patterns to handle constants
ClosedPublic

Authored by dmgreen on Oct 28 2019, 8:36 AM.

Details

Summary

The constants come through as add %x, -C, not a sub as would be expected. They need some extra matchers to canonicalise them towards usub_sat.

----------------------------------------
Optimization: Forwards
Precondition: true
  %cmp = icmp ugt i32 %a, 10
  %sub = add i32 %a, -10
  %sel = select i1 %cmp, i32 %sub, i32 0
=>
  %sel = usub_sat %a, 10

Done: 1
Optimization is correct!

----------------------------------------
Optimization: Backwards
Precondition: true
  %cmp = icmp ult i32 %a, 2
  %sub = add i32 %a, -2
  %sel = select i1 %cmp, i32 %sub, i32 0
=>
  %x = usub_sat 2, %a
  %sel = sub 0, %x

Done: 1
Optimization is correct!

Diff Detail

Event Timeline

dmgreen created this revision.Oct 28 2019, 8:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 28 2019, 8:36 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
dmgreen edited the summary of this revision. (Show Details)Oct 28 2019, 8:36 AM

Looks like there should be some extra patterns for "one-off" constants too.

Thank you for looking into this.

llvm/test/Transforms/InstCombine/unsigned_saturated_sub.ll
270–271

In this pattern either one of these two instructions must be one-use and go away.

lebedev.ri requested changes to this revision.Oct 30 2019, 2:23 PM
This revision now requires changes to proceed.Oct 30 2019, 2:23 PM
dmgreen updated this revision to Diff 228272.Nov 7 2019, 10:05 AM

Sorry for the delay. I was trying to look into the "off by one" patterns, the last two of these:

Name: Forwards
Pre: C1 == -C2
  %cmp = icmp ugt i8 %a, C1
  %sub = add i8 %a, C2
  %sel = select i1 %cmp, i8 %sub, i8 0
=>
  %sel = usub_sat %a, C1

Name: Backwards
Pre: C1 == -C2
  %cmp = icmp ult i8 %a, C1
  %sub = add i8 %a, C2
  %sel = select i1 %cmp, i8 %sub, i8 0
=>
  %x = usub_sat C1, %a
  %sel = sub 0, %x

Name: Forwards_off
Pre: C1 + 1 == -C2 && C2 != 0
  %cmp = icmp ugt i8 %a, C1
  %sub = add i8 %a, C2
  %sel = select i1 %cmp, i8 %sub, i8 0
=>
  %sel = usub_sat %a, C1 + 1

Name: Backwards_off
Pre: C1 - 1 == -C2 && C1 != 0
  %cmp = icmp ult i8 %a, C1
  %sub = add i8 %a, C2
  %sel = select i1 %cmp, i8 %sub, i8 0
=>
  %x = usub_sat C1 - 1, %a
  %sel = sub 0, %x

They are a bit fiddly though and deserve to be their own commit.

lebedev.ri requested changes to this revision.Nov 10 2019, 10:20 AM

(marking as reviewed)

Will you also look into the edge-case patterns, like:

unsigned t0(unsigned num) {
    return num > 0 ? num-1 : num;
}

https://godbolt.org/z/DsBxd8

Name: unsigned minus 1
  %2 = icmp ugt i32 %0, 1
  %3 = add i32 %0, -1
  %4 = select i1 %2, i32 %3, i32 0
  ret i32 %4
=>
  %r = usub_sat %0, 1
  ret %r
  %3 = add i32 %0, -1
  %2 = icmp ugt i32 %0, 1
  %4 = select i1 %2, i32 %3, i32 0

Done: 1
Optimization is correct!

(not sure what other ones are there, this is the one that i noticed in real code..)

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
722

Err, this isn't quite right. I should have been more specific.

We produce a single instruction @llvm.usub.sat.,
so there shouldn't be a one-use check in general.

But we also may need to negate the result.
So *if* we need to negate the result,
*then* we need ensure that a *single* instruction
goes away - either icmp or sub.

This revision now requires changes to proceed.Nov 10 2019, 10:20 AM
dmgreen marked an inline comment as done.Nov 28 2019, 10:21 AM
dmgreen added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
722

OK. I see. I was thinking you were referring to assembly instructions being longer if the icmp had multiple uses.

There may be some benefit to making sure the output is actually smaller.

I've updated the code. Let me know if this was or wasn't what you were thinking.

dmgreen updated this revision to Diff 231452.Nov 28 2019, 10:21 AM

Sorry for the delay. Adjust this one use check.

lebedev.ri marked an inline comment as done.Nov 28 2019, 10:50 AM
lebedev.ri added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
707–708

I'm having doubt about the comment.
If b < a, then b - a will wrap, and we'll return wrapped result.
And if b >= a, then b - a will not wrap, but we will saturate to 0.
What am i missing?

722

Yep, this seems what i meant.

dmgreen marked 2 inline comments as done.Nov 28 2019, 12:46 PM
dmgreen added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
707–708

From the comment below, this looks like it is referring to
(a > b) ? b - a : 0 -> -usub.sat(a, b)
I'll try and update it to be clearer.

dmgreen updated this revision to Diff 231464.Nov 28 2019, 12:52 PM
lebedev.ri accepted this revision.Nov 28 2019, 1:22 PM
lebedev.ri marked an inline comment as done.

LGTM, thank you.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
707–708

Aha, okay, thank you.

722–723

I would recommend commit this as a separate commit first.

This revision is now accepted and ready to land.Nov 28 2019, 1:22 PM
This revision was automatically updated to reflect the committed changes.