Page MenuHomePhabricator

[InstCombine] Extra combine for uadd_sat
ClosedPublic

Authored by dmgreen on Mon, Oct 21, 2:18 AM.

Details

Summary

This is an extra fold for a canonical for of uadd_sat, as shown in D68651. Signed patterns for these are a little more involved, but unsigned is a simple enough extension to what was already present.

Name: uadd_sat_canon             
  %3 = add i8 %0, %1
  %4 = icmp ult i8 %3, %0
  %5 = select i1 %4, i8 -1, i8 %3
=>
  %5 = uadd_sat %1, %0

Diff Detail

Event Timeline

dmgreen created this revision.Mon, Oct 21, 2:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptMon, Oct 21, 2:18 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
lebedev.ri added inline comments.Mon, Oct 21, 3:36 AM
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
785–787

Maybe

if (match(Cmp0, m_c_Add(m_Specific(Cmp1), m_Value(Y))) &&
    match(FVal, m_c_Add(m_Specific(Cmp1), m_Specific(Y)))) {
dmgreen updated this revision to Diff 225891.Mon, Oct 21, 8:17 AM
dmgreen marked an inline comment as done.

Update pattern

lebedev.ri accepted this revision.Mon, Oct 21, 8:35 AM

LG

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
784

This comment needs rewording.

This revision is now accepted and ready to land.Mon, Oct 21, 8:35 AM

@dmgreen if you're interested, there is at least one edge-case pattern:
https://godbolt.org/z/0dICgS

----------------------------------------
  %2 = icmp eq i32 %0, 0
  %3 = add i32 %0, -1
  %4 = select i1 %2, i32 0, i32 %3
  ret i32 %4
=>
  %4 = usub_sat i32 %0, 1
  ret i32 %4
  %3 = add i32 %0, -1
  %2 = icmp eq i32 %0, 0

Done: 1
Optimization is correct!
nikic added a comment.Mon, Oct 28, 6:56 AM

@lebedev.ri Might use m_UAddWithOverflow(), iirc it handles all these edge cases.

Err, make that two:
https://godbolt.org/z/F4uPIw

----------------------------------------
  %2 = icmp ugt i32 %0, 1
  %3 = add i32 %0, -2
  %4 = select i1 %2, i32 %3, i32 0
  ret i32 %4
=>
  %4 = usub_sat i32 %0, 2
  ret i32 %4
  %3 = add i32 %0, -2
  %2 = icmp ugt i32 %0, 1

Done: 1
Optimization is correct!

Thanks. I had seen that usub_sat was already handled in most cases, but hadn't seen those cases of it being constants.

It looks like canonicalizeSaturatedSubtract only handles sub's, not the add of a negative constant.

This revision was automatically updated to reflect the committed changes.