Page MenuHomePhabricator

Unsigned saturation subtraction canonicalization [Instcombine part]

Authored by yulia_koval on Dec 20 2017, 11:55 PM.



This is the instcombine part of unsigned saturation canonicalization patch.(Backend patches already commited:,

It converts unsigned saturated subtraction patterns to pattern, recognized by backend:
(a > b) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b < a) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b > a) ? 0 : a - b -> ((a > b) ? a : b) - b)
(a < b) ? 0 : a - b -> ((a > b) ? a : b) - b)
((a > b) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b < a) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b > a) ? 0 : b - a) -> - ((a > b) ? a : b) - b)
((a < b) ? 0 : b - a) -> - ((a > b) ? a : b) - b)

Diff Detail


Event Timeline

yulia_koval created this revision.Dec 20 2017, 11:55 PM
yulia_koval added reviewers: zvi, spatel, DavidKreitzer.
yulia_koval removed subscribers: spatel, zvi, DavidKreitzer.
zvi added inline comments.Dec 21 2017, 4:46 AM
905 ↗(On Diff #127833)

"subus" might be too X86-specific in a target-independent pass? Maybe a better function name would be something like "canonicalizeSaturatedSubtract"?

5 ↗(On Diff #127833)

Missing vector tests?

Fixed comments,

zvi added a comment.Dec 23 2017, 11:45 PM

LGTM, but would like to see a more seasoned InstCombine contributer take a look before giving a final ok

spatel added inline comments.Dec 28 2017, 8:03 AM
567–570 ↗(On Diff #127988)

Let's avoid 'psubus' in these comments too. Use 'unsigned saturated subtraction' or something generic.

580–581 ↗(On Diff #127988)

We should not create an unnecessary instruction here. We just want to swap the operands of the sub when 'IsNegative' is true?

  %cmp = icmp ugt i8 %lhs, %rhs
  %sub = sub i8 %rhs, %lhs
  %r = select i1 %cmp, i8 %sub, i8 0
  %cmp2 = icmp ugt i8 %lhs, %rhs
  %sel1 = select i1 %cmp2, i8 %lhs, i8 %rhs
  %r = sub i8 %rhs, %sel1

21–36 ↗(On Diff #127988)

The '256' and '512' tests are low-value. If one vector type works, we can generally assume that all vector types work in IR.

I'd like to see at least 1 test for each of the 8 formulas shown in the summary. A comment with that formula above the test would also make it clearer what's changing in each test. You can alternate scalar and vector types to provide more coverage.

yulia_koval added inline comments.Dec 28 2017, 8:26 AM
580–581 ↗(On Diff #127988)

Right, it is legal just to swap. The question is - do we consider the result with swapped operands a cannonical form?
%cmp2 = icmp ugt i8 %lhs, %rhs
%sel1 = select i1 %cmp2, i8 %lhs, i8 %rhs
%r = sub i8 %rhs, %sel1

rhs - (lhs > rhs ? lhs : rhs) <=> rhs - umax(lhs,rhs)

While we currenty consider cannonical only these forms:
umax(a,b) - b -> subus(a,b)
a - umin(a,b) -> subus(a,b)

So, what do you think, where should we do the transformation b - umax(a,b) => -(umax(a,b) - b) ?
Or whether we should handle -psubus(a,b) pattern at all?

spatel added inline comments.Dec 28 2017, 8:34 AM
580–581 ↗(On Diff #127988)

I don't understand the question. Instcombine is going to optimize this:

define i8 @subneg(i8 %x, i8 %y) {
  %a = sub i8 %x, %y
  %b = sub i8 0, %a
  ret i8 %b

So we don't have a choice here, right? The backend has to recognize that pattern. The important part (I think - let me know if I'm missing something) is that we have the canonical min/max pattern where the cmp and select operands are the same because that is recognized by value tracking:

%cmp2 = icmp ugt i8 %lhs, %rhs
%sel1 = select i1 %cmp2, i8 %lhs, i8 %rhs

Fixed the comments

spatel added a comment.EditedFeb 2 2018, 2:45 PM

Some tests were not actually providing coverage for this code, so I checked them in and fixed them - rL324143.
I think this is correct now and tested properly (although you might want to add more tests to check UGE/ULE preds - just noticed that I forgot about that). Please rebase and address the small code/comment suggestions.

541–542 ↗(On Diff #132331)

This doesn't read clearly to me. How about:
Transform patterns such as: (a > b) ? a - b : 0
into: ((a > b) ? a : b) - b)
This produces a canonical max pattern that is more easily recognized by the backend and converted into saturated subtraction instructions if those exist.
There are 8 commuted/swapped variants of this pattern.

552–554 ↗(On Diff #132331)

Sink these declarations down closer to their use.

579–584 ↗(On Diff #132331)

It's easier to follow the logic if you do all of the sub matching in one place, so sink the hasOneUse check down here:

// We need a sub with the same operands as the compare.
bool SelOnRight;
if (match(TrueVal, m_OneUse(m_Sub(m_Specific(CmpL), m_Specific(CmpR)))))
  SelOnRight = false;
else if (match(TrueVal, m_OneUse(m_Sub(m_Specific(CmpR), m_Specific(CmpL)))))
  SelOnRight = true;
  return nullptr;
spatel added inline comments.Feb 2 2018, 2:47 PM
552–554 ↗(On Diff #132331)

Also, it's easier to read if your variable names match the comments:

Value *A = ICI->getOperand(0);
Value *B = ICI->getOperand(1);

Thank you for your comments, fixed them.

spatel accepted this revision.Feb 5 2018, 6:13 AM

LGTM - see inline for small bits.

Thanks for getting this fixed from codegen to IR!

586 ↗(On Diff #132787)

This comment isn't correct. We wouldn't duplicate the existing sub; we just wouldn't be able to eliminate it. So the transform could create more instructions than it removes which we usually try to avoid in this pass.

590–591 ↗(On Diff #132787)

Here and above, please put a period at the end of a sentence:

This revision is now accepted and ready to land.Feb 5 2018, 6:13 AM

Fixed last comments. Thank you for your help, could you please commit it for me?

This revision was automatically updated to reflect the committed changes.