This is an archive of the discontinued LLVM Phabricator instance.

[X86] Unsigned saturation subtraction canonicalization [the backend part]
ClosedPublic

Authored by yulia_koval on Sep 6 2017, 1:46 PM.

Details

Summary

This is the backend part of unsigned saturation canonicalization patch. https://reviews.llvm.org/D37510

The patch transforms canonical version of unsigned saturation, which is sub(max(a,b),a) or sub(a,min(a,b)) to special psubus insturuction on targets, which support it(8bit and 16bit uints).
umax(a,b) - b -> subus(a,b)
a - umin(a,b) -> subus(a,b)

There is also extra case handled, when right part of sub is 32 bit and can be truncated, using UMIN(this transformation was discussed in https://reviews.llvm.org/D25987).

The example of special case code:

void foo(unsigned short *p, int max, int n) {

  int i;
  unsigned m;
  for (i = 0; i < n; i++) {
    m = *--p;
    *p = (unsigned short)(m >= max ? m-max : 0);
  }
}

Max in this example is truncated to max_short value, if it is greater than m, or just truncated to 16 bit, if it is not. It is vaid transformation, because if max > max_short, result of the expression will be zero.

Here is the table of types, I try to support, special case items are bold:

Size128256512
i8v16i8v32i8v64i8
i16v8i16v16i16v32i16
i32v8i32v16i32
i64v8i64

Diff Detail

Event Timeline

yulia_koval created this revision.Sep 6 2017, 1:46 PM
zvi set the repository for this revision to rL LLVM.
zvi added a subscriber: llvm-commits.
zvi edited edge metadata.Sep 11 2017, 7:19 AM

Please add some details about the combine in the Summary. It's better to put a description in addition to the links to make it easier to read the commit log.
Maybe something like:
Combine:
umax(a,b) - b -> subus(a,b)
a - umin(a,b) -> subus(a,b)

And better add an X86 label to the title.

lib/Target/X86/X86ISelLowering.cpp
35489

Would a better place for this comment be right after the '// Try to find ... ' comment?

35508

Please drop braces for if/else with a single statement

35538

Can we generalize by using DAG.computeKnownBits to tells us that the upper bits are zeroed?

yulia_koval edited the summary of this revision. (Show Details)

Added 512bit support and fixed comments

yulia_koval retitled this revision from Unsigned saturation subtraction canonicalization [the backend part] to [X86] Unsigned saturation subtraction canonicalization [the backend part].Sep 26 2017, 10:27 PM
yulia_koval edited the summary of this revision. (Show Details)
zvi added a comment.Sep 27 2017, 2:43 AM

Please add the avx512vl feature to the AVX512 test (in the parent patch) and rebase this patch. It will drastically improve some of the cases :).

lib/Target/X86/X86ISelLowering.cpp
35645

Please drop the braces

zvi added inline comments.Sep 28 2017, 7:06 AM
test/CodeGen/X86/psubus.ll
2383 ↗(On Diff #116797)

Looks like in this case SSE4.1, AVX1 and AVX2 regressed. Can you please take a look at it?

Fixed problem on min case.

yulia_koval marked an inline comment as done.Oct 2 2017, 12:58 AM
zvi accepted this revision.Oct 9 2017, 12:12 AM

LGTM. Thanks!

This revision is now accepted and ready to land.Oct 9 2017, 12:12 AM

Thanks! Could you please commit it for me?

This revision was automatically updated to reflect the committed changes.
spatel added inline comments.Oct 11 2017, 10:35 AM
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
35928–35930 ↗(On Diff #118259)

Sorry I didn't have a chance to look at this closely before commit, but why are we using isEqualTo() here? This pattern only applies to integer types, so we could use simple equality checks instead?

Thanks, fixed your comment in additional patch below.

spatel edited edge metadata.Oct 12 2017, 9:23 AM

Thanks, fixed your comment in additional patch below.

Thank you - committed with rL315589.