[X86] Add pattern matching for PMADDUBSW

Authored by craig.topper on Jul 31 2018, 10:12 AM.

Description

Summary:
Similar to D49636, but for PMADDUBSW. This instruction adds complexity in two ways: the sum of the two products saturates to 16 bits rather than wrapping around, and one operand is treated as signed while the other is treated as unsigned.

A C example that triggers this pattern:

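#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

enum { N = 128 };  /* an enum constant, so the file-scope array sizes are valid C */

int8_t A[2*N];
uint8_t B[2*N];
int16_t C[N];

void foo() {
  for (int i = 0; i != N; ++i)
    /* The products and their sum are computed in int, then clamped to the int16_t range. */
    C[i] = MIN(MAX((int16_t)A[2*i]*(int16_t)B[2*i] + (int16_t)A[2*i+1]*(int16_t)B[2*i+1], -32768), 32767);
}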
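
For reference, the arithmetic that each 16-bit result lane of PMADDUBSW performs can be modeled in scalar C. This is only a sketch of the instruction's semantics, not code from this patch; the helper names saturate_i16 and maddubs_lane are hypothetical and chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a 32-bit value to the int16_t range. */
static int16_t saturate_i16(int32_t v) {
  if (v > INT16_MAX) return INT16_MAX;
  if (v < INT16_MIN) return INT16_MIN;
  return (int16_t)v;
}

/* Hypothetical scalar model of one PMADDUBSW result lane:
   two unsigned bytes (u0, u1) are multiplied by two signed bytes (s0, s1),
   and the sum of the two products saturates to 16 bits instead of wrapping. */
static int16_t maddubs_lane(uint8_t u0, int8_t s0, uint8_t u1, int8_t s1) {
  int32_t sum = (int32_t)u0 * s0 + (int32_t)u1 * s1;
  return saturate_i16(sum);
}
```

With this model, maddubs_lane(255, 127, 255, 127) returns 32767, because the exact sum 64770 exceeds the int16_t range; a wrapping 16-bit add would instead produce -766. This is why the C example above needs the explicit MIN/MAX clamp for the pattern to match.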

Reviewers: RKSimon, spatel, zvi

Reviewed By: RKSimon, zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49829

llvm-svn: 338402

Details

Committed
craig.topper, Jul 31 2018, 10:12 AM
Reviewer
RKSimon
Differential Revision
D49829: [X86] Add pattern matching for PMADDUBSW
Parents
rGd03d44e0b963: [X86] Add test cases that could use PMADDUBSW.