Page MenuHomePhabricator

[PowerPC] Improve the way legalize mul for v8i16 and add pattern to match mul + add
ClosedPublic

Authored by steven.zhang on Tue, Mar 24, 8:44 PM.

Details

Summary

We can legalize the operation MUL for v8i16 with instruction (vmladduhm A, B, 0) if altivec enabled. Now, it is set as custom and expand it later, which is not the right way. And then, we can add the pattern to match the mul + add with (vmladduhm A, B, C)

Diff Detail

Event Timeline

steven.zhang created this revision.Tue, Mar 24, 8:44 PM
Herald added a project: Restricted Project. · View Herald TranscriptTue, Mar 24, 8:44 PM
nemanjai requested changes to this revision.Wed, Mar 25, 3:23 AM

Thanks for putting this up. This is actually something I meant to do very soon as we need it for a specific benchmark. Please fix the pattern and then this patch is fine.

Demonstration that the pattern is backwards:

$ cat def.c 
vector unsigned short test(vector unsigned short A, vector unsigned short B,
                           vector unsigned short C) {
  return A + B * C;
}

$ cat use.c 
#include <stdio.h>
#include <stdlib.h>
vector unsigned short test(vector unsigned short, vector unsigned short,
                           vector unsigned short);
int main(int argc, const char **argv) {
  unsigned short A = atoi(argv[1]);
  unsigned short B = atoi(argv[2]);
  unsigned short C = atoi(argv[3]);

  vector unsigned short Res =
      test((vector unsigned short)A, (vector unsigned short)B,
           (vector unsigned short)C);
  printf("Res: { %hu, %hu, %hu, %hu, %hu, %hu, %hu, %hu }\n", Res[0], Res[1],
         Res[2], Res[3], Res[4], Res[5], Res[6], Res[7]);
  return 0;
}

$ ./a.out 4 3 2
Res: { 11, 11, 11, 11, 11, 11, 11, 11 }

# correct value:
$ ./correct 4 3 2
Res: { 10, 10, 10, 10, 10, 10, 10, 10 }
llvm/lib/Target/PowerPC/PPCInstrAltivec.td
876

Huh? Is this right? Doesn't the instruction do $vA * $vB + $vC? So wouldn't that mean that we need the input pattern to be (add v8i16:$vC, (mul v8i16:$vA, v8i16:$vB))? Or the output to be (VMLADDUHM $vB, $vC, $vA) if we are leaving the input the same.

This revision now requires changes to proceed.Wed, Mar 25, 3:23 AM
steven.zhang marked an inline comment as done.Wed, Mar 25, 4:02 AM
steven.zhang added inline comments.
llvm/lib/Target/PowerPC/PPCInstrAltivec.td
876

Oops! You are right.

Fix the pattern issue.

nemanjai accepted this revision.Wed, Mar 25, 10:37 AM

LGTM. Thanks.

This revision is now accepted and ready to land.Wed, Mar 25, 10:37 AM
This revision was automatically updated to reflect the committed changes.