This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Remove extra swap for extract+vperm on LE
ClosedPublic

Authored by qiucf on Apr 30 2021, 1:31 AM.

Details

Reviewers
nemanjai
jsji
Group Reviewers
Restricted Project
Commits
rGf7294ac8093a: [PowerPC] Remove extra swap for extract+vperm on LE
Summary

This is a simple fix for little-endian (LE) targets. On big-endian (BE), vector shuffles are categorized into different ops, so eliminating the extra swaps there may need more work in TableGen or before instruction selection.

Event Timeline

qiucf created this revision. Apr 30 2021, 1:31 AM
qiucf requested review of this revision. Apr 30 2021, 1:31 AM
Herald added a project: Restricted Project. Apr 30 2021, 1:31 AM
nemanjai accepted this revision. Apr 30 2021, 6:19 AM

LGTM.

llvm/lib/Target/PowerPC/PPCInstrVSX.td
2957

Nit: get rid of the unrelated change.

This revision is now accepted and ready to land. Apr 30 2021, 6:19 AM
This revision was automatically updated to reflect the committed changes.

This patch is functionally incorrect

one.c:

#include <stdio.h>
extern double test10(vector int a, vector int b);
int main() {
  double res;
                 //   0           1           2           3
  vector int a = {0x404c38d4, 0x40460e14, 0x404c38d4, 0x7ae147ae};
                 //   4           5           6           7
  vector int b = {0x4027fae1, 0xfdf3b646, 0x47ae147b, 0x40563851};
  res = test10(a, b);
  printf("res: %f\n", res);
  return 0;
}

two.c:

double test10(vector int a, vector int b) {
  //vector int c = __builtin_shufflevector(a, b, 5, 2, 3, 7);
  //                5     2     3     7
  vector int c = { b[1], a[2], a[3], b[3] };
  return ((vector double)c)[0] + 11.0;
}

Expected result:
res: 67.444000

Actual result:
res: -45563434706068069391700044011884519815891525610042006390324289980530809059647691943421443695410413568.000000

So we have decided to revert it.
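
For readers without a PowerPC machine, the expected value can be checked with a small portable sketch. It is not part of the original reproducer: the helper names `lanes_to_double_le` and `expected_test10_le` are hypothetical, and it assumes IEEE-754 binary64 doubles and the little-endian lane layout, where 32-bit element 0 supplies the low half of double element 0 and element 1 supplies the high half.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: combine two 32-bit lanes into one double the way a
 * little-endian vector register would (lane0 = low word, lane1 = high word). */
static double lanes_to_double_le(uint32_t lane0, uint32_t lane1) {
  uint64_t bits = ((uint64_t)lane1 << 32) | lane0;
  double d;
  assert(sizeof d == sizeof bits); /* assumes a 64-bit double */
  memcpy(&d, &bits, sizeof d);     /* bit-level reinterpretation */
  return d;
}

/* Hypothetical scalar model of test10() for the inputs in one.c. */
double expected_test10_le(void) {
  const uint32_t a[4] = {0x404c38d4u, 0x40460e14u, 0x404c38d4u, 0x7ae147aeu};
  const uint32_t b[4] = {0x4027fae1u, 0xfdf3b646u, 0x47ae147bu, 0x40563851u};
  /* Same selection as the shuffle mask 5, 2, 3, 7. */
  const uint32_t c[4] = {b[1], a[2], a[3], b[3]};
  (void)c[2];
  (void)c[3];
  /* ((vector double)c)[0] is built from c[0] and c[1] on LE. */
  return lanes_to_double_le(c[0], c[1]) + 11.0;
}
```

Under these assumptions, double element 0 has the bit pattern 0x404c38d4fdf3b646, which is approximately 56.444, so printing `expected_test10_le()` with `"%f"` should agree with the expected result quoted above.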

Herald added a project: Restricted Project. Aug 25 2022, 7:33 AM
qiucf added a comment. Aug 26 2022, 2:08 AM

Thanks for catching this! The pattern is only valid under very limited conditions, so it is not profitable to keep.