[PowerPC] Improve BUILD_VECTOR of 4 i32s
ClosedPublic

Authored by lei on Oct 22 2018, 6:15 AM.

Details

Summary

Currently, for a BUILD_VECTOR of four i32 values such as:

vector int test(int a, int b, int c, int d) {
  return (vector int) { a, b, c, d };
}

we get this on Power9:

mtvsrdd 34, 5, 3
mtvsrdd 35, 6, 4
vmrgow 2, 3, 2

and this on Power8:

mtvsrwz 0, 3
mtvsrwz 1, 5
mtvsrwz 2, 4
mtvsrwz 3, 6
xxmrghd 34, 1, 0
xxmrghd 35, 3, 2
vmrgow 2, 3, 2

This can be improved to this on LE Power9:

rldimi 3, 4, 32, 0
rldimi 5, 6, 32, 0
mtvsrdd 34, 5, 3

and this on LE Power8:

rldimi 3, 4, 32, 0
rldimi 5, 6, 32, 0
mtvsrd 34, 3
mtvsrd 35, 5
xxpermdi 34, 35, 34, 0

This patch updates the TD pattern to generate the optimized sequence for both Power8 and Power9 on LE and BE.
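
To illustrate why the shorter sequences produce the same vector, here is a small portable C model of the LE Power9 and LE Power8 sequences shown above. It is only a sketch and not part of the patch: the type `vsr_t` and all helper names are hypothetical, `mtvsrd` leaves the second doubleword undefined on hardware (modeled here as zero), and the model assumes that `int` arguments arrive sign-extended in 64-bit GPRs and that LE element 0 is the least significant word of the register.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit VSX register as two doublewords.
   dw0 is the most significant half, dw1 the least significant half. */
typedef struct { uint64_t dw0, dw1; } vsr_t;

/* Assumption: int arguments are passed sign-extended in 64-bit GPRs. */
static uint64_t gpr(int x) { return (uint64_t)(int64_t)x; }

/* rldimi ra, rs, 32, 0: rotate rs left by 32 and insert the result into
   the upper 32 bits of ra, keeping the lower 32 bits of ra. */
static uint64_t rldimi_32_0(uint64_t ra, uint64_t rs) {
    return (rs << 32) | (ra & 0xFFFFFFFFull);
}

/* mtvsrdd xt, ra, rb: dw0 = ra, dw1 = rb. */
static vsr_t mtvsrdd(uint64_t ra, uint64_t rb) {
    vsr_t v = { ra, rb };
    return v;
}

/* mtvsrd xt, ra: dw0 = ra; dw1 is undefined on hardware, modeled as 0. */
static vsr_t mtvsrd(uint64_t ra) {
    vsr_t v = { ra, 0 };
    return v;
}

/* xxpermdi xt, xa, xb, 0: dw0 = xa.dw0, dw1 = xb.dw0. */
static vsr_t xxpermdi_0(vsr_t xa, vsr_t xb) {
    vsr_t v = { xa.dw0, xb.dw0 };
    return v;
}

/* LE Power9 sequence: two rldimi, then one mtvsrdd. */
static vsr_t build_p9_le(int a, int b, int c, int d) {
    uint64_t r3 = rldimi_32_0(gpr(a), gpr(b)); /* r3 = b:a */
    uint64_t r5 = rldimi_32_0(gpr(c), gpr(d)); /* r5 = d:c */
    return mtvsrdd(r5, r3);                    /* vs34 = d:c:b:a */
}

/* LE Power8 sequence: two rldimi, two mtvsrd, then xxpermdi. */
static vsr_t build_p8_le(int a, int b, int c, int d) {
    uint64_t r3 = rldimi_32_0(gpr(a), gpr(b));
    uint64_t r5 = rldimi_32_0(gpr(c), gpr(d));
    vsr_t v34 = mtvsrd(r3);
    vsr_t v35 = mtvsrd(r5);
    return xxpermdi_0(v35, v34);
}

/* Read element i (0..3) of a v4i32 in LE element order. */
static uint32_t le_elem(vsr_t v, int i) {
    uint64_t dw = (i < 2) ? v.dw1 : v.dw0;
    return (uint32_t)(dw >> (32 * (i & 1)));
}
```

In this model, `le_elem(build_p9_le(a, b, c, d), i)` and `le_elem(build_p8_le(a, b, c, d), i)` both return the low 32 bits of the i-th argument. The key point is that `rldimi` packs two 32-bit values into one GPR, so only two GPR-to-VSR moves are needed instead of four, and no `vmrgow` is required.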

Diff Detail

Repository
rL LLVM

Event Timeline

lei created this revision. Oct 22 2018, 6:15 AM
nemanjai accepted this revision. Oct 22 2018, 8:41 AM

LGTM.

This revision is now accepted and ready to land. Oct 22 2018, 8:41 AM
This revision was automatically updated to reflect the committed changes.