This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Improve f32 to i32 bitcast code gen
Closed Public

Authored by Conanap on Apr 19 2021, 11:59 AM.

Details

Reviewers
saghir
nemanjai
Group Reviewers
Restricted Project
Commits
rGdb26cd30b6dd: [PowerPC] Improve f32 to i32 bitcast code gen
Summary

The code gen for f32 to i32 bitcast is not currently the most efficient. For example:

int foo(float f) {
  return *(int*)&f;
}

Generates:

xscvdpspn vs0, f1
xxsldwi vs0, vs0, vs0, 3
mffprwz r3, f0

However, xxsldwi is actually not needed as xscvdpspn already splats the value.

This patch removes that instruction for this specific code gen.
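
For reference, the pointer cast in the example above is the conventional shorthand for a bitcast, but it relies on type punning through an incompatible pointer type. The following is a minimal sketch of an equivalent that is well defined in C, assuming float is a 32-bit IEEE-754 type. The helper name bitcast_f32_to_i32 is hypothetical and is not part of this patch. Compilers typically lower the fixed-size memcpy to the same bitcast, so it should exercise the same patterns.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide; fail the build otherwise. */
_Static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits");

/* Hypothetical helper (not from the patch): reinterpret the bits of a
   float as a 32-bit integer by copying the bytes, rather than by
   dereferencing a type-punned pointer. */
static int32_t bitcast_f32_to_i32(float f) {
    int32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On an IEEE-754 target, bitcast_f32_to_i32(1.0f) yields 0x3F800000.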

Event Timeline

Conanap created this revision. Apr 19 2021, 11:59 AM
Conanap requested review of this revision. Apr 19 2021, 11:59 AM
nemanjai requested changes to this revision. Apr 21 2021, 5:48 AM

This will still produce a redundant XXSLDWI:

vector float test(vector float a, float b) {
  a[3] = b;
  return a;
}

when compiled with -mcpu=pwr9.

This revision now requires changes to proceed. Apr 21 2021, 5:48 AM
Conanap updated this revision to Diff 340141. Apr 23 2021, 12:40 PM

Updated to remove unnecessary xrsp and other xxsldwi as well.

nemanjai added inline comments. Apr 30 2021, 10:50 AM
llvm/lib/Target/PowerPC/PPCInstrVSX.td
2820

These COPY_TO_REGCLASS should probably be SUBREG_TO_REG (all of them by the looks of it).

4376

I am not really seeing the tests for these.
Can we add some tests of the form

vector float test(vector float a, double b) {
  a[1] = b;
  return a;
}
Conanap updated this revision to Diff 343115. May 5 2021, 11:26 AM
Conanap marked 2 inline comments as done.

Updated COPY_TO_REGCLASS to SUBREG_TO_REG, added a test case.

nemanjai requested changes to this revision. May 7 2021, 5:03 AM

Please add the missing BE handling.

llvm/lib/Target/PowerPC/PPCInstrVSX.td
2821

Although it is accurate that this specific instruction clears the other bits, all the other uses of SUBREG_TO_REG in the back end use (i64 1) and I don't think it is particularly useful to part with that here.

4376

These also need to be added for big endian subtargets.

llvm/test/CodeGen/PowerPC/vec_insert_elt.ll
741

Notice that this still generates the undesired code due to missing the new patterns for big endian subtargets.

This revision now requires changes to proceed. May 7 2021, 5:03 AM
Conanap updated this revision to Diff 343969. May 9 2021, 11:04 PM
Conanap marked 2 inline comments as done.

Added big-endian pattern

nemanjai accepted this revision. May 10 2021, 4:01 AM

LGTM.

This revision is now accepted and ready to land. May 10 2021, 4:01 AM
This revision was landed with ongoing or failed builds. May 31 2021, 2:01 PM
This revision was automatically updated to reflect the committed changes.