This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Complete the custom legalization of vector int to fp conversion
ClosedPublic

Authored by nemanjai on Nov 16 2018, 9:51 PM.

Details

Summary

A recent patch has added custom legalization of vector conversions of v2i16 -> v2f64. This just rounds it out for other types where the input vector has an illegal (narrower) type than the result vector. Specifically, this will handle the following conversions:

v2i8 -> v2f64
v4i8 -> v4f32
v4i16 -> v4f32

Diff Detail

Repository
rL LLVM

Event Timeline

nemanjai created this revision.Nov 16 2018, 9:51 PM
nemanjai updated this revision to Diff 174497.Nov 16 2018, 10:38 PM
nemanjai edited the summary of this revision. (Show Details)

Actually, the handling for v2i32 is already optimal due to the DAG combine that handles this. We'll leave it in the DAG combine since that handles extractions even if they didn't come from legalizing v2i32 types (i.e. it handles a superset of the legalization added here).

The "recent patch" mentioned in the description is https://reviews.llvm.org/D53346.

Nice! Glad to see this stuff get completed.

lib/Target/PowerPC/PPCISelLowering.cpp
7333 ↗(On Diff #174497)

When I looked at this before, I found that using SIGN_EXTEND_INREG led to a perm/shift left/shift right pattern, and I thought that using unpack would be faster for the signed int/pwr8 case.

test/CodeGen/PowerPC/vec_conv_i16_to_fp32_elts.ll
340 ↗(On Diff #174497)

I think using unpack would be faster than perm/vslw/vsraw, particularly for the f32 case where one level of unpack would do.

jsji added inline comments.Nov 20 2018, 3:00 PM
test/CodeGen/PowerPC/vsx.ll
1096 ↗(On Diff #174497)

Should we remove this comment now?

jsji added inline comments.Nov 20 2018, 3:04 PM
test/CodeGen/PowerPC/vec_conv_i16_to_fp32_elts.ll
83 ↗(On Diff #174497)

Shall we check the data in .LCPI1_0 to make sure the permute control is correct?

nemanjai added inline comments.Nov 20 2018, 6:35 PM
lib/Target/PowerPC/PPCISelLowering.cpp
7333 ↗(On Diff #174497)

Ah cool, I didn't think of that. You're right an unpack is definitely useful for something like this (unpack low for LE and high for BE).
However, I think that would be good to do irrespective of this and in a follow-up patch.
What I'm getting at is that however the pertinent pattern such as:
sign_extend_inreg (vector_shuffle ...)
came into existence, it's good to emit optimal code for it.

test/CodeGen/PowerPC/vec_conv_i16_to_fp32_elts.ll
83 ↗(On Diff #174497)

I think that doing it for all of them would be time consuming without a particularly good justification. Furthermore, these check patterns were produced by the script so if anyone updates anything about the codegen in the future and re-runs the script, those checks would be lost. Perhaps I could do it for one only, but that seems rather arbitrary.

If you insist, I'll certainly add it.

test/CodeGen/PowerPC/vsx.ll
1096 ↗(On Diff #174497)

Good point. I'll do that.

jsji added inline comments.Nov 20 2018, 7:18 PM
test/CodeGen/PowerPC/vec_conv_i16_to_fp32_elts.ll
83 ↗(On Diff #174497)

Make sense, that is fine, I'm OK with skipping them . Thanks.

RolandF added inline comments.Nov 21 2018, 3:19 PM
lib/Target/PowerPC/PPCISelLowering.cpp
7333 ↗(On Diff #174497)

Sure, makes sense.

Please let me know if there are any further comments from anyone. Otherwise, we can probably proceed with this.

If this gets approved and committed before https://reviews.llvm.org/D54906, I need to remember to add a comment to the other review.

D54906 has been committed as r347593. This patch will need to be rebased.

jsji accepted this revision.Dec 3 2018, 12:34 PM

LGTM, unless @RolandF has other comments.

This revision is now accepted and ready to land.Dec 3 2018, 12:34 PM
This revision was automatically updated to reflect the committed changes.