This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Materialize floats in the range [-16.0, 15.0].
ClosedPublic

Authored by stefanp on Nov 28 2022, 11:41 AM.

Details

Summary

Previous to this patch we only materialized 0.0 and all other floating point
values would be loaded from the TOC. This patch adds materialization for the
floating point values that can be represented as integers in [-16.0, 15.0].

For example we will now materialize 3.0 and -5.0 but not 4.7.

Diff Detail

Event Timeline

stefanp created this revision.Nov 28 2022, 11:41 AM
stefanp requested review of this revision.Nov 28 2022, 11:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 28 2022, 11:41 AM
stefanp updated this revision to Diff 478305.Nov 28 2022, 11:44 AM

Forgot to do a clang-format.
Updated patch with that.

stefanp added a reviewer: Restricted Project.Nov 28 2022, 11:45 AM

Do we already handle vector constants that are splats (i.e. { 1.0f, 1.0f, 1.0f, 1.0f } or { 3.0, 3.0 })? If not, do we plan to?

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
1348

Why do we need custom selection code here? Can't we just implement this in the .td files similarly to how we implemented lowering for Power10 (see nzFPImmAsi32 and getFPAs32BitInt). We just need a version of nzFPImmAsi32 that checks for exact conversion and the result being in the expected range.

1367–1368

This looks wrong. This will produce a pair of 32-bit single precision values in the FP portion of the VSX register. What you want is a 64-bit double precision value in each doubleword.

Don't forget that single precision values on PPC are in registers as double precison but rounded to single precision (i.e. a double precision representation of a single precision value).

I think you should use scalar instructions for scalar values (xscvsxdsp, xscvsxddp).

stefanp updated this revision to Diff 478945.Nov 30 2022, 7:24 AM

Moved the constant materialization to the td file.

Do we already handle vector constants that are splats (i.e. { 1.0f, 1.0f, 1.0f, 1.0f } or { 3.0, 3.0 })? If not, do we plan to?

We don't handle this yet but we should. I think this would be a separate patch though.

There are some tests failing with this patch. I'm looking into them now...

stefanp updated this revision to Diff 479445.Dec 1 2022, 2:51 PM

Fixed the patch and a number of test cases.

stefanp updated this revision to Diff 479447.Dec 1 2022, 2:54 PM

Removed unnecessary whitespace change.

I would like to address the following comment because it is not at all obvious why I decided not to go with this approach.

This looks wrong. This will produce a pair of 32-bit single precision values in the FP portion of the VSX register. What you want is a 64-bit double precision value in each doubleword.
Don't forget that single precision values on PPC are in registers as double precison but rounded to single precision (i.e. a double precision representation of a single precision value).
I think you should use scalar instructions for scalar values (xscvsxdsp, xscvsxddp).

The instruction that does the integer splat is vspltisw which puts the same copy of the integer in every word of the vector. The scalar instructions xscvsxdsp, xscvsxddp look at double word entries and not word entries so they cannot be used to convert the word integers into floats.
There is no double word version of an instruction like`vspltisw` to go with xscvsxdsp, xscvsxddp. Likewise, to my knowledge, there is no word instruction for the scalar convert.
To get around this issue the vector word convert was used xvcvsxwdp. This works for both double and single precision because we know exactly which constants we are representing and we know that they can be exactly represented in both float and double.

lei added inline comments.Dec 20 2022, 11:14 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
17182

Previous behavour is "fallthrough" which will result in a return of Imm.isPosZero(). Do we not want the defalut to be same as before?

stefanp added inline comments.Dec 21 2022, 8:20 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
17182

Previous behavour is "fallthrough" which will result in a return of Imm.isPosZero(). Do we not want the defalut to be same as before?

Actually we can now handle negative zero as well for the types f32 and f64 due to this:

def : Pat<(f64 (fpimm0neg)),
          (f64 (XSNEGDP (XXLXORdpz)))>;

def : Pat<(f32 (fpimm0neg)),
          (f32 (COPY_TO_REGCLASS (XSNEGDP (XXLXORdpz)), VSSRC))>;

So, we can safely check for all zeros instead of just positive zero.

lei accepted this revision as: lei.Dec 21 2022, 11:25 AM

Thanks! This LGTM

This revision is now accepted and ready to land.Dec 21 2022, 11:25 AM
nemanjai accepted this revision.Jan 4 2023, 6:45 AM

LGTM.

stefanp updated this revision to Diff 486343.Jan 4 2023, 10:50 AM

Rebased to top of main branch.

This revision was landed with ongoing or failed builds.Jan 4 2023, 10:52 AM
This revision was automatically updated to reflect the committed changes.