This is an archive of the discontinued LLVM Phabricator instance.

Add direct moves to/from VSR and exploit them for FP/INT conversions
ClosedPublic

Authored by nemanjai on Apr 9 2015, 10:39 AM.

Details

Summary

This patch adds support for the direct move instructions to and from VSRs that can be used for converting floating point to fixed point values and vice versa without the slow load/store combinations to the same memory locations.
For now, exploitation is limited to explicit conversions between floating/fixed point values and only on byte/halfword/word/doubleword sizes.

Diff Detail

Repository
rL LLVM

Event Timeline

nemanjai updated this revision to Diff 23501.Apr 9 2015, 10:39 AM
nemanjai retitled this revision from to Add direct moves to/from VSR and exploit them for FP/INT conversions.
nemanjai updated this object.
nemanjai edited the test plan for this revision. (Show Details)
nemanjai set the repository for this revision to rL LLVM.
nemanjai added a subscriber: Unknown Object (MLST).
nemanjai added inline comments.Apr 9 2015, 10:41 AM
lib/Target/PowerPC/PPCInstrFormats.td
770

This brace will be moved up to where it should be in the code (already done in my source tree).

lib/Target/PowerPC/PPCInstrVSX.td
992

In the actual commit, this will have both:
HasDirectMove, HasVSX

to indicate what the brace terminates.

wschmidt edited edge metadata.Apr 9 2015, 12:39 PM

Some style issues, and some fragile test-case concerns.

lib/Target/PowerPC/PPCFastISel.cpp
962

Obligatory "add a period" comment.

1072

Likewise.

lib/Target/PowerPC/PPCISelLowering.cpp
5954

Missing space before left parenthesis.

6105

Space before parenthesis.

lib/Target/PowerPC/PPCInstrVSX.td
992

In all of the above, please indent the overflow lines so that the first character is underneath the character following the "<" character (as you did for MFVSRWZ, but for none of the others).

test/CodeGen/PowerPC/fp-int-conversions-direct-moves.ll
16

Specifying the intermediate register number (0) makes the test case more likely to break in the future. Please use {{[0-9]+}} so the test isn't reliant on specific register allocation. The 1 and 3 are ok because they are ABI registers we expect to use. This applies to all the tests here.

28

Here the result register of the xscvuxddp is not an ABI reg so should also use a regexp, not a specific reg. (Also, why didn't it end up in float reg 1?)

80

Same concerns with result reg.

132

And again.

184

And again.

236

And again.

288

And again.

427

Please remove the attributes and metadata. Delete these trailing lines, and remove the #0 from the function definitions.

echristo edited edge metadata.Apr 9 2015, 12:53 PM

Couple of inline questions, please do fix up all of the comments to be complete sentences. Bill looks like he has the rest of the correctness issues.

Thanks!

-eric

lib/Target/PowerPC/PPCISelLowering.cpp
5917

Block comment before the function please.

test/CodeGen/PowerPC/fp-int-conversions-direct-moves.ll
2–3

Can probably just use -unknown-unknown here as the OS part of the triple?

test/CodeGen/PowerPC/stfiwx.ll
1–2 ↗(On Diff #23501)

How is direct-move getting turned on? I thought it was power8 only?

nemanjai added inline comments.Apr 9 2015, 8:15 PM
lib/Target/PowerPC/PPCISelLowering.cpp
6069

This is going away in the patch that I'm about to upload. Thanks Bill for catching the register number issue that led to the realization that we have this unnecessary rounding.

lib/Target/PowerPC/PPCInstrVSX.td
992

Ugh, I don't know how I miss these things - it is so obviously misaligned. Thanks for pointing it out. Fixed and will be part of the next revision.

test/CodeGen/PowerPC/fp-int-conversions-direct-moves.ll
2–3

Yup, I don't see why not. Will do. Thanks for the comment.

16

I will turn these into a regular expressions. Thanks for the tip.

28

I'll turn these into regular expressions if you would like. To answer the question in parentheses, it does not end up in float reg 1 because it is followed by an frsp since we need to round it to single precision.

I will change the custom lowering to use PPCISD::FCFID[U]S when converting from any integral value to a single-precision float. I replicated the existing logic which in retrospect does not seem sound (actually I just realized that existing logic only rounds if there is no FPCVT). It uses the SDAG nodes for rounding directly to single-precision only when we are converting i64 to f32 but not for other integral types. Since we are now assuming hasFPCVT for the entire function, there is no harm in refactoring this to skip the need for the extra rounding instruction.

Wow, I am really glad you pointed this out since I didn't really think about why FPR 1 was not used. Thanks.

test/CodeGen/PowerPC/stfiwx.ll
1–2 ↗(On Diff #23501)

I was running on a Power 8 system and this failed. However, I believe the assumption of a default CPU may have been removed in a newer revision than the one I was modifying for this patch. I will investigate if this can safely be removed.
I just ran this without the -mattr and it is OK now with a newer revision. Removed the change to this test case.

nemanjai updated this revision to Diff 23554.Apr 9 2015, 8:30 PM
nemanjai edited edge metadata.

Addressed the comments from Bill and Eric. Notable changes:

  • we use conversions directly to single-precision when converting to f32 (rather than converting to f64 and subsequently rounding)
  • test case now does not specify a specific register except where required by the ABI (instead a regex is used)
  • formatting issues fixed
  • removed the unnecessary change to stfiwx.ll test case (since no longer required)
nemanjai added inline comments.Apr 9 2015, 8:48 PM
lib/Target/PowerPC/PPCISelLowering.cpp
6060

This comment is now redundant. I forgot to remove it prior to uploading the patch. It is removed from my source tree so it won't show up in the final commit (or the next review if one is necessary).

nemanjai added inline comments.Apr 10 2015, 7:34 AM
test/CodeGen/PowerPC/fp-int-conversions-direct-moves.ll
13

The regular expression will be changed from {{[0-9]+}} to [[CONV-REG:[0-9]+]] for the first occurrence of the register and to [[CONV-REG]] for the second occurrence. This is to ensure that the destination register for what we convert is the same as the source register for the move instruction.
This applies throughout. If this turns out to be the only comment, no further review will be posted but the change will be made in the committed version.

wschmidt accepted this revision.Apr 10 2015, 10:00 AM
wschmidt edited edge metadata.

With the issues you've already promised to fix, and with one nit I noticed, LGTM.

One question: I was surprised to see that XSCVUXDSP is unimplemented. Do we have a work item open to address that?

Thanks,
Bill

lib/Target/PowerPC/PPCISelLowering.cpp
6104

Oh look, another missing period! 3.7 demerits.

This revision is now accepted and ready to land.Apr 10 2015, 10:00 AM
nemanjai closed this revision.Apr 11 2015, 3:44 AM

Committed revision 234682.