This patch adds support for the direct move instructions to and from VSRs that can be used for converting floating point to fixed point values and vice versa without the slow load/store combinations to the same memory locations.
For now, exploitation is limited to explicit conversions between floating/fixed point values and only on byte/halfword/word/doubleword sizes.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Some style issues, and some fragile test-case concerns.
lib/Target/PowerPC/PPCFastISel.cpp | ||
---|---|---|
962 | Obligatory "add a period" comment. | |
1072 | Likewise. | |
lib/Target/PowerPC/PPCISelLowering.cpp | ||
5954 | Missing space before left parenthesis. | |
6105 | Space before parenthesis. | |
lib/Target/PowerPC/PPCInstrVSX.td | ||
992 | In all of the above, please indent the overflow lines so that the first character is underneath the character following the "<" character (as you did for MFVSRWZ, but for none of the others). | |
test/CodeGen/PowerPC/fp-int-conversions-direct-moves.ll | ||
16 | Specifying the intermediate register number (0) makes the test case more likely to break in the future. Please use {{[0-9]+}} so the test isn't reliant on specific register allocation. The 1 and 3 are ok because they are ABI registers we expect to use. This applies to all the tests here. | |
28 | Here the result register of the xscvuxddp is not an ABI reg so should also use a regexp, not a specific reg. (Also, why didn't it end up in float reg 1?) | |
80 | Same concerns with result reg. | |
132 | And again. | |
184 | And again. | |
236 | And again. | |
288 | And again. | |
427 | Please remove the attributes and metadata. Delete these trailing lines, and remove the #0 from the function definitions. |
Couple of inline questions, please do fix up all of the comments to be complete sentences. Bill looks like he has the rest of the correctness issues.
Thanks!
-eric
lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
5917 | Block comment before the function please. | |
test/CodeGen/PowerPC/fp-int-conversions-direct-moves.ll | ||
2–3 | Can probably just use -unknown-unknown here as the OS part of the triple? | |
test/CodeGen/PowerPC/stfiwx.ll | ||
1–2 ↗ | (On Diff #23501) | How is direct-move getting turned on? I thought it was power8 only? |
lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
6069 | This is going away in the patch that I'm about to upload. Thanks Bill for catching the register number issue that led to the realization that we have this unnecessary rounding. | |
lib/Target/PowerPC/PPCInstrVSX.td | ||
992 | Ugh, I don't know how I miss these things - it is so obviously misaligned. Thanks for pointing it out. Fixed and will be part of the next revision. | |
test/CodeGen/PowerPC/fp-int-conversions-direct-moves.ll | ||
2–3 | Yup, I don't see why not. Will do. Thanks for the comment. | |
16 | I will turn these into a regular expressions. Thanks for the tip. | |
28 | I'll turn these into regular expressions if you would like. To answer the question in parentheses, it does not end up in float reg 1 because it is followed by an frsp since we need to round it to single precision. I will change the custom lowering to use PPCISD::FCFID[U]S when converting from any integral value to a single-precision float. I replicated the existing logic which in retrospect does not seem sound (actually I just realized that existing logic only rounds if there is no FPCVT). It uses the SDAG nodes for rounding directly to single-precision only when we are converting i64 to f32 but not for other integral types. Since we are now assuming hasFPCVT for the entire function, there is no harm in refactoring this to skip the need for the extra rounding instruction. Wow, I am really glad you pointed this out since I didn't really think about why FPR 1 was not used. Thanks. | |
test/CodeGen/PowerPC/stfiwx.ll | ||
1–2 ↗ | (On Diff #23501) | I was running on a Power 8 system and this failed. However, I believe the assumption of a default CPU may have been removed in a newer revision than the one I was modifying for this patch. I will investigate if this can safely be removed. |
Addressed the comments from Bill and Eric. Notable changes:
- we use conversions directly to single-precision when converting to f32 (rather than converting to f64 and subsequently rounding)
- test case now does not specify a specific register except where required by the ABI (instead a regex is used)
- formatting issues fixed
- removed the unnecessary change to stfiwx.ll test case (since no longer required)
lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
6060 | This comment is now redundant. I forgot to remove it prior to uploading the patch. It is removed from my source tree so it won't show up in the final commit (or the next review if one is necessary). |
test/CodeGen/PowerPC/fp-int-conversions-direct-moves.ll | ||
---|---|---|
13 | The regular expression will be changed from {{[0-9]+}} to [[CONV-REG:[0-9]+]] for the first occurrence of the register and to [[CONV-REG]] for the second occurrence. This is to ensure that the destination register for what we convert is the same as the source register for the move instruction. |
With the issues you've already promised to fix, and with one nit I noticed, LGTM.
One question: I was surprised to see that XSCVUXDSP is unimplemented. Do we have a work item open to address that?
Thanks,
Bill
lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
6104 | Oh look, another missing period! 3.7 demerits. |
Obligatory "add a period" comment.