Page MenuHomePhabricator

[FPEnv] [PowerPC] Lower ppc_fp128 StrictFP Nodes to libcalls
Needs ReviewPublic

Authored by ajwock on Jul 12 2019, 1:00 PM.

Details

Summary

ppc_libcalls

This change allows strictfp nodes with the type ppc_fp128 to be lowered to libcalls in the same manner that regular ppc_fp128 opcodes are. Unlike regular opcodes, whose expansions chain to the entry node and typically leave a dangling chain, these libcalls use the chain input of the original strictfp node and their chain outputs are used by the users of the original node.

A couple of opcodes are not lowered to libcalls- STRICT_FP_ROUND and STRICT_FP_EXTEND. Since the ppc_fp128 type is an f64 pair, truncation or extension from/to the ppc_fp128 type requires disposing one of the doubles or pairing with a new one. The code behind these also parallels normal opcodes, with a little chain finangling.

One caveat to this change is that hardware looping in the IR optimization stage does not know to expect these expansions yet, particularly when checking to see if the intrinsic will clobber the CTR register. I have a fix for this locally to be added later.

Diff Detail

Repository
rL LLVM

Event Timeline

ajwock created this revision.Jul 12 2019, 1:00 PM

Sorry Hal, I must have accidentally released an incomplete draft yesterday instead of simply deleting it like I meant to.

Sorry Hal, I must have accidentally released an incomplete draft yesterday instead of simply deleting it like I meant to.

Can you please upload a patch with full context?

ajwock updated this revision to Diff 209923.Jul 15 2019, 11:34 AM

Full context.

kbarton requested changes to this revision.Aug 27 2019, 9:19 AM
kbarton added inline comments.
include/llvm/CodeGen/TargetLowering.h
2969 ↗(On Diff #209923)

Is there a specific reason you used the _ in the function name?
I think with camel case this can be makeLibCallChained, which is still easy to read and follows the coding guidelines.

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1265

Similar comment here with the choice of _STRICT_ vs Strict_ (i.e., ExpandFloatResStrict_FMINNUM(...)).
I think the use of Strict_ follows the coding guidelines, while _STRICT_ deviates from it.

1367

Was this formatted with clang-format?

2064

Can you put parenthesis around this to make it clear the order of evaluation between the && and ||?

lib/CodeGen/SelectionDAG/LegalizeTypes.cpp
1029 ↗(On Diff #209923)

Can you please make this into a doxygen comment .

1031 ↗(On Diff #209923)

Same comment about the use of underscores. I think this should be LibCallifyStrictFP.

lib/CodeGen/SelectionDAG/TargetLowering.cpp
160 ↗(On Diff #209923)

This should be makeLibCallChained

test/CodeGen/PowerPC/ppcf128-constrained-fp-instrinsics.ll
7 ↗(On Diff #209923)

Were these checks automatically generated by the python script?
It seems unusual that they are placed in the middle of the parameter list for the test case. I thought they usually got put immediately before the function.

12 ↗(On Diff #209923)

Others may disagree with me, but I would recommend getting rid of the checks for cfi here.
I don't think the location of the cfi has any bearing on the correctness here. I also think the compiler should be free to move them with respect to the std and stdu instructions. If it does so, this test case will fail.

This revision now requires changes to proceed.Aug 27 2019, 9:19 AM

Sorry, I clicked submit before adding general comments.
Thanks for working on this. I think this looks OK, just needs some minor cleaning wrt coding guidelines and the test cases.
I'd like other people weigh in though, as it's been a while since I looked at SD Chains.

kbarton added inline comments.Aug 27 2019, 11:39 AM
test/CodeGen/PowerPC/ppcf128-constrained-fp-instrinsics.ll
4 ↗(On Diff #209923)

I also just realized we don't have any 32-bit build bots anymore.
Can you please change this to powerpc64-linux-gnu?

ajwock updated this revision to Diff 217711.Aug 28 2019, 1:21 PM

Variety of adjustments, see comment replies (still incoming if I just submitted this)

ajwock marked 9 inline comments as done.Aug 28 2019, 1:27 PM
ajwock added inline comments.
include/llvm/CodeGen/TargetLowering.h
2969 ↗(On Diff #209923)

Fixed. This was matching how I added strict to other names, but it is not necessary.

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1265

So, each of these functions capitalized names corresponds to an instruction opcode. In order to stay consistent with the rest of these functions, I have simply named these functions after their opcode rather than arbitrarily prepending _STRICT.

I believe that if I did not name these functions as such, it would either be inconsistent with existent code or require significant refactoring.

1367

Fixed? Reformatted with clang-format.

2064

Fixed, and switched order of evaluation to short circuit on the common case.

lib/CodeGen/SelectionDAG/LegalizeTypes.cpp
1031 ↗(On Diff #209923)

Fixed.

lib/CodeGen/SelectionDAG/TargetLowering.cpp
160 ↗(On Diff #209923)

Fixed.

test/CodeGen/PowerPC/ppcf128-constrained-fp-instrinsics.ll
4 ↗(On Diff #209923)

Fixed.

7 ↗(On Diff #209923)

Semi-fixed: They were- though as far as I've seen these checks are typically placed right below the function header. I can change this if needed.

12 ↗(On Diff #209923)

Alright- loosened -NEXT restrictions around std/u instructions.

ajwock marked 2 inline comments as not done.Aug 28 2019, 1:28 PM
jsji added inline comments.Aug 28 2019, 2:42 PM
lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1701

Add comments about why this is special here, not lowering to calls. I know you have description in summary,
but it would be great to add comment here, so that readers don't have to look into commit message or patches.

2039

Any reason that we are expanding STRICT_FP_ROUND in ExpandFloatOperand, while all the others, especially STRICT_FP_EXTEND in ExpandFloatResult?

2142

Add comments about why this is special here, not lowering to calls too.

lib/CodeGen/SelectionDAG/LegalizeTypes.cpp
1039 ↗(On Diff #209923)

Can we just reuse ExpandChainLibCall below?

lib/CodeGen/SelectionDAG/TargetLowering.cpp
162 ↗(On Diff #209923)

If we are going to keep this API, should use MakeLibCallOptions introduced in https://reviews.llvm.org/D65795 instead.

181 ↗(On Diff #209923)

Most of these code are command and copied from makeLibCall, can we maintain one copy only?

test/CodeGen/PowerPC/ppcf128-constrained-fp-instrinsics.ll
1974 ↗(On Diff #209923)

These tests are all simple test with one call. Can we add one or more complex tests? eg: one call use the other calls res? so that we can test the input chain?

ajwock updated this revision to Diff 218141.Aug 30 2019, 11:36 AM
ajwock marked an inline comment as done.

Thanks to jsji for making me realize that a lot of the functionality I implemented in this patch already existed. I converted the ExpandFloatRes/Op_STRICT_* functions to use ExpandChainLibCall and eliminated the LibCallifyStrictFP and MakeLibCallChained functions from the previous diffs.

I also added a test that ensures that the serial properties of strictFP functions are maintained.

jsji added inline comments.Aug 30 2019, 11:37 AM
test/CodeGen/PowerPC/ppcf128-constrained-fp-instrinsics.ll
12 ↗(On Diff #209923)

You should be able to avoid checking those by adding nounwind to all these functions, you can see examples in vector-constrained-fp-intrinsics.ll in https://reviews.llvm.org/D67016.
This way, you can still use the script to generate the check automatically.

ajwock marked 3 inline comments as done.Aug 30 2019, 11:47 AM

Still have some test changes that I have to make- slight diff change incoming soon

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1701

I didn't find it necessary to add such comments- libcallification is one of the possible ways one might expand a floating point result, but it isn't something that is implied.

Another reason I didn't do this is that the author of the analog ExpandFloatRes_FP_EXTEND (not strict and right above this) didn't comment on this, and when I came across this code in my problem solving, I could reasonably intuit why FP_EXTEND wouldn't be a libcall.

I could add documentation that explains this issue with respect to both this function and the above function, but wouldn't that be more appropriate for a ticket specifically about existing documentation?

lib/CodeGen/SelectionDAG/LegalizeTypes.cpp
1039 ↗(On Diff #209923)

Good catch. I wonder why this function was named so differently- that alone kept me from realizing it had anything to do with LibCallify.

Fixed, and removed LibCallifyStrictFP + makeLibCallChained.

ajwock updated this revision to Diff 218145.Aug 30 2019, 11:58 AM

Updated tests to have nounwind attribute so that we don't have to worry about https://reviews.llvm.org/D67016 non-relevant .cfi.

ajwock marked an inline comment as done.Aug 30 2019, 12:47 PM
ajwock added inline comments.
lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
2039

I put STRICT_FP_ROUND here for a rather simple-minded reason- FP_ROUND (non strict) is dealt with here as well.

I couldn't find any documentation or commits on the reason that ExpandFloatOperand_FP_ROUND was here, but this particular addition was required to get clang with the https://reviews.llvm.org/D43142 forceconstrainedfp pass to successfully compile and pass the GNU Scientific Library's tests that involved the long long data type on a PowerPC server.

ajwock marked an inline comment as done.Aug 30 2019, 12:49 PM
ajwock added inline comments.
lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
2039

Edit: by long long I mean long-double.

jsji added a comment.Sep 3 2019, 2:29 PM

https://reviews.llvm.org/rL370228 just landed, so I think you should include fptrunc and fpext intrinsics as well.

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
2039

I think this is because STRICT_FP_ROUND is used for fptrunc to f64,f32, the result type (f64,f32 is already legal here. However, we need to *chain* in the expansion of operands, instead of using default non-strict expansion here. -- I think that is why this expansion is required to get the tests passed.

2041

Should also ExpandFloatOp_STRICT here for STRICT_FP_TO_SINT/STRICT_FP_TO_UINT.

test/CodeGen/PowerPC/ppcf128-constrained-fp-instrinsics.ll
3 ↗(On Diff #218145)

Most of the P8 LE and P9 LE code are actually common, I believe you can simplify the file using check-prefixes.

; RUN: llc -O3 -mtriple=powerpc64le-linux-gnu < %s | FileCheck --check-prefixes=LECOMM,PC64LE %s
; RUN: llc -O3 -mtriple=powerpc64le-linux-gnu -mcpu=pwr9 < %s | FileCheck --check-prefixes=LECOMM,PC64LE9 %s

then rerun the script should reduce the file.

jsji added inline comments.Sep 3 2019, 2:32 PM
lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1701

Fair enough, maybe a NFC patch for documenting these in the future. Thanks.

ajwock updated this revision to Diff 219160.Sep 6 2019, 1:18 PM

Added libcall lowering for fptosi and fptoui constrained intrinsics. They take a similar lowering route to STRICT_FP_ROUND.

jsji accepted this revision.Sep 11 2019, 8:36 AM

LGTM. May need @kbarton to have another look, as he should be in the "Must Review" for this.

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1413

Unnecessary extra newline.

2064

Is this formatted by clang-format? Extra space at the end of line here.

ajwock updated this revision to Diff 219759.Sep 11 2019, 12:04 PM

Ensured that all changes to LLVM source are clang-formatted.