Page MenuHomePhabricator

[X86] Add ExpandLargeFpConvert Pass and enable for X86
ClosedPublic

Authored by FreddyYe on Nov 2 2022, 3:51 AM.

Details

Summary

As stated in
https://discourse.llvm.org/t/rfc-llc-add-expandlargeintfpconvert-pass-for-fp-int-conversion-of-large-bitint/65528,
this implementation is very similar to ExpandLargeDivRem, which expands
‘fptoui .. to’, ‘fptosi .. to’, ‘uitofp .. to’, ‘sitofp .. to’ instructions
with a bitwidth above a threshold into auto-generated functions. This is
useful for targets like x86_64 that cannot lower fp convertions with more
than 128 bits. The expanded nodes are referring from the IR generated by
compiler-rt/lib/builtins/floattidf.c, compiler-rt/lib/builtins/fixdfti.c,
and etc.

Corner cases:

  1. For fp16: as there is no related builtins added in compliler-rt. So I

mainly utilized the fp32 <-> fp16 lib calls to implement.

  1. For fp80: as this pass is soft fp emulation and no fp80 instructions can

help in this problem. I recommend users to deprecate this usage. For now, the
implementation uses fp128 as the temporary conversion type and inserts
fptrunc/ext at top/end of the function.

  1. For bf16: as clang FE currently doesn't support bf16 algorithm operations

(convert to int, float, +, -, *, ...), this patch doesn't consider bf16 for
now.

  1. For unsigned FPToI: since both default hardware behaviors and libgcc are

ignoring "returns 0 for negative input" spec. This pass follows this old way
to ignore unsigned FPToI. See this example:
https://gcc.godbolt.org/z/bnv3jqW1M

The end-to-end tests are uploaded at https://reviews.llvm.org/D138261

Diff Detail

Event Timeline

FreddyYe created this revision.Nov 2 2022, 3:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 2 2022, 3:51 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
FreddyYe requested review of this revision.Nov 2 2022, 3:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 2 2022, 3:51 AM
RKSimon added a subscriber: RKSimon.Nov 2 2022, 4:04 AM
FreddyYe updated this revision to Diff 474180.Nov 9 2022, 12:43 AM

WIP update.

FreddyYe updated this revision to Diff 476355.Nov 17 2022, 11:44 PM

Complete the imple and split tests into another Phab.

FreddyYe edited the summary of this revision. (Show Details)Nov 17 2022, 11:44 PM
FreddyYe retitled this revision from [WIP] Add ExpandLargeFpConvert Pass to [WIP][X86] Add ExpandLargeFpConvert Pass and enable for X86.Nov 17 2022, 11:56 PM
FreddyYe edited the summary of this revision. (Show Details)
FreddyYe added a reviewer: pengfei.
FreddyYe edited the summary of this revision. (Show Details)
FreddyYe edited the summary of this revision. (Show Details)Nov 17 2022, 11:59 PM
FreddyYe edited the summary of this revision. (Show Details)
FreddyYe edited the summary of this revision. (Show Details)Nov 18 2022, 2:01 AM
FreddyYe updated this revision to Diff 477076.Nov 21 2022, 11:41 PM

Support unsigned expandIToFP() and add IR comments as refer.

FreddyYe retitled this revision from [WIP][X86] Add ExpandLargeFpConvert Pass and enable for X86 to [X86] Add ExpandLargeFpConvert Pass and enable for X86.Nov 21 2022, 11:49 PM
FreddyYe edited the summary of this revision. (Show Details)
FreddyYe edited the summary of this revision. (Show Details)

Great that you worked on this!
I will have a deeper look at the end of this week.

LuoYuanke added inline comments.Nov 26 2022, 12:34 AM
llvm/test/CodeGen/X86/expand-large-fp-convert-fptoui129.ll
11

fptoui?

LuoYuanke added inline comments.Nov 27 2022, 11:20 PM
llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
76

It seems we just truncate the value. Is it because it is rounded to zero?

103

Maybe more readable with FloatVal->getType()->isHalfTy().

107

Add comments "FPToSI"?

119

PowerOf2Ceil(FPMantissaWidth)?

121

The first letter should be capital for variable name.

123

Ditto.

llvm/test/CodeGen/AArch64/O0-pipeline.ll
19

Not sure if it is better to merge convert and div/rem into one pass.

FreddyYe updated this revision to Diff 478518.Nov 29 2022, 4:19 AM
FreddyYe marked 7 inline comments as done.

Address comments. THX for review!

llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
76

Yes, this is default behavior that codes like (int)3.14 outputs 3 no matter what current rounding mode is. LLVM langref also defines so for fptosi: https://llvm.org/docs/LangRef.html#id269

103

Good idea.

mgehre-amd added inline comments.Nov 29 2022, 5:22 AM
llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
91

The return value is unused at its call site - convert to void?

234

nit: word missing after "implicit"
And can we have an assert for that assumption?

314

return void

mgehre-amd accepted this revision.Nov 29 2022, 5:29 AM

I'm scared about missing some subtle issue in the implementation, but I think the test cases on the other PR should cover that.

This revision is now accepted and ready to land.Nov 29 2022, 5:29 AM
LuoYuanke added inline comments.Nov 29 2022, 5:43 AM
llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
146

FloatVal->getType()->isX86_FP80Ty()?

151

Do we assume the _BitInt size is larger than 128-bit?

152

ToBoolNot?

159

ExponentWithBias.

165

1u << (ExponentWidth - 1)) - 1 expression is used several times. Maybe to set Bias = 1u << (ExponentWidth - 1)) - 1 to be more readable.

180

NegOne

182

NegInf

185

PosInf

219

Retval0

LuoYuanke added inline comments.Nov 29 2022, 5:47 AM
llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
152

Maybe rename it as PosOrNeg?

FreddyYe updated this revision to Diff 478809.Nov 29 2022, 10:53 PM
FreddyYe marked 13 inline comments as done.

Address comments. THX for review!

FreddyYe added inline comments.Nov 29 2022, 10:57 PM
llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
151

Yes for x86.

609:      if (IntTy->getIntegerBitWidth() <= MaxLegalFpConvertBitWidth)
610:        continue;
234

Yes, that is meaningful for debug with -mllvm expand-fp-convert-bits. I'll add.
Generally, since integer width <= i128 are always lowered to libcall, the smallest integer type entering this pass is i129, whose width is larger than fp128.

LuoYuanke added inline comments.Nov 30 2022, 5:11 AM
llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
312

Add comments to indicate that fp80 is extended to fp128?

317

IsSigned

325

Move to line 491 to avoid dead instruction when it is not fp80?

327

Ditto

329

Ditto

390

It seems there is many coding style like this. Is it possible that create dead code? However I think they will eventually be deleted in ISel.

FreddyYe updated this revision to Diff 479127.Nov 30 2022, 5:46 PM
FreddyYe marked 6 inline comments as done.

Address comments. THX for review!

FreddyYe added inline comments.Nov 30 2022, 5:47 PM
llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
390

There may be some dead codes I don't know here. Though ISel will delete the dead nodes, I will try to delete such codes. I removed two in the next version. You can comment out if your find more.
For Shl here, I think it's not dead code since it is always be used by AAddr0.

FreddyYe edited the summary of this revision. (Show Details)Nov 30 2022, 9:01 PM
FreddyYe edited the summary of this revision. (Show Details)Nov 30 2022, 9:33 PM
This revision was landed with ongoing or failed builds.Nov 30 2022, 9:48 PM
This revision was automatically updated to reflect the committed changes.

I think this was the last reason for restricting _BitInt to <= 128 by default in clang. Are you planning to create a PR to lift that restriction now?

I think this was the last reason for restricting _BitInt to <= 128 by default in clang. Are you planning to create a PR to lift that restriction now?

Yes, I agree. The patch of tests also relies on lifting first. But I'm not sure if https://github.com/llvm/llvm-project/blob/450de8008bb0ccb5dfc9dd69b6f5b434158772bd/clang/include/clang/Basic/TargetInfo.h#L637 is the only place needs to change. @aaron.ballman WDYT?

I think this was the last reason for restricting _BitInt to <= 128 by default in clang. Are you planning to create a PR to lift that restriction now?

Yes, I agree. The patch of tests also relies on lifting first. But I'm not sure if https://github.com/llvm/llvm-project/blob/450de8008bb0ccb5dfc9dd69b6f5b434158772bd/clang/include/clang/Basic/TargetInfo.h#L637 is the only place needs to change. @aaron.ballman WDYT?

Do *all* targets support > 128 now, or just x86 targets? If all targets support > 128, then that's the place to update (and we can consider starting to rip some of the target-specific machinery and command line option out). But if it's just x86, then we should override this function in the correct derived TargetInfo class.

I think this was the last reason for restricting _BitInt to <= 128 by default in clang. Are you planning to create a PR to lift that restriction now?

Yes, I agree. The patch of tests also relies on lifting first. But I'm not sure if https://github.com/llvm/llvm-project/blob/450de8008bb0ccb5dfc9dd69b6f5b434158772bd/clang/include/clang/Basic/TargetInfo.h#L637 is the only place needs to change. @aaron.ballman WDYT?

Do *all* targets support > 128 now, or just x86 targets? If all targets support > 128, then that's the place to update (and we can consider starting to rip some of the target-specific machinery and command line option out). But if it's just x86, then we should override this function in the correct derived TargetInfo class.

I see. Thanks! I created https://reviews.llvm.org/D139170, I'll add both of you to review list. Please help review.