This is an archive of the discontinued LLVM Phabricator instance.


Differential D137241

[X86] Add ExpandLargeFpConvert Pass and enable for X86
Closed · Public

Authored by FreddyYe on Nov 2 2022, 3:51 AM.


Details

Reviewers

pengfei
LuoYuanke
skan
RKSimon
mgehre-amd
aaron.ballman

Commits

rG89f36dd8f32f: [X86] Add ExpandLargeFpConvert Pass and enable for X86

Summary

As stated in
https://discourse.llvm.org/t/rfc-llc-add-expandlargeintfpconvert-pass-for-fp-int-conversion-of-large-bitint/65528,
this implementation is very similar to ExpandLargeDivRem. It expands
‘fptoui .. to’, ‘fptosi .. to’, ‘uitofp .. to’, and ‘sitofp .. to’ instructions
with a bitwidth above a threshold into auto-generated functions. This is
useful for targets like x86_64 that cannot lower fp conversions with more
than 128 bits. The expanded code is modeled on the IR generated from
compiler-rt/lib/builtins/floattidf.c, compiler-rt/lib/builtins/fixdfti.c,
etc.

Corner cases:

For fp16: compiler-rt has no corresponding builtins, so the implementation
mainly relies on the fp32 <-> fp16 libcalls.

For fp80: this pass is a soft-fp emulation, and no fp80 instructions help
with this problem, so I recommend that users avoid this usage. For now, the
implementation uses fp128 as the temporary conversion type and inserts
fptrunc/fpext at the top/end of the function.

For bf16: the clang FE currently doesn't support bf16 arithmetic operations
(convert to int, float, +, -, *, ...), so this patch doesn't consider bf16 for
now.

For unsigned FPToI: both the default hardware behavior and libgcc ignore the
"returns 0 for negative input" spec, so this pass follows that existing
practice and does not special-case negative inputs for unsigned FPToI. See
this example:
https://gcc.godbolt.org/z/bnv3jqW1M

The end-to-end tests are uploaded at https://reviews.llvm.org/D138261

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	2,970 ms	x64 windows > Clang.CodeGen/X86::avx512dq-builtins-constrained.c
	110 ms	x64 windows > LLVM.CodeGen/AArch64::O0-pipeline.ll
	80 ms	x64 windows > LLVM.CodeGen/AArch64::O3-pipeline.ll
	150 ms	x64 windows > LLVM.CodeGen/AArch64::aarch64-bswap-ext.ll
	190 ms	x64 windows > LLVM.CodeGen/AArch64::aarch64-vcvtfp2fxs-combine.ll
		View Full Test Results (167 Failed)

Event Timeline

FreddyYe created this revision. · Nov 2 2022, 3:51 AM

Herald added a project: Restricted Project. · Nov 2 2022, 3:51 AM

Herald added a subscriber: hiraditya.

FreddyYe requested review of this revision. · Nov 2 2022, 3:51 AM

Herald added a project: Restricted Project. · Nov 2 2022, 3:51 AM

Herald added a subscriber: llvm-commits.

RKSimon added a subscriber: RKSimon. · Nov 2 2022, 4:04 AM

Harbormaster completed remote builds in B195670: Diff 472566. · Nov 2 2022, 5:06 AM

WIP update.

Harbormaster completed remote builds in B196846: Diff 474180. · Nov 9 2022, 1:40 AM

Complete the implementation and split the tests into another Phabricator revision.

Herald added subscribers: kosarev, pcwang-thead, frasercrmck and 25 others. · Nov 17 2022, 11:44 PM

FreddyYe edited the summary of this revision. · Nov 17 2022, 11:44 PM

FreddyYe retitled this revision from [WIP] Add ExpandLargeFpConvert Pass to [WIP][X86] Add ExpandLargeFpConvert Pass and enable for X86. · Nov 17 2022, 11:56 PM

FreddyYe edited the summary of this revision.

FreddyYe added a reviewer: pengfei.

FreddyYe edited the summary of this revision.

FreddyYe edited the summary of this revision. · Nov 17 2022, 11:59 PM

FreddyYe edited the summary of this revision.

rebase

Harbormaster completed remote builds in B198389: Diff 476365. · Nov 18 2022, 1:05 AM

FreddyYe edited the summary of this revision. · Nov 18 2022, 2:01 AM

Support unsigned expandIToFP() and add IR comments for reference.

FreddyYe retitled this revision from [WIP][X86] Add ExpandLargeFpConvert Pass and enable for X86 to [X86] Add ExpandLargeFpConvert Pass and enable for X86. · Nov 21 2022, 11:49 PM

FreddyYe edited the summary of this revision.

FreddyYe added reviewers: LuoYuanke, skan, RKSimon, mgehre-amd, aaron.ballman.

FreddyYe edited the summary of this revision.

Great that you worked on this!
I will have a deeper look at the end of this week.

Harbormaster completed remote builds in B198910: Diff 477076. · Nov 22 2022, 4:19 AM

LuoYuanke added inline comments. · Nov 26 2022, 12:34 AM

llvm/test/CodeGen/X86/expand-large-fp-convert-fptoui129.ll
10 ↗	(On Diff #477076)	fptoui?

LuoYuanke added inline comments. · Nov 27 2022, 11:20 PM

llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
76	It seems we just truncate the value. Is it because it is rounded to zero?
103	Maybe more readable with `FloatVal->getType()->isHalfTy()`.
107	Add comments "FPToSI"?
119	PowerOf2Ceil(FPMantissaWidth)?
121	The first letter should be capital for variable name.
123	Ditto.
llvm/test/CodeGen/AArch64/O0-pipeline.ll
19 ↗	(On Diff #477076)	Not sure if it is better to merge convert and div/rem into one pass.

Address comments. THX for review!

llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
76	Yes, this is the default behavior: code like `(int)3.14` yields 3 no matter what the current rounding mode is. The LLVM LangRef also defines fptosi this way: https://llvm.org/docs/LangRef.html#id269
103	Good idea.

mgehre-amd added inline comments. · Nov 29 2022, 5:22 AM

llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
91	The return value is unused at its call site - convert to void?
234	nit: word missing after "implicit". And can we have an assert for that assumption?
314	return void

I'm scared about missing some subtle issue in the implementation, but I think the test cases on the other PR should cover that.

This revision is now accepted and ready to land. · Nov 29 2022, 5:29 AM

Harbormaster completed remote builds in B199981: Diff 478518. · Nov 29 2022, 5:36 AM

LuoYuanke added inline comments. · Nov 29 2022, 5:43 AM

llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
146	FloatVal->getType()->isX86_FP80Ty()?
151	Do we assume the _BitInt size is larger than 128-bit?
152	ToBoolNot?
159	ExponentWithBias.
165	The expression `(1u << (ExponentWidth - 1)) - 1` is used several times. Maybe set `Bias = (1u << (ExponentWidth - 1)) - 1` to make it more readable.
180	NegOne
182	NegInf
185	PosInf
219	Retval0

LuoYuanke added inline comments. · Nov 29 2022, 5:47 AM

llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
152	Maybe rename it as PosOrNeg?

Address comments. THX for review!

FreddyYe added inline comments. · Nov 29 2022, 10:57 PM

llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
151	Yes for x86. See lines 609-610: `if (IntTy->getIntegerBitWidth() <= MaxLegalFpConvertBitWidth) continue;`
234	Yes, that is useful when debugging with `-mllvm expand-fp-convert-bits`. I'll add it. Generally, since integer widths <= i128 are always lowered to a libcall, the smallest integer type entering this pass is i129, which is wider than fp128.

Harbormaster completed remote builds in B200175: Diff 478809. · Nov 29 2022, 11:39 PM

LuoYuanke added inline comments. · Nov 30 2022, 5:11 AM

llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
312	Add comments to indicate that fp80 is extended to fp128?
317	IsSigned
325	Move to line 491 to avoid dead instruction when it is not fp80?
327	Ditto
329	Ditto
390	There seem to be many places written in this style. Could this create dead code? However, I think it will eventually be deleted in ISel.

Address comments. THX for review!

FreddyYe added inline comments. · Nov 30 2022, 5:47 PM

llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
390	There may be some dead code here that I'm not aware of. Although ISel will delete the dead nodes, I will try to remove such code. I removed two instances in the next version. Please comment if you find more. As for `Shl` here, I think it is not dead code, since it is always used by `AAddr0`.

LGTM.

Harbormaster completed remote builds in B200398: Diff 479127. · Nov 30 2022, 7:07 PM

FreddyYe edited the summary of this revision. · Nov 30 2022, 9:01 PM

FreddyYe edited the summary of this revision. · Nov 30 2022, 9:33 PM

This revision was landed with ongoing or failed builds. · Nov 30 2022, 9:48 PM

Closed by commit rG89f36dd8f32f: [X86] Add ExpandLargeFpConvert Pass and enable for X86 (authored by FreddyYe).

This revision was automatically updated to reflect the committed changes.

FreddyYe added a commit: rG89f36dd8f32f: [X86] Add ExpandLargeFpConvert Pass and enable for X86.

I think this was the last reason for restricting _BitInt to <= 128 by default in clang. Are you planning to create a PR to lift that restriction now?

In D137241#3962787, @mgehre-amd wrote:

I think this was the last reason for restricting _BitInt to <= 128 by default in clang. Are you planning to create a PR to lift that restriction now?

Yes, I agree. The test patch also relies on lifting that restriction first. But I'm not sure whether https://github.com/llvm/llvm-project/blob/450de8008bb0ccb5dfc9dd69b6f5b434158772bd/clang/include/clang/Basic/TargetInfo.h#L637 is the only place that needs to change. @aaron.ballman WDYT?

In D137241#3963326, @FreddyYe wrote:

Yes, I agree. The test patch also relies on lifting that restriction first. But I'm not sure whether https://github.com/llvm/llvm-project/blob/450de8008bb0ccb5dfc9dd69b6f5b434158772bd/clang/include/clang/Basic/TargetInfo.h#L637 is the only place that needs to change. @aaron.ballman WDYT?

Do *all* targets support > 128 now, or just x86 targets? If all targets support > 128, then that's the place to update (and we can consider starting to rip some of the target-specific machinery and command line option out). But if it's just x86, then we should override this function in the correct derived TargetInfo class.

In D137241#3963462, @aaron.ballman wrote:

Do *all* targets support > 128 now, or just x86 targets? If all targets support > 128, then that's the place to update (and we can consider starting to rip some of the target-specific machinery and command line option out). But if it's just x86, then we should override this function in the correct derived TargetInfo class.

I see. Thanks! I created https://reviews.llvm.org/D139170, I'll add both of you to review list. Please help review.

FreddyYe mentioned this in D139170: [X86][clang] Lift _BitInt() supported max width. · Dec 2 2022, 12:20 AM

Revision Contents

	Path	Size
	crach.c	17 lines
	fix.c	29 lines
	fixsfsi_single_source.c	71 lines
	fixsfsi_single_source_readable.ll	58 lines
	float.c	6 lines
	floatdidf_single_source.c	75 lines
	floatdidf_single_source_readable.ll	47 lines
	llvm/include/llvm/CodeGen/MachinePassRegistry.def	1 line
	llvm/include/llvm/CodeGen/Passes.h	3 lines
	llvm/include/llvm/InitializePasses.h	1 line
	llvm/lib/CodeGen/CMakeLists.txt	1 line
	llvm/lib/CodeGen/CodeGen.cpp	1 line
	llvm/lib/CodeGen/ExpandLargeFpConvert.cpp	335 lines
	llvm/lib/CodeGen/TargetPassConfig.cpp	1 line
	llvm/test/Transforms/ExpandLargeFpConvert/fptosi129.ll	4 lines
	llvm/test/Transforms/ExpandLargeFpConvert/si129tofp.ll	4 lines
	llvm/tools/opt/opt.cpp	4 lines

Diff 472566

crach.c

This file was added.

				#include <stdio.h>
				static __inline long long ToInt0(_BitInt(256) x) {
				const union {
				long long f[4];
				_BitInt(256) i;
				} rep = {.i = x};
				return rep.f[0];
				}

				int main() {
				float b =3.14;
				_BitInt(256) a = b;
				printf("(int)3.14 = %lld\n", ToInt0(a));
				_BitInt(256) c = 1078523331;
				float d = c;
				printf("(float) 1078523331 = %f\n", d);
				}

fix.c

This file was added.

				// clang -Xclang -fexperimental-max-bitint-width=256 fix.c
				// Changes into _BitInt(129) also works.
				#include <stdio.h>
				// static __inline float fromRep(unsigned int x) {
				// const union {
				// float f;
				// int i;
				// } rep = {.i = x};
				// return rep.f;
				// }

				// typedef union {
				// _BitInt(129) x;
				// unsigned int y[5];
				// } bitint;
				// bitint a;
				// a.x = b;
				// printf("%d\n", a.y[0]);
				// printf("%d\n", a.y[1]);
				// printf("%d\n", a.y[2]);
				// printf("%d\n", a.y[3]);
				// printf("%d\n", a.y[4]);

				int main() {
				float b = 3433.14123f;
				_BitInt(32) c = b;
				int* d = (int*)&c;
				printf("%d\n", d[0]);
				}

fixsfsi_single_source.c

This file was added.

				//===-- fixsfsi.c - Implement __fixsfsi -----------------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				#include <stdio.h>
				#include <fenv.h>

				#define significandBits 23
				#define signBit 0x80000000 // (REP_C(1) << (significandBits + exponentBits))
				#define exponentBias 127 //(maxExponent >> 1)
				#define implicitBit 0x800000 // (REP_C(1) << significandBits)

				#define absMask 0x7FFFFFFF //(signBit - 1U)
				#define significandMask (implicitBit - 1U)

				static __inline unsigned int toRep(float x) {
				const union {
				float f;
				unsigned int i;
				} rep = {.f = x};
				return rep.i;
				}

				static __inline float fromRep(unsigned int x) {
				const union {
				float f;
				int i;
				} rep = {.i = x};
				return rep.f;
				}

				int foo(float a) {
				const int fixint_max = (int)((~(unsigned int)0) / 2);
				const int fixint_min = -fixint_max - 1;
				// Break a into sign, exponent, significand parts.
				const unsigned int aRep = toRep(a);
				const unsigned int aAbs = aRep & absMask;
				const int sign = aRep & signBit ? -1 : 1;
				const int exponent = (aAbs >> significandBits) - exponentBias;
				const unsigned int significand = (aAbs & significandMask) | implicitBit;

				// If exponent is negative, the result is zero.
				if (exponent < 0)
				return 0;

				// If the value is too large for the integer type, saturate.
				if ((unsigned)exponent >= sizeof(int) * 8)
				return sign == 1 ? fixint_max : fixint_min;

				// If 0 <= exponent < significandBits, right shift to get the result.
				// Otherwise, shift left.
				if (exponent < significandBits)
				return sign * (significand >> (significandBits - exponent));
				else
				return sign * ((int)significand << (exponent - significandBits));
				}

				int main() {
				// fesetround(FE_UPWARD);
				// fesetround(FE_TONEAREST);
				// fesetround(FE_TOWARDZERO);
				// fesetround(FE_UPWARD);
				float a = fromRep(0x44ff7334);
				// printf("intput a float:\n");
				// scanf("%f", &a);
				int b = foo(a);
				printf("conversion result = %d\n", b);
				}
				No newline at end of file

fixsfsi_single_source_readable.ll

This file was added.

				; ModuleID = 'fixsfsi_single_source.c'
				source_filename = "fixsfsi_single_source.c"
				target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; Function Attrs: mustprogress nofree norecurse nosync nounwind readnone willreturn uwtable
				define dso_local i32 @foo(float noundef %a) local_unnamed_addr #0 {
				entry:
				%aRep = bitcast float %a to i32
				%tobool.not = icmp sgt i32 %aRep, -1
				%sign = select i1 %tobool.not, i32 1, i32 -1
				%and = lshr i32 %aRep, 23
				%exponent_with_bias = and i32 %and, 255
				%aAbs = and i32 %aRep, 8388607
				%significand = or i32 %aAbs, 8388608
				%cmp = icmp ult i32 %exponent_with_bias, 127
				br i1 %cmp, label %cleanup, label %if.end

				if.end: ; preds = %entry
				%add1 = add nsw i32 %exponent_with_bias, -159
				%cmp3 = icmp ult i32 %add1, -32
				br i1 %cmp3, label %if.then5, label %if.end9

				if.then5: ; preds = %if.end
				%cond8 = select i1 %tobool.not, i32 2147483647, i32 -2147483648
				br label %cleanup

				if.end9: ; preds = %if.end
				%cmp10 = icmp ult i32 %exponent_with_bias, 150
				br i1 %cmp10, label %if.then12, label %if.else

				if.then12: ; preds = %if.end9
				%sub13 = sub nuw nsw i32 150, %exponent_with_bias
				%shr14 = lshr i32 %significand, %sub13
				%mul = mul nsw i32 %shr14, %sign
				br label %cleanup

				if.else: ; preds = %if.end9
				%sub15 = add nsw i32 %exponent_with_bias, -150
				%shl = shl nuw i32 %significand, %sub15
				%mul16 = mul nsw i32 %shl, %sign
				br label %cleanup

				cleanup: ; preds = %entry, %if.else, %if.then12, %if.then5
				%retval.0 = phi i32 [ %cond8, %if.then5 ], [ %mul, %if.then12 ], [ %mul16, %if.else ], [ 0, %entry ]
				ret i32 %retval.0
				}

				attributes #0 = { mustprogress nofree norecurse nosync nounwind readnone willreturn uwtable "frame-pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }

				!llvm.module.flags = !{!0, !1, !2, !3}
				!llvm.ident = !{!4}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{i32 8, !"PIC Level", i32 2}
				!2 = !{i32 7, !"PIE Level", i32 2}
				!3 = !{i32 7, !"uwtable", i32 2}
				!4 = !{!"clang version 16.0.0 (https://github.com/llvm/llvm-project.git 6c1dd0b7329f6ed2e402468c74355f09447052e3)"}

float.c

This file was added.

				#include <stdio.h>
				int main() {
				_BitInt(64) a = 123412312;
				double b = a;
				printf("%lf\n", b);
				}

floatdidf_single_source.c

This file was added.

				//===-- lib/floatsisf.c - integer -> single-precision conversion --- C --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements integer to single-precision conversion for the
				// compiler-rt library in the IEEE-754 default round-to-nearest, ties-to-even
				// mode.
				//
				//===----------------------------------------------------------------------===//

				#include <stdio.h>
				#include <fenv.h>

				#define significandBits 52
				#define typeWidth 64 //(sizeof(unsigned long long) * __CHAR_BIT__)
				#define exponentBits 11 // (typeWidth - significandBits - 1)

				#define exponentBias 1023 //(maxExponent >> 1)

				#define implicitBit 0x10000000000000 // (REP_C(1) << significandBits)

				#define signBit 0x8000000000000000 // (REP_C(1) << (significandBits + exponentBits))

				static __inline double fromRep(unsigned long long x) {
				const union {
				double f;
				long long i;
				} rep = {.i = x};
				return rep.f;
				}

				double foo(long long a) {
				const long long aWidth = sizeof a * __CHAR_BIT__;
				// Handle zero as a special case to protect clz
				if (a == 0)
				return fromRep(0);
				// All other cases begin by extracting the sign and absolute value of a
				unsigned long long sign = 0;
				if (a < 0) {
				sign = signBit;
				a = -a;
				}
				// Exponent of (fp_t)a is the width of abs(a).
				const long long exponent = (aWidth - 1) - __builtin_clzll(a);
				unsigned long long result;
				// Shift a into the significand field, rounding if it is a right-shift
				if (exponent <= significandBits) {
				const long long shift = significandBits - exponent;
				result = (unsigned long long)a << shift ^ implicitBit;
				} else {
				const long long shift = exponent - significandBits;
				result = (unsigned long long)a >> shift ^ implicitBit;
				unsigned long long round = (unsigned long long)a << (typeWidth - shift);
				if (round > signBit)
				result++;
				if (round == signBit)
				result += result & 1;
				}
				// Insert the exponent
				result += (unsigned long long)(exponent + exponentBias) << significandBits;
				// Insert the sign bit and return
				return fromRep(result | sign);
				}

				int main() {
				long long a;
				printf("input an integer:\n");
				scanf("%lld", &a);
				double b = foo(a);
				printf("conversion result = %f\n", b);
				}
				No newline at end of file

floatdidf_single_source_readable.ll

This file was added.

				define dso_local double @foo(i64 noundef %a) local_unnamed_addr #0 {
				entry:
				%cmp = icmp eq i64 %a, 0
				br i1 %cmp, label %cleanup, label %if.end

				if.end: ; preds = %entry
				%0 = and i64 %a, -9223372036854775808
				%1 = tail call i64 @llvm.abs.i64(i64 %a, i1 true)
				%2 = tail call i64 @llvm.ctlz.i64(i64 %1, i1 true), !range !5
				%sub4 = xor i64 %2, 63
				%cmp5 = icmp ult i64 %sub4, 53
				br i1 %cmp5, label %if.then7, label %if.else

				if.then7: ; preds = %if.end
				%sub8 = sub nuw nsw i64 52, %sub4
				%shl = shl i64 %1, %sub8
				%xor = xor i64 %shl, 4503599627370496
				br label %if.end22

				if.else: ; preds = %if.end
				%sub10 = sub nsw i64 11, %2
				%shr = lshr i64 %1, %sub10
				%xor11 = xor i64 %shr, 4503599627370496
				%sub12 = add nuw nsw i64 %2, 53
				%shl13 = shl i64 %1, %sub12
				%cmp14 = icmp ugt i64 %shl13, -9223372036854775808
				%inc = zext i1 %cmp14 to i64
				%spec.select43 = add nuw i64 %xor11, %inc
				%cmp18 = icmp eq i64 %shl13, -9223372036854775808
				%and = and i64 %spec.select43, 1
				%add = select i1 %cmp18, i64 %and, i64 0
				%result.1 = add nuw i64 %add, %spec.select43
				br label %if.end22

				if.end22: ; preds = %if.else, %if.then7
				%result.2 = phi i64 [ %xor, %if.then7 ], [ %result.1, %if.else ]
				%3 = shl nuw nsw i64 %2, 52
				%shl24 = sub nuw nsw i64 4890909195324358656, %3
				%add25 = add i64 %shl24, %result.2
				%or = or i64 %add25, %0
				%4 = bitcast i64 %or to double
				br label %cleanup

				cleanup: ; preds = %entry, %if.end22
				%retval.0 = phi double [ %4, %if.end22 ], [ 0.000000e+00, %entry ]
				ret double %retval.0
				}

llvm/include/llvm/CodeGen/MachinePassRegistry.def

  FUNCTION_PASS("lower-constant-intrinsics", LowerConstantIntrinsicsPass, ())
  FUNCTION_PASS("unreachableblockelim", UnreachableBlockElimPass, ())
  FUNCTION_PASS("consthoist", ConstantHoistingPass, ())
  FUNCTION_PASS("replace-with-veclib", ReplaceWithVeclib, ())
  FUNCTION_PASS("partially-inline-libcalls", PartiallyInlineLibCallsPass, ())
  FUNCTION_PASS("ee-instrument", EntryExitInstrumenterPass, (false))
  FUNCTION_PASS("post-inline-ee-instrument", EntryExitInstrumenterPass, (true))
  FUNCTION_PASS("expand-large-div-rem", ExpandLargeDivRemPass, ())
+ FUNCTION_PASS("expand-large-fp-convert", ExpandLargeFpConvertPass, ())
  FUNCTION_PASS("expand-reductions", ExpandReductionsPass, ())
  FUNCTION_PASS("expandvp", ExpandVectorPredicationPass, ())
  FUNCTION_PASS("lowerinvoke", LowerInvokePass, ())
  FUNCTION_PASS("scalarize-masked-mem-intrin", ScalarizeMaskedMemIntrinPass, ())
  FUNCTION_PASS("tlshoist", TLSVariableHoistPass, ())
  FUNCTION_PASS("verify", VerifierPass, ())
  #undef FUNCTION_PASS

llvm/include/llvm/CodeGen/Passes.h

  namespace llvm {
  /// This pass expands the vector predication intrinsics into unpredicated
  /// instructions with selects or just the explicit vector length into the
  /// predicate mask.
  FunctionPass *createExpandVectorPredicationPass();

  // Expands large div/rem instructions.
  FunctionPass *createExpandLargeDivRemPass();

+ // Expands large div/rem instructions.
+ FunctionPass *createExpandLargeFpConvertPass();
+
  // This pass expands memcmp() to load/stores.
  FunctionPass *createExpandMemCmpPass();

  /// Creates Break False Dependencies pass. \see BreakFalseDeps.cpp
  FunctionPass *createBreakFalseDeps();

  // This pass expands indirectbr instructions.
  FunctionPass *createIndirectBrExpandPass();

llvm/include/llvm/InitializePasses.h

  void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry&);
  void initializeEarlyIfConverterPass(PassRegistry&);
  void initializeEarlyIfPredicatorPass(PassRegistry &);
  void initializeEarlyMachineLICMPass(PassRegistry&);
  void initializeEarlyTailDuplicatePass(PassRegistry&);
  void initializeEdgeBundlesPass(PassRegistry&);
  void initializeEHContGuardCatchretPass(PassRegistry &);
  void initializeEliminateAvailableExternallyLegacyPassPass(PassRegistry&);
+ void initializeExpandLargeFpConvertLegacyPassPass(PassRegistry&);
  void initializeExpandLargeDivRemLegacyPassPass(PassRegistry&);
  void initializeExpandMemCmpPassPass(PassRegistry&);
  void initializeExpandPostRAPass(PassRegistry&);
  void initializeExpandReductionsPass(PassRegistry&);
  void initializeExpandVectorPredicationPass(PassRegistry &);
  void initializeMakeGuardsExplicitLegacyPassPass(PassRegistry&);
  void initializeExternalAAWrapperPassPass(PassRegistry&);
  void initializeFEntryInserterPass(PassRegistry&);

llvm/lib/CodeGen/CMakeLists.txt

  add_llvm_component_library(LLVMCodeGen
  DetectDeadLanes.cpp
  DFAPacketizer.cpp
  DwarfEHPrepare.cpp
  EarlyIfConversion.cpp
  EdgeBundles.cpp
  EHContGuardCatchret.cpp
  ExecutionDomainFix.cpp
  ExpandLargeDivRem.cpp
+ ExpandLargeFpConvert.cpp
  ExpandMemCmp.cpp
  ExpandPostRAPseudos.cpp
  ExpandReductions.cpp
  ExpandVectorPredication.cpp
  FaultMaps.cpp
  FEntryInserter.cpp
  FinalizeISel.cpp
  FixupStatepointCallerSaved.cpp

llvm/lib/CodeGen/CodeGen.cpp

  void llvm::initializeCodeGen(PassRegistry &Registry) {
  initializeDebugifyMachineModulePass(Registry);
  initializeDetectDeadLanesPass(Registry);
  initializeDwarfEHPrepareLegacyPassPass(Registry);
  initializeEarlyIfConverterPass(Registry);
  initializeEarlyIfPredicatorPass(Registry);
  initializeEarlyMachineLICMPass(Registry);
  initializeEarlyTailDuplicatePass(Registry);
  initializeExpandLargeDivRemLegacyPassPass(Registry);
+ initializeExpandLargeFpConvertLegacyPassPass(Registry);
  initializeExpandMemCmpPassPass(Registry);
  initializeExpandPostRAPass(Registry);
  initializeFEntryInserterPass(Registry);
  initializeFinalizeISelPass(Registry);
  initializeFinalizeMachineBundlesPass(Registry);
  initializeFixupStatepointCallerSavedPass(Registry);
  initializeFuncletLayoutPass(Registry);
  initializeGCMachineCodeAnalysisPass(Registry);

llvm/lib/CodeGen/ExpandLargeFpConvert.cpp

This file was added.

				//===--- ExpandLargeFpConvert.cpp - Expand large fp convert----------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//

				// This pass expands ‘fptoui .. to’, ‘fptosi .. to’, ‘uitofp .. to’,
				// ‘sitofp .. to’ instructions with a bitwidth above a threshold into a call to
				// auto-generated functions. This is useful for targets like x86_64 that cannot
				// lower fp convertions with more than 128 bits.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/SmallVector.h"
				#include "llvm/ADT/StringExtras.h"
				#include "llvm/Analysis/GlobalsModRef.h"
				#include "llvm/CodeGen/Passes.h"
				#include "llvm/CodeGen/TargetLowering.h"
				#include "llvm/CodeGen/TargetPassConfig.h"
				#include "llvm/CodeGen/TargetSubtargetInfo.h"
				#include "llvm/IR/IRBuilder.h"
				#include "llvm/IR/InstIterator.h"
				#include "llvm/IR/PassManager.h"
				#include "llvm/InitializePasses.h"
				#include "llvm/Pass.h"
				#include "llvm/Support/CommandLine.h"
				#include "llvm/Target/TargetMachine.h"
				// #include "llvm/Transforms/Utils/IntegerDivision.h"

				using namespace llvm;

				static cl::opt<unsigned>
				ExpandFpConvertBits("expand-fp-convert-bits", cl::Hidden,
				cl::init(llvm::IntegerType::MAX_INT_BITS),
				cl::desc("fp convert instructions on integers with "
				"more than <N> bits are expanded."));

				static bool isConstantPowerOfTwo(llvm::Value *V, bool SignedOp) {
				auto *C = dyn_cast<ConstantInt>(V);
				if (!C)
				return false;

				APInt Val = C->getValue();
				if (SignedOp && Val.isNegative())
				Val = -Val;
				return Val.isPowerOf2();
				}

				static bool isSigned(unsigned int Opcode) {
				return Opcode == Instruction::FPToSI \|\| Opcode == Instruction::SIToFP;
				}

				/// Generate code to convert a fp number to integer, replacing FPToS(U)I with
				/// the generated code. This currently generates code similarly to compiler-rt's
				/// implementations, but future work includes generating more specialized code
				/// when more information about the operands are known.
				///
				/// Replace fp to integer with generated code.
				static bool expandFPToI(Instruction *FPToI) {
				IRBuilder<> Builder(FPToI);
				auto* FloatVal = FPToI->getOperand(0);
				IntegerType *IntTy = cast<IntegerType>(FPToI->getType());

				unsigned BitWidth = FPToI->getType()->getIntegerBitWidth();
				unsigned FPMantissaWidth = FloatVal->getType()->getFPMantissaWidth() - 1;
				unsigned FloatWidth = pow(2, int(log2(FPMantissaWidth)) + 1);
				unsigned ExponentWidth = FloatWidth - FPMantissaWidth - 1;
				unsigned signBit = 1u << FloatWidth;
				unsigned implicitBit = 1u << FPMantissaWidth;
				unsigned significandMask = implicitBit - 1;

				BasicBlock *IBB = Builder.GetInsertBlock();
				Function *F = IBB->getParent();
				LuoYuanke: It seems we just truncate the value. Is it because it is rounded to zero?
				FreddyYe (author): Yes, this is default behavior that codes like `(int)3.14` outputs 3 no matter what current rounding mode is. LLVM langref also defines so for fptosi: https://llvm.org/docs/LangRef.html#id269

				BasicBlock *Entry = Builder.GetInsertBlock();
				Entry->setName(Twine(Entry->getName(), "_entry"));
				BasicBlock *End = IBB->splitBasicBlock(Builder.GetInsertPoint(),
				"cleanup");
				BasicBlock *IfEnd = BasicBlock::Create(Builder.getContext(),
				"if.end", F, End);
				BasicBlock *IfThen5 = BasicBlock::Create(Builder.getContext(),
				"if.then5", F, End);
				BasicBlock *IfEnd9 = BasicBlock::Create(Builder.getContext(),
				"if.end9", F, End);
				BasicBlock *IfThen12 = BasicBlock::Create(Builder.getContext(),
				"if.then12", F, End);
				BasicBlock *IfElse = BasicBlock::Create(Builder.getContext(),
				"if.else", F, End);
				mgehre-amd: The return value is unused at its call site - convert to void?

				Entry->getTerminator()->eraseFromParent();

				//entry:
				Builder.SetInsertPoint(Entry);
				Value *aRep0 = Builder.CreateBitCast(FloatVal, Builder.getIntNTy(FloatWidth), "aRep0");
				Value *aRep = Builder.CreateZExt(aRep0, FPToI->getType(), "aRep");
				Value *tobool_not = Builder.CreateICmpSGT(aRep, ConstantInt::getSigned(IntTy, -1), "tobool.not");
				Value *sign = Builder.CreateSelect(tobool_not, ConstantInt::getSigned(IntTy, 1), ConstantInt::getSigned(IntTy, -1), "sign");
				Value *andf = Builder.CreateLShr(aRep, Builder.getIntN(BitWidth, FPMantissaWidth), "and");
				Value *exponent_with_bias = Builder.CreateAnd(andf, Builder.getIntN(BitWidth, (1u << ExponentWidth) - 1), "exponent_with_bias");
				Value *aAbs = Builder.CreateAnd(aRep, Builder.getIntN(BitWidth, significandMask), "aAbs");
				LuoYuanke: Maybe more readable with `FloatVal->getType()->isHalfTy()`.
				FreddyYe (author): Good idea.
				Value *significand = Builder.CreateOr(aAbs, Builder.getIntN(BitWidth, implicitBit), "significand");
				Value *cmp = Builder.CreateICmpULT(exponent_with_bias, Builder.getIntN(BitWidth, (1u << (ExponentWidth - 1)) - 1), "cmp");
				Builder.CreateCondBr(cmp, End, IfEnd);

				LuoYuanke: Add comments "FPToSI"?
				//if.end:
				Builder.SetInsertPoint(IfEnd);
				Value *add1 = Builder.CreateAdd(exponent_with_bias, ConstantInt::getSigned(IntTy, -int64_t((1u << (ExponentWidth - 1)) + FloatWidth - 1)), "add1");
				Value *cmp3 = Builder.CreateICmpULT(add1, ConstantInt::getSigned(IntTy, -int64_t(FloatWidth)), "cmp3");
				Builder.CreateCondBr(cmp3, IfThen5, IfEnd9);

				//if.then5:
				Builder.SetInsertPoint(IfThen5);
				Value *cond8 = Builder.CreateSelect(tobool_not, Builder.getIntN(BitWidth, (int64_t(1u << (FloatWidth - 1)) - 1)), ConstantInt::getSigned(IntTy, -int64_t(1u << (FloatWidth - 1))), "cond8");
				Builder.CreateBr(End);

				//if.end9:
				LuoYuanke: PowerOf2Ceil(FPMantissaWidth)?
				Builder.SetInsertPoint(IfEnd9);
				Value *cmp10 = Builder.CreateICmpULT(exponent_with_bias, Builder.getIntN(BitWidth, (1u << (ExponentWidth - 1)) + FPMantissaWidth - 1), "cmp10");
				LuoYuanke: The first letter should be capital for variable name.
				Builder.CreateCondBr(cmp10, IfThen12, IfElse);

				LuoYuanke: Ditto.
				//if.then12:
				Builder.SetInsertPoint(IfThen12);
				Value *sub13 = Builder.CreateSub(Builder.getIntN(BitWidth, (1u << (ExponentWidth - 1)) + FPMantissaWidth - 1), exponent_with_bias, "sub13");
				Value *shr14 = Builder.CreateLShr(significand, sub13, "shr14");
				Value *mul = Builder.CreateMul(shr14, sign, "mul");
				Builder.CreateBr(End);

				//if.else:
				Builder.SetInsertPoint(IfElse);
				Value *sub15 = Builder.CreateAdd(exponent_with_bias, ConstantInt::getSigned(IntTy, -int64_t((1u << (ExponentWidth - 1)) + FPMantissaWidth - 1)), "sub15");
				Value *shl = Builder.CreateShl(significand, sub15, "shl");
				Value *mul16 = Builder.CreateMul(shl, sign, "mul16");
				Builder.CreateBr(End);

				//cleanup:
				Builder.SetInsertPoint(End, End->begin());
				PHINode *retval_0 = Builder.CreatePHI(FPToI->getType(), 4);

				retval_0->addIncoming(cond8, IfThen5);
				retval_0->addIncoming(mul, IfThen12);
				retval_0->addIncoming(mul16, IfElse);
				retval_0->addIncoming(Builder.getIntN(BitWidth, 0), Entry);

				LuoYuanke: FloatVal->getType()->isX86_FP80Ty()?
				FPToI->replaceAllUsesWith(retval_0);
				FPToI->dropAllReferences();
				FPToI->eraseFromParent();
				return true;
				}
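
The control flow that expandFPToI builds can be easier to follow as ordinary C++. The following is a minimal standalone sketch, assuming IEEE-754 binary32 input and a 64-bit result instead of the wide integers the pass targets. `fixFloatToI64` is a hypothetical name, and the code follows the general shape of compiler-rt's fix routines; it is not taken from the patch. It also illustrates the point made in review: fraction bits are simply shifted out, so the result truncates toward zero regardless of the rounding mode.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper (not part of the patch): float -> int64_t using only
// integer operations on the bit pattern.
int64_t fixFloatToI64(float Value) {
  constexpr unsigned MantissaBits = 23; // explicit fraction bits of binary32
  constexpr unsigned ExponentBits = 8;
  constexpr unsigned IntBits = 64;      // width of the destination integer
  constexpr uint32_t Bias = (1u << (ExponentBits - 1)) - 1; // 127
  constexpr uint32_t ImplicitBit = 1u << MantissaBits;

  uint32_t Rep;
  std::memcpy(&Rep, &Value, sizeof(Rep)); // same role as the bitcast in the IR

  const bool Negative = (Rep >> 31) != 0;
  const uint32_t BiasedExp = (Rep >> MantissaBits) & ((1u << ExponentBits) - 1);
  const uint64_t Significand = (Rep & (ImplicitBit - 1)) | ImplicitBit;

  // |Value| < 1, including zero and subnormals: the result is 0.
  if (BiasedExp < Bias)
    return 0;

  const unsigned Exp = BiasedExp - Bias;
  // Magnitude does not fit (this also catches infinity and NaN): saturate.
  if (Exp >= IntBits - 1)
    return Negative ? std::numeric_limits<int64_t>::min()
                    : std::numeric_limits<int64_t>::max();

  // Move the binary point to bit 0. A right shift discards fraction bits,
  // which is exactly truncation toward zero.
  const uint64_t Magnitude = Exp < MantissaBits
                                 ? Significand >> (MantissaBits - Exp)
                                 : Significand << (Exp - MantissaBits);
  return Negative ? -static_cast<int64_t>(Magnitude)
                  : static_cast<int64_t>(Magnitude);
}
```

The three outcomes (zero, saturated, shifted) correspond loosely to the `cleanup`, `if.then5`, and `if.then12`/`if.else` paths in the generated IR.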
				LuoYuanke: Do we assume the _BitInt size is larger than 128-bit?
				FreddyYe (author): Yes for x86 (lines 609-610): `if (IntTy->getIntegerBitWidth() <= MaxLegalFpConvertBitWidth) continue;`

				LuoYuanke: ToBoolNot?
				LuoYuanke: Maybe rename it as PosOrNeg?
				/// Generate code to convert an integer to a fp number, replacing S(U)IToFP with
				/// the generated code. This currently generates code similarly to compiler-rt's
				/// implementations, but future work includes generating more specialized code
				/// when more information about the operands is known.
				///
				/// Replace integer to fp with generated code.
				static bool expandIToFP(Instruction* IToFP) {
				LuoYuanke: ExponentWithBias.
				IRBuilder<> Builder(IToFP);
				auto* IntVal = IToFP->getOperand(0);
				IntegerType *IntTy = cast<IntegerType>(IntVal->getType());
				unsigned BitWidth = IntVal->getType()->getIntegerBitWidth();
				unsigned FPMantissaWidth = IToFP->getType()->getFPMantissaWidth() - 1;
				unsigned FloatWidth = pow(2, int(log2(FPMantissaWidth)) + 1);
				LuoYuanke: The expression `(1u << (ExponentWidth - 1)) - 1` is used several times. Maybe set `Bias = (1u << (ExponentWidth - 1)) - 1` to be more readable.

				BasicBlock *IBB = Builder.GetInsertBlock();
				Function *F = IBB->getParent();
				Function *CTLZ = Intrinsic::getDeclaration(F->getParent(), Intrinsic::ctlz,
				IntTy);
				Function *ABS = Intrinsic::getDeclaration(F->getParent(), Intrinsic::abs,
				IntTy);
				ConstantInt *True = Builder.getTrue();

				BasicBlock *Entry = Builder.GetInsertBlock();
				Entry->setName(Twine(Entry->getName(), "_entry"));
				BasicBlock *End = IBB->splitBasicBlock(Builder.GetInsertPoint(),
				"cleanup");
				BasicBlock *IfEnd = BasicBlock::Create(Builder.getContext(),
				"if.end", F, End);
				LuoYuanke: NegOne
				BasicBlock *IfThen7 = BasicBlock::Create(Builder.getContext(),
				"if.then7", F, End);
				LuoYuanke: NegInf
				BasicBlock *IfElse = BasicBlock::Create(Builder.getContext(),
				"if.else", F, End);
				BasicBlock *IfEnd22 = BasicBlock::Create(Builder.getContext(),
				LuoYuanke: PosInf
				"if.end22", F, End);

				Entry->getTerminator()->eraseFromParent();

				//entry:
				Builder.SetInsertPoint(Entry);
				Value *cmp = Builder.CreateICmpEQ(IntVal, ConstantInt::getSigned(IntTy, 0), "cmp");
				Builder.CreateCondBr(cmp, End, IfEnd);

				//if.end:
				Builder.SetInsertPoint(IfEnd);
				Value *a0 = Builder.CreateAnd(IntVal, ConstantInt::getSigned(IntTy, 1ull << (FloatWidth -1)), "a0");
				Value *a1 = Builder.CreateCall(ABS, {IntVal, True}, "a1");
				Value *a2 = Builder.CreateCall(CTLZ, {a1, True}, "a2");
				Value *sub4 = Builder.CreateXor(a2, Builder.getIntN(BitWidth, FloatWidth - 1), "sub4");
				Value *cmp5 = Builder.CreateICmpULT(sub4, Builder.getIntN(BitWidth, FPMantissaWidth + 1), "cmp5");
				Builder.CreateCondBr(cmp5, IfThen7, IfElse);

				//if.then7:
				Builder.SetInsertPoint(IfThen7);
				Value *sub8 = Builder.CreateSub(Builder.getIntN(BitWidth, FPMantissaWidth), sub4, "sub8");
				Value *shl = Builder.CreateShl(a1, sub8, "shl");
				Value *xorf = Builder.CreateXor(shl, Builder.getIntN(BitWidth, 1ull << FPMantissaWidth), "xor");
				Builder.CreateBr(IfEnd22);

				//if.else:
				Builder.SetInsertPoint(IfElse);
				Value *sub10 = Builder.CreateSub(Builder.getIntN(BitWidth, FloatWidth - FPMantissaWidth - 1), a2, "sub10");
				Value *shr = Builder.CreateLShr(a1, sub10, "shr");
				Value *xor11 = Builder.CreateXor(shr, Builder.getIntN(BitWidth, 1ull << FPMantissaWidth), "xor11");
				Value *sub12 = Builder.CreateAdd(a2, Builder.getIntN(BitWidth, FPMantissaWidth + 1), "sub12");
				Value *shl13 = Builder.CreateShl(a1, sub12, "shl13");
				Value *cmp14 = Builder.CreateICmpUGT(shl13, ConstantInt::getSigned(IntTy, 1ull<< (FloatWidth -1)), "cmp14");
				Value *inc = Builder.CreateZExt(cmp14, IntTy, "inc");
				LuoYuanke: Retval0
				Value *spec_select43 = Builder.CreateAdd(xor11, inc, "spec.select43");
				Value *cmp18 = Builder.CreateICmpEQ(shl13, ConstantInt::getSigned(IntTy, 1ull << (FloatWidth -1)), "cmp18");
				Value *andf = Builder.CreateAnd(spec_select43, Builder.getIntN(BitWidth, 1), "and");
				Value *add = Builder.CreateSelect(cmp18, andf, Builder.getIntN(BitWidth, 0), "add");
				Value *result_1 = Builder.CreateAdd(add, spec_select43, "result.1");
				Builder.CreateBr(IfEnd22);

				//if.end22:
				Builder.SetInsertPoint(IfEnd22);
				PHINode *result_2 = Builder.CreatePHI(IntTy, 2, "result.2");
				result_2->addIncoming(xorf, IfThen7);
				result_2->addIncoming(result_1, IfElse);
				Value *a3 = Builder.CreateShl(a2, Builder.getIntN(BitWidth, FPMantissaWidth), "a3");
				Value *shl24 = Builder.CreateSub(Builder.getIntN(BitWidth, 4890909195324358656), a3, "shl24");
				Value *add25 = Builder.CreateAdd(shl24, result_2, "add25");
				mgehre-amd: nit: word missing after "implicit". And can we have an assert for that assumption?
				FreddyYe (author): Yes, that is meaningful for debugging with `-mllvm expand-fp-convert-bits`. I'll add it. Generally, since integer widths <= i128 are always lowered to a libcall, the smallest integer type entering this pass is i129, whose width is larger than fp128.
				Value *orf = Builder.CreateOr(add25, a0, "or");
				Value *a4 = Builder.CreateBitCast(orf, IToFP->getType(), "a4");
				Builder.CreateBr(End);

				//cleanup:
				Builder.SetInsertPoint(End, End->begin());
				PHINode *retval_0 = Builder.CreatePHI(IToFP->getType(), 2, "retval.0");
				retval_0->addIncoming(a4, IfEnd22);
				retval_0->addIncoming(ConstantFP::getZero(IToFP->getType(), false), Entry);

				IToFP->replaceAllUsesWith(retval_0);
				IToFP->dropAllReferences();
				IToFP->eraseFromParent();
				return true;
				}
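
The reverse direction, as built by expandIToFP, normalizes the magnitude, rounds, and assembles the bit pattern. Below is a minimal standalone sketch of that idea, assuming an int64_t source and an IEEE-754 binary32 result with round-to-nearest-even. `floatI64ToF32` is a hypothetical name, and a plain loop stands in for the `llvm.ctlz` intrinsic the patch uses; the rounding logic here is written for clarity and is not the patch's exact instruction sequence.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of the patch): int64_t -> float using only
// integer operations, rounding to nearest with ties to even.
float floatI64ToF32(int64_t Value) {
  constexpr unsigned MantissaBits = 23; // explicit fraction bits of binary32
  constexpr uint32_t Bias = 127;

  if (Value == 0)
    return 0.0f;

  const uint32_t Sign = Value < 0 ? 0x80000000u : 0u;
  // Unsigned negation is well defined even for INT64_MIN.
  const uint64_t Mag = Value < 0 ? 0 - static_cast<uint64_t>(Value)
                                 : static_cast<uint64_t>(Value);

  // Index of the highest set bit; this is the unbiased exponent.
  unsigned Msb = 63;
  while ((Mag >> Msb) == 0)
    --Msb;

  uint32_t Significand; // 24 bits including the leading one (or 2^24 on carry)
  if (Msb <= MantissaBits) {
    // The value fits in the significand exactly: shift left, no rounding.
    Significand = static_cast<uint32_t>(Mag << (MantissaBits - Msb));
  } else {
    const unsigned Shift = Msb - MantissaBits;
    uint64_t Kept = Mag >> Shift;
    const uint64_t Dropped = Mag & ((1ull << Shift) - 1);
    const uint64_t Half = 1ull << (Shift - 1);
    // Round up when above the halfway point, or exactly halfway and odd.
    if (Dropped > Half || (Dropped == Half && (Kept & 1)))
      ++Kept;
    Significand = static_cast<uint32_t>(Kept);
  }

  // Adding (not OR-ing) the fraction lets a rounding carry bump the exponent.
  const uint32_t Rep = Sign + ((Msb + Bias) << MantissaBits) +
                       (Significand - (1u << MantissaBits));
  float Result;
  std::memcpy(&Result, &Rep, sizeof(Result));
  return Result;
}
```

Under the default rounding mode, the result should agree with `static_cast<float>(Value)`, which makes the sketch easy to check against the hardware conversion.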

				static bool runImpl(Function &F, const TargetLowering &TLI) {
				SmallVector<Instruction*, 4> Replace;
				bool Modified = false;

				unsigned MaxLegalFpConvertBitWidth = TLI.getMaxDivRemBitWidthSupported();
				if (ExpandFpConvertBits != llvm::IntegerType::MAX_INT_BITS)
				MaxLegalFpConvertBitWidth = ExpandFpConvertBits;

				if (MaxLegalFpConvertBitWidth >= llvm::IntegerType::MAX_INT_BITS)
				return false;

				for (auto &I : instructions(F)) {
				switch (I.getOpcode()) {
				case Instruction::FPToUI:
				case Instruction::FPToSI: {
				// TODO: This doesn't handle vectors.
				auto *IntTy = dyn_cast<IntegerType>(I.getType());
				if (IntTy->getIntegerBitWidth() <= MaxLegalFpConvertBitWidth)
				continue;

				Replace.push_back(&I);
				Modified = true;
				break;
				}
				case Instruction::UIToFP:
				case Instruction::SIToFP: {
				auto *IntTy = dyn_cast<IntegerType>(I.getOperand(0)->getType());
				if (IntTy->getIntegerBitWidth() <= MaxLegalFpConvertBitWidth)
				continue;

				Replace.push_back(&I);
				Modified = true;
				break;
				}
				default:
				break;
				}
				}

				if (Replace.empty())
				return false;

				while (!Replace.empty()) {
				Instruction *I = Replace.pop_back_val();
				if (I->getOpcode() == Instruction::FPToUI \|\|
				I->getOpcode() == Instruction::FPToSI) {
				expandFPToI(I);
				} else {
				expandIToFP(I);
				}
				}

				return Modified;
				}
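
The threshold handling in runImpl is compact, so here is a small standalone sketch of the decision it makes. All names are hypothetical, and `kMaxIntBits` is only a placeholder standing in for `IntegerType::MAX_INT_BITS`; its value is an assumption for illustration. The flag's default doubles as "not set", and a threshold at or above the maximum means nothing is expanded.

```cpp
#include <vector>

// Placeholder for IntegerType::MAX_INT_BITS; the value is an assumption.
constexpr unsigned kMaxIntBits = 1u << 23;

// The command-line value overrides the target's limit only when it differs
// from its default, mirroring the check on ExpandFpConvertBits.
unsigned effectiveThreshold(unsigned TargetLimit, unsigned FlagValue) {
  return FlagValue != kMaxIntBits ? FlagValue : TargetLimit;
}

// Returns the integer widths whose conversions would be queued for expansion.
std::vector<unsigned> widthsToExpand(const std::vector<unsigned> &Widths,
                                     unsigned TargetLimit,
                                     unsigned FlagValue) {
  std::vector<unsigned> Selected;
  const unsigned Threshold = effectiveThreshold(TargetLimit, FlagValue);
  if (Threshold >= kMaxIntBits)
    return Selected; // the target can lower everything; nothing to do
  for (unsigned Width : Widths)
    if (Width > Threshold) // widths at or below the threshold are left alone
      Selected.push_back(Width);
  return Selected;
}
```

With a target limit of 128 and the flag left at its default, only widths such as 129 and above are selected, which matches the author's remark that i129 is the smallest type reaching this pass on x86.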

				class ExpandLargeFpConvertLegacyPass : public FunctionPass {
				public:
				static char ID;

				ExpandLargeFpConvertLegacyPass() : FunctionPass(ID) {
				initializeExpandLargeFpConvertLegacyPassPass(*PassRegistry::getPassRegistry());
				}
				LuoYuanke: Add comments to indicate that fp80 is extended to fp128?

				bool runOnFunction(Function &F) override {
				mgehre-amd: return void
				auto *TM = &getAnalysis<TargetPassConfig>().getTM<TargetMachine>();
				auto *TLI = TM->getSubtargetImpl(F)->getTargetLowering();
				return runImpl(F, *TLI);
				LuoYuanke: IsSigned
				}

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.addRequired<TargetPassConfig>();
				AU.addPreserved<AAResultsWrapperPass>();
				AU.addPreserved<GlobalsAAWrapperPass>();
				}
				};
				LuoYuanke: Move to line 491 to avoid dead instruction when it is not fp80?

				char ExpandLargeFpConvertLegacyPass::ID = 0;
				LuoYuanke: Ditto
				INITIALIZE_PASS_BEGIN(ExpandLargeFpConvertLegacyPass, "expand-large-fp-convert",
				"Expand large fp convert", false, false)
				LuoYuanke: Ditto
				INITIALIZE_PASS_END(ExpandLargeFpConvertLegacyPass, "expand-large-fp-convert",
				"Expand large fp convert", false, false)

				FunctionPass *llvm::createExpandLargeFpConvertPass() {
				return new ExpandLargeFpConvertLegacyPass();
				}
				LuoYuanke: There seem to be many places with this coding style. Is it possible that it creates dead code? However, I think it will eventually be deleted in ISel.
				FreddyYe (author): There may be some dead code here that I don't know about. Though ISel will delete the dead nodes, I will try to delete such code. I removed two instances in the next version. You can comment if you find more. For `Shl` here, I think it's not dead code since it is always used by `AAddr0`.
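
Two of the review suggestions on this file, using `PowerOf2Ceil` for the float width and naming the exponent bias once, can be illustrated with a standalone sketch. The names below are hypothetical: `powerOf2Ceil` is a local stand-in for the LLVM helper of the same idea, and `layoutFor` takes the number of explicit fraction bits (the patch's `getFPMantissaWidth() - 1`). The sketch assumes storage widths that are powers of two, which is why it does not describe x86 fp80.

```cpp
#include <cstdint>

// Local stand-in for llvm::PowerOf2Ceil: smallest power of two >= X.
uint64_t powerOf2Ceil(uint64_t X) {
  uint64_t Power = 1;
  while (Power < X)
    Power <<= 1;
  return Power;
}

// Hypothetical aggregate holding the derived layout parameters.
struct FloatLayout {
  unsigned FloatWidth;    // total storage bits
  unsigned ExponentWidth; // exponent field bits
  unsigned Bias;          // exponent bias
};

// FractionBits is the explicit fraction width: 10 (half), 23 (float),
// 52 (double), 112 (fp128).
FloatLayout layoutFor(unsigned FractionBits) {
  const unsigned FloatWidth = static_cast<unsigned>(powerOf2Ceil(FractionBits));
  const unsigned ExponentWidth = FloatWidth - FractionBits - 1; // minus sign bit
  const unsigned Bias = (1u << (ExponentWidth - 1)) - 1;
  return {FloatWidth, ExponentWidth, Bias};
}
```

For the four formats listed, this gives the same `FloatWidth` as the patch's `pow(2, int(log2(FPMantissaWidth)) + 1)` while staying in integer arithmetic.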

llvm/lib/CodeGen/TargetPassConfig.cpp


	bool TargetPassConfig::addISelPasses() {			bool TargetPassConfig::addISelPasses() {
	if (TM->useEmulatedTLS())			if (TM->useEmulatedTLS())
	addPass(createLowerEmuTLSPass());			addPass(createLowerEmuTLSPass());

	addPass(createPreISelIntrinsicLoweringPass());			addPass(createPreISelIntrinsicLoweringPass());
	PM->add(createTargetTransformInfoWrapperPass(TM->getTargetIRAnalysis()));			PM->add(createTargetTransformInfoWrapperPass(TM->getTargetIRAnalysis()));
	addPass(createExpandLargeDivRemPass());			addPass(createExpandLargeDivRemPass());
				addPass(createExpandLargeFpConvertPass());
	addIRPasses();			addIRPasses();
	addCodeGenPrepare();			addCodeGenPrepare();
	addPassesToHandleExceptions();			addPassesToHandleExceptions();
	addISelPrepare();			addISelPrepare();

	return addCoreISelPasses();			return addCoreISelPasses();
	}			}


llvm/test/Transforms/ExpandLargeFpConvert/fptosi129.ll

This file was added.

				define i129 @foo(float %a) {
				%conv = fptosi float %a to i129
				ret i129 %conv
				}

llvm/test/Transforms/ExpandLargeFpConvert/si129tofp.ll

This file was added.

				define double @foo(i64 %a) {
				%conv = sitofp i64 %a to double
				ret double %conv
				}

llvm/tools/opt/opt.cpp

std::vector<StringRef> PassNameExact = {
"generic-to-nvvm", "expandmemcmp",		"generic-to-nvvm", "expandmemcmp",
"loop-reduce", "lower-amx-type",		"loop-reduce", "lower-amx-type",
"pre-amx-config", "lower-amx-intrinsics",		"pre-amx-config", "lower-amx-intrinsics",
"polyhedral-info", "print-polyhedral-info",		"polyhedral-info", "print-polyhedral-info",
"replace-with-veclib", "jmc-instrument",		"replace-with-veclib", "jmc-instrument",
"dot-regions", "dot-regions-only",		"dot-regions", "dot-regions-only",
"view-regions", "view-regions-only",		"view-regions", "view-regions-only",
"select-optimize", "expand-large-div-rem",		"select-optimize", "expand-large-div-rem",
"structurizecfg", "fix-irreducible"};		"structurizecfg", "fix-irreducible",
		"expand-large-fp-convert"};
for (const auto &P : PassNamePrefix)		for (const auto &P : PassNamePrefix)
if (Pass.startswith(P))		if (Pass.startswith(P))
return true;		return true;
for (const auto &P : PassNameContain)		for (const auto &P : PassNameContain)
if (Pass.contains(P))		if (Pass.contains(P))
return true;		return true;
return llvm::is_contained(PassNameExact, Pass);		return llvm::is_contained(PassNameExact, Pass);
}		}
int main(int argc, char **argv) {
initializeAnalysis(Registry);		initializeAnalysis(Registry);
initializeTransformUtils(Registry);		initializeTransformUtils(Registry);
initializeInstCombine(Registry);		initializeInstCombine(Registry);
initializeAggressiveInstCombine(Registry);		initializeAggressiveInstCombine(Registry);
initializeTarget(Registry);		initializeTarget(Registry);
// For codegen passes, only passes that do IR to IR transformation are		// For codegen passes, only passes that do IR to IR transformation are
// supported.		// supported.
initializeExpandLargeDivRemLegacyPassPass(Registry);		initializeExpandLargeDivRemLegacyPassPass(Registry);
		initializeExpandLargeFpConvertLegacyPassPass(Registry);
initializeExpandMemCmpPassPass(Registry);		initializeExpandMemCmpPassPass(Registry);
initializeScalarizeMaskedMemIntrinLegacyPassPass(Registry);		initializeScalarizeMaskedMemIntrinLegacyPassPass(Registry);
initializeSelectOptimizePass(Registry);		initializeSelectOptimizePass(Registry);
initializeCodeGenPreparePass(Registry);		initializeCodeGenPreparePass(Registry);
initializeAtomicExpandPass(Registry);		initializeAtomicExpandPass(Registry);
initializeRewriteSymbolsLegacyPassPass(Registry);		initializeRewriteSymbolsLegacyPassPass(Registry);
initializeWinEHPreparePass(Registry);		initializeWinEHPreparePass(Registry);
initializeDwarfEHPrepareLegacyPassPass(Registry);		initializeDwarfEHPrepareLegacyPassPass(Registry);


Unit Tests: Failed

Diff 472566
