This is an archive of the discontinued LLVM Phabricator instance.


Differential D120150

Constant folding of llvm.amdgcn.trig.preop
Needs Review · Public

Authored by arsenm on Feb 18 2022, 11:12 AM.


Details

Reviewers

b-sumner
sameerds
cdevadas
Ravi

Summary

If both parameters of the amdgcn.trig.preop intrinsic (the input and the segment select) are compile-time constants, this patch precomputes the result on the CPU and replaces all uses of the call with the computed constant.

All existing AMDGPU lit tests pass, including the negative cases in which a parameter to the intrinsic is variable.
Added a simple test case whose expected output exactly matches the output from the GPU.

Created a small HIP test application that runs the same compute logic (with the same constants for 2/pi) on the CPU and invokes the intrinsic in a GPU kernel.
Ran the test over the entire range of double-precision floating-point values. The CPU results match those of the intrinsic on a gfx10 AMD GPU.

Diff Detail

Event Timeline

Ravi created this revision. Feb 18 2022, 11:12 AM

Herald added subscribers: foad, kerbowa, hiraditya and 3 others. Feb 18 2022, 11:12 AM

Ravi requested review of this revision. Feb 18 2022, 11:12 AM

Herald added a project: Restricted Project. Feb 18 2022, 11:12 AM

Herald added subscribers: llvm-commits, wdng.

arsenm added inline comments. Feb 18 2022, 11:27 AM

llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
330	Relying on size of double, also this is technically undefined.
1001	uint32_t
1017–1018	Early exit and reduce indentation
1020	Indentation off
1035	Should stick with apfloat operations
1039	You can use scalbn for APFloat instead of relying on host ldexp
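
The code that the comment on line 330 refers to belongs to the first diff and is not reproduced in this archive, so the following is only a guess at the kind of construct being criticized: reading the bits of a double through a union or a pointer cast. A minimal standard C++ sketch of a well-defined alternative might look like this. The helper names `doubleBits` and `biasedExponent` are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helpers for illustration; neither name appears in the patch.
static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");
static_assert(std::numeric_limits<double>::is_iec559,
              "this sketch assumes IEEE-754 binary64");

// Copy the object representation instead of reading an inactive union member
// or dereferencing a reinterpret_cast pointer, both of which are undefined
// behaviour in C++.
inline std::uint64_t doubleBits(double Value) {
  std::uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return Bits;
}

// Extract the 11-bit biased exponent field of a binary64 value.
inline unsigned biasedExponent(double Value) {
  return static_cast<unsigned>((doubleBits(Value) >> 52) & 0x7ff);
}
```

The later diff shown on this page avoids the question altogether by calling bitcastToAPInt() on the APFloat, which does not depend on the host's double.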

Harbormaster completed remote builds in B150460: Diff 409976. Feb 18 2022, 11:35 AM

Ran the test over the entire range of double floating-point.

Just curious: how long does it take to test all 2^64 inputs?

In D120150#3332613, @foad wrote:

Ran the test over the entire range of double floating-point.

Just curious: how long does it take to test all 2^64 inputs?

The only dependence is on the exponent so 2K inputs is sufficient.

In D120150#3332613, @foad wrote:

Ran the test over the entire range of double floating-point.

Just curious: how long does it take to test all 2^64 inputs?

It took around 5 days on a Ryzen 9 5950, running a single CPU thread. Each input was checked with all 32 segments.

In D120150#3332632, @b-sumner wrote:

In D120150#3332613, @foad wrote:

Ran the test over the entire range of double floating-point.

Just curious: how long does it take to test all 2^64 inputs?

The only dependence is on the exponent so 2K inputs is sufficient.

Yes, right. That would have greatly reduced the run time. I missed that and didn't think in that direction.
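
The point about the exponent can be sketched as a small input generator: one representative double per value of the 11-bit exponent field gives 2048 inputs, and pairing each with all 32 segment selects gives 65536 cases. The HIP test application is not part of this review, so this is not its code; the function names are hypothetical, and keeping the sign and mantissa at zero is an assumption that follows from the statement that only the exponent matters.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical generator: one double per biased exponent, 0 through 2047.
std::vector<double> oneInputPerExponent() {
  std::vector<double> Inputs;
  Inputs.reserve(2048);
  for (std::uint64_t Exponent = 0; Exponent < 2048; ++Exponent) {
    // Sign and mantissa stay zero; only the exponent field varies.
    const std::uint64_t Bits = Exponent << 52;
    double Value;
    std::memcpy(&Value, &Bits, sizeof(Value));
    Inputs.push_back(Value);
  }
  return Inputs;
}

// Number of (input, segment) pairs when every input is checked with all 32
// segment selects, as in the run described in the thread.
std::size_t numTestCases(const std::vector<double> &Inputs) {
  return Inputs.size() * 32;
}
```

With a zero mantissa, exponent field 0 yields 0.0 and exponent field 2047 yields infinity, so NaN inputs are not covered by this generator and would need separate cases.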

Ravi added inline comments. Feb 18 2022, 11:38 PM

llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
1035	Yes, will do that.
1039	Yes, will try it out and check for any precision difference. It should be fine as long as the internal implementation of APFloat is accurate to within 0.5 ULP.
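
On the precision question: scaling a binary floating-point value by a power of two only changes the exponent, so it is exact whenever the result is still a normal number. Rounding can only occur when the result becomes subnormal or overflows. The sketch below illustrates this with the host `std::scalbn`; it says nothing about the APFloat implementation itself, and `scalesExactly` is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: returns true if scaling Value by 2^Scale loses no
// bits, checked by scaling back and comparing with the original value.
bool scalesExactly(double Value, int Scale) {
  const double Scaled = std::scalbn(Value, Scale);
  return std::scalbn(Scaled, -Scale) == Value;
}
```

For example, `scalesExactly(0x1.45f306dc9c882p+52, -53)` holds because the result is a normal number, while `scalesExactly(0x1.0000000000001p+0, -1074)` does not, because the scaled value is subnormal and its low mantissa bit is rounded away.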

arsenm added inline comments. Feb 21 2022, 6:51 AM

llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
1039	Actually we should be constant folding the ldexp intrinsic too. I thought I did that before, but in the code above I don't see it handling arbitrary constants

Fixed all the review comments.

Herald added a project: Restricted Project. May 11 2022, 11:40 AM

Herald added subscribers: kosarev, jsilvanus, hsmhsm.

Ravi marked 6 inline comments as done. May 11 2022, 11:42 AM

Ravi added inline comments.

llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
1039	Can be done in another patch

Harbormaster completed remote builds in B163953: Diff 428730. May 11 2022, 1:15 PM

arsenm added inline comments. May 11 2022, 4:18 PM

llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
1015	demorgan this
1026	extra parentheses
1033	extra parentheses
1045	extra parentheses
1046	extra parentheses

foad added inline comments. May 12 2022, 8:40 AM

llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
1012	Capitalizing the names like CSrc, CSeg, FSrc, NumBits etc is more common.
1020	Just use `->getValue`. And in general I think you should convert CSrc and CSeg to plain uint32_t / uint64_t as soon as possible. You have lots of uses of APInt below where it really is not necessary.

Reverse ping

@Ravi where are these tests you mentioned?

arsenm added inline comments. Nov 16 2022, 1:45 PM

llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
5655	I feel like the tests are lacking in sample values but coming up with ones to test every point in the table seems exhausting

Harbormaster completed remote builds in B198064: Diff 475913. Nov 16 2022, 2:51 PM

Add assert, which shows this table lookup is broken.

Also improve poison handling
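
Why the lookup can go out of range can be checked with a few lines of arithmetic. The TwoByPi initializer in the diff has 37 words, while the code reads words I, I + 1 and I + 2, where I is the shift divided by 32. The sketch below redoes that index computation with plain integers. The function names are hypothetical, and the segment range 0 to 31 is an assumption taken from the author's statement that all 32 segments were tested.

```cpp
#include <cassert>

// 37 words, counted from the TwoByPi initializer in the diff.
constexpr unsigned NumTableWords = 37;

// Same shift formula as the patch: exponents above 1077 add to the shift,
// and each segment advances by 53 bits.
constexpr unsigned shiftFor(unsigned BiasedExp, unsigned Segment) {
  return (BiasedExp > 1077 ? BiasedExp - 1077 : 0) + 53 * Segment;
}

// The patch reads TwoByPi[I], TwoByPi[I + 1] and TwoByPi[I + 2].
constexpr unsigned highestWordRead(unsigned BiasedExp, unsigned Segment) {
  return (shiftFor(BiasedExp, Segment) >> 5) + 2;
}

// Hypothetical bounds check covering all three reads.
constexpr bool lookupInBounds(unsigned BiasedExp, unsigned Segment) {
  return highestWordRead(BiasedExp, Segment) < NumTableWords;
}
```

Under these assumptions, even a zero input stays in bounds only up to segment 21, and an infinite input with segment 31 would read word 83. Note that the assert in the diff checks only I, not I + 2.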

Harbormaster completed remote builds in B198289: Diff 476233. Nov 17 2022, 2:01 PM

foad added inline comments. Nov 18 2022, 6:47 AM

llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
1045	This (and some of the calculation below) can just use `int`. No need for APInt.
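
To illustrate the suggestion, here is a standalone sketch of the same computation with plain integers in place of APInt. It is not the patch and not LLVM code: the function name `referenceTrigPreop` is hypothetical, the bounds checks are additions, and it uses the host `std::ldexp` only because APFloat is unavailable outside LLVM (the review asks for APFloat's scalbn in the compiler itself). The table is copied from the diff.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <iterator>
#include <optional>

// Bits of 2/pi, copied from the TwoByPi table in the diff.
static const std::uint32_t TwoByPiWords[] = {
    0xa2f9836e, 0x4e441529, 0xfc2757d1, 0xf534ddc0, 0xdb629599, 0x3c439041,
    0xfe5163ab, 0xdebbc561, 0xb7246e3a, 0x424dd2e0, 0x06492eea, 0x09d1921c,
    0xfe1deb1c, 0xb129a73e, 0xe88235f5, 0x2ebb4484, 0xe99c7026, 0xb45f7e41,
    0x3991d639, 0x835339f4, 0x9c845f8b, 0xbdf9283b, 0x1ff897ff, 0xde05980f,
    0xef2f118b, 0x5a0a6d1f, 0x6d367ecf, 0x27cb09b7, 0x4f463f66, 0x9e5fea2d,
    0x7527bac7, 0xebe5f17b, 0x3d0739f7, 0x8a5292ea, 0x6bfb5fb1, 0x1f8d5d08,
    0x56033046};

static std::uint64_t bitsOf(double Value) {
  std::uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return Bits;
}

// Hypothetical reference version of the fold; returns nullopt where this
// sketch declines to compute a value.
std::optional<double> referenceTrigPreop(double Src, std::uint32_t Segment) {
  // NaN inputs are handled separately in the patch.
  if (std::isnan(Src))
    return std::nullopt;
  // ASSUMPTION: only segment selects 0..31 are considered here.
  if (Segment > 31)
    return std::nullopt;

  const unsigned E = static_cast<unsigned>((bitsOf(Src) >> 52) & 0x7ff);
  const unsigned Shift = (E > 1077 ? E - 1077 : 0) + 53 * Segment;
  const unsigned Index = Shift >> 5;
  const unsigned BitShift = Shift & 0x1f;

  // ADDED CHECK: three consecutive words are read below.
  if (Index + 2 >= std::size(TwoByPiWords))
    return std::nullopt;

  std::uint64_t Hi = (static_cast<std::uint64_t>(TwoByPiWords[Index]) << 32) |
                     TwoByPiWords[Index + 1];
  const std::uint64_t Lo = static_cast<std::uint64_t>(TwoByPiWords[Index + 2])
                           << 32;
  if (BitShift > 0)
    Hi = (Hi << BitShift) | (Lo >> (64 - BitShift));

  Hi >>= 11; // Keep a 53-bit window, which converts to double exactly.
  int Scale = -53 - static_cast<int>(Shift);
  if (E >= 0x7b0)
    Scale += 128;
  return std::ldexp(static_cast<double>(Hi), Scale);
}
```

For the constant inputs used in the lit tests of this diff, the sketch reproduces the expected bit patterns: (0.0, 0) gives 0x3FE45F306DC9C882, (0.0, 1) gives 0x3C94A7F09D5F47D4, and (345.435, 5) gives 0x2F42371D2126E970.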

Revision Contents

Path	Size
llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp	88 lines
llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll	48 lines

Diff 476233

llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp

(321 lines not shown)
return modifyIntrinsicCall(
// Convert the bias		// Convert the bias
if (!OnlyDerivatives && ImageDimIntr->NumBiasArgs != 0) {		if (!OnlyDerivatives && ImageDimIntr->NumBiasArgs != 0) {
Value *Bias = II.getOperand(ImageDimIntr->BiasIndex);		Value *Bias = II.getOperand(ImageDimIntr->BiasIndex);
Args[ImageDimIntr->BiasIndex] = convertTo16Bit(*Bias, IC.Builder);		Args[ImageDimIntr->BiasIndex] = convertTo16Bit(*Bias, IC.Builder);
}		}
});		});
}		}

bool GCNTTIImpl::canSimplifyLegacyMulToMul(const Value Op0, const Value Op1,		bool GCNTTIImpl::canSimplifyLegacyMulToMul(const Value Op0, const Value Op1,
InstCombiner &IC) const {		InstCombiner &IC) const {
// The legacy behaviour is that multiplying +/-0.0 by anything, even NaN or		// The legacy behaviour is that multiplying +/-0.0 by anything, even NaN or
// infinity, gives +0.0. If we can prove we don't have one of the special		// infinity, gives +0.0. If we can prove we don't have one of the special
// cases then we can use a normal multiply instead.		// cases then we can use a normal multiply instead.
// TODO: Create and use isKnownFiniteNonZero instead of just matching		// TODO: Create and use isKnownFiniteNonZero instead of just matching
// constants here.		// constants here.
if (match(Op0, PatternMatch::m_FiniteNonZero()) ||		if (match(Op0, PatternMatch::m_FiniteNonZero()) ||
match(Op1, PatternMatch::m_FiniteNonZero())) {		match(Op1, PatternMatch::m_FiniteNonZero())) {
(652 lines not shown)
case Intrinsic::amdgcn_ldexp: {
// ldexp(x, 0) -> x		// ldexp(x, 0) -> x
// ldexp(x, undef) -> x		// ldexp(x, undef) -> x
if (isa<UndefValue>(Op1) || match(Op1, PatternMatch::m_ZeroInt())) {		if (isa<UndefValue>(Op1) || match(Op1, PatternMatch::m_ZeroInt())) {
return IC.replaceInstUsesWith(II, Op0);		return IC.replaceInstUsesWith(II, Op0);
}		}

break;		break;
}		}
		case Intrinsic::amdgcn_trig_preop: {
		// The intrinsic is declared with name mangling, but currently the
		// instruction only exists for f64
		if (!II.getType()->isDoubleTy())
		break;

		Value *Src = II.getArgOperand(0);
		Value *Segment = II.getArgOperand(1);
		if (isa<PoisonValue>(Src))
		return IC.replaceInstUsesWith(II, Src);

		if (isa<UndefValue>(Src)) {
		auto *QNaN = ConstantFP::get(
		II.getType(), APFloat::getQNaN(II.getType()->getFltSemantics()));
		return IC.replaceInstUsesWith(II, QNaN);
		}

		const ConstantFP *Csrc = dyn_cast<ConstantFP>(Src);
		if (!Csrc)
		break;

		if (II.isStrictFP())
		break;

		const APFloat &Fsrc = Csrc->getValueAPF();
		if (Fsrc.isNaN()) {
		// FIXME: We just need to make the nan quiet here, but that's unavailable
		// on APFloat, only IEEEfloat
		auto *Quieted = ConstantFP::get(
		II.getType(), scalbn(Fsrc, 0, APFloat::rmNearestTiesToEven));
		return IC.replaceInstUsesWith(II, Quieted);
		}

		const ConstantInt *Cseg = dyn_cast<ConstantInt>(Segment);
		if (!Cseg)
		break;

		static const uint32_t TwoByPi[] = {
		0xa2f9836e, 0x4e441529, 0xfc2757d1, 0xf534ddc0, 0xdb629599, 0x3c439041,
		0xfe5163ab, 0xdebbc561, 0xb7246e3a, 0x424dd2e0, 0x06492eea, 0x09d1921c,
		0xfe1deb1c, 0xb129a73e, 0xe88235f5, 0x2ebb4484, 0xe99c7026, 0xb45f7e41,
		0x3991d639, 0x835339f4, 0x9c845f8b, 0xbdf9283b, 0x1ff897ff, 0xde05980f,
		0xef2f118b, 0x5a0a6d1f, 0x6d367ecf, 0x27cb09b7, 0x4f463f66, 0x9e5fea2d,
		0x7527bac7, 0xebe5f17b, 0x3d0739f7, 0x8a5292ea, 0x6bfb5fb1, 0x1f8d5d08,
		0x56033046};

		const APInt &SegVal = Cseg->getValue();
		bool Ovflow;
		unsigned Numbits = 32;
		bool Signed = true;

		APInt EClamp(Numbits, 1077, Signed);
		APInt E = Fsrc.bitcastToAPInt().ashr(52);
		E &= 0x7ff;
		E = E.trunc(Numbits);
		APInt Shift =
		(E.sgt(EClamp) ? E.ssub_ov(EClamp, Ovflow) : APInt(Numbits, 0, Signed))
		.sadd_ov(APInt(Numbits, 53, Signed).smul_ov(SegVal, Ovflow),
		Ovflow);
		int32_t I = Shift.ashr(5).getSExtValue();

		assert(I >= 0 && static_cast<size_t>(I) < std::size(TwoByPi));

		APInt Bshift = Shift & 0x1f;
		Numbits = 64;
		Signed = false;
		APInt Thi = APInt(Numbits,
		(((uint64_t)TwoByPi[I] << 32) | (uint64_t)TwoByPi[I + 1]),
		Signed);
		APInt Tlo = APInt(Numbits, ((uint64_t)TwoByPi[I + 2] << 32), Signed);

		if (Bshift.sgt(0)) {
		Numbits = 32;
		Signed = true;
		Thi = Thi.shl(Bshift) |
		Tlo.lshr(APInt(Numbits, 64, Signed).ssub_ov(Bshift, Ovflow));
		}

		Thi = Thi.lshr(11);
		APFloat Res = APFloat(Thi.roundToDouble());
		int32_t Scale = -53 - Shift.getSExtValue();

		if (E.sge(0x7b0))
		Scale += 128;

		Res = scalbn(Res, Scale, RoundingMode::NearestTiesToEven);
		return IC.replaceInstUsesWith(II, ConstantFP::get(Src->getType(), Res));
		}
case Intrinsic::amdgcn_fmul_legacy: {		case Intrinsic::amdgcn_fmul_legacy: {
Value *Op0 = II.getArgOperand(0);		Value *Op0 = II.getArgOperand(0);
Value *Op1 = II.getArgOperand(1);		Value *Op1 = II.getArgOperand(1);

// The legacy behaviour is that multiplying +/-0.0 by anything, even NaN or		// The legacy behaviour is that multiplying +/-0.0 by anything, even NaN or
// infinity, gives +0.0.		// infinity, gives +0.0.
// TODO: Move to InstSimplify?		// TODO: Move to InstSimplify?
if (match(Op0, PatternMatch::m_AnyZeroFP()) ||		if (match(Op0, PatternMatch::m_AnyZeroFP()) ||
(226 lines not shown)

llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll

(5,472 lines not shown)
	; llvm.amdgcn.trig.preop			; llvm.amdgcn.trig.preop
	; --------------------------------------------------------------------			; --------------------------------------------------------------------

	declare double @llvm.amdgcn.trig.preop.f64(double, i32)			declare double @llvm.amdgcn.trig.preop.f64(double, i32)
	declare float @llvm.amdgcn.trig.preop.f32(float, i32)			declare float @llvm.amdgcn.trig.preop.f32(float, i32)

	define double @trig_preop_constfold_variable_undef_arg(i32 %arg) {			define double @trig_preop_constfold_variable_undef_arg(i32 %arg) {
	; CHECK-LABEL: @trig_preop_constfold_variable_undef_arg(			; CHECK-LABEL: @trig_preop_constfold_variable_undef_arg(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double undef, i32 [[ARG:%.*]])			; CHECK-NEXT: ret double 0x7FF8000000000000
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double undef, i32 %arg)			%val = call double @llvm.amdgcn.trig.preop.f64(double undef, i32 %arg)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold_variable_poison_arg(i32 %arg) {			define double @trig_preop_constfold_variable_poison_arg(i32 %arg) {
	; CHECK-LABEL: @trig_preop_constfold_variable_poison_arg(			; CHECK-LABEL: @trig_preop_constfold_variable_poison_arg(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double poison, i32 [[ARG:%.*]])			; CHECK-NEXT: ret double poison
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double poison, i32 %arg)			%val = call double @llvm.amdgcn.trig.preop.f64(double poison, i32 %arg)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold_variable_arg_undef(double %arg) {			define double @trig_preop_constfold_variable_arg_undef(double %arg) {
	; CHECK-LABEL: @trig_preop_constfold_variable_arg_undef(			; CHECK-LABEL: @trig_preop_constfold_variable_arg_undef(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double [[ARG:%.*]], i32 undef)			; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double [[ARG:%.*]], i32 undef)
	(18 lines not shown)
	; CHECK-NEXT: ret double [[VAL]]			; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 3.454350e+02, i32 %arg)			%val = call double @llvm.amdgcn.trig.preop.f64(double 3.454350e+02, i32 %arg)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_qnan(i32 %arg) {			define double @trig_preop_qnan(i32 %arg) {
	; CHECK-LABEL: @trig_preop_qnan(			; CHECK-LABEL: @trig_preop_qnan(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0x7FF8000000000000, i32 [[ARG:%.*]])			; CHECK-NEXT: ret double 0x7FF8000000000000
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0x7FF8000000000000, i32 %arg)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0x7FF8000000000000, i32 %arg)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_snan(i32 %arg) {			define double @trig_preop_snan(i32 %arg) {
	; CHECK-LABEL: @trig_preop_snan(			; CHECK-LABEL: @trig_preop_snan(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0x7FF0000000000001, i32 [[ARG:%.*]])			; CHECK-NEXT: ret double 0x7FF8000000000001
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0x7FF0000000000001, i32 %arg)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0x7FF0000000000001, i32 %arg)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_inf_0() {			define double @trig_preop_inf_0() {
	; CHECK-LABEL: @trig_preop_inf_0(			; CHECK-LABEL: @trig_preop_inf_0(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0x7FF0000000000000, i32 0)			; CHECK-NEXT: ret double 0xB43DD63F5F2F8BD
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0x7FF0000000000000, i32 0)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0x7FF0000000000000, i32 0)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_ninf_0() {			define double @trig_preop_ninf_0() {
	; CHECK-LABEL: @trig_preop_ninf_0(			; CHECK-LABEL: @trig_preop_ninf_0(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0xFFF0000000000000, i32 0)			; CHECK-NEXT: ret double 0xB43DD63F5F2F8BD
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0xFFF0000000000000, i32 0)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0xFFF0000000000000, i32 0)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_variable_fp(double %arg) {			define double @trig_preop_variable_fp(double %arg) {
	; CHECK-LABEL: @trig_preop_variable_fp(			; CHECK-LABEL: @trig_preop_variable_fp(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double [[ARG:%.*]], i32 5)			; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double [[ARG:%.*]], i32 5)
	(9 lines not shown)
	; CHECK-NEXT: ret double [[VAL]]			; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double %arg0, i32 %arg1)			%val = call double @llvm.amdgcn.trig.preop.f64(double %arg0, i32 %arg1)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold() {			define double @trig_preop_constfold() {
	; CHECK-LABEL: @trig_preop_constfold(			; CHECK-LABEL: @trig_preop_constfold(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 3.454350e+02, i32 5)			; CHECK-NEXT: ret double 0x2F42371D2126E970
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 3.454350e+02, i32 5)			%val = call double @llvm.amdgcn.trig.preop.f64(double 3.454350e+02, i32 5)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold_strictfp() {			define double @trig_preop_constfold_strictfp() {
	; CHECK-LABEL: @trig_preop_constfold_strictfp(			; CHECK-LABEL: @trig_preop_constfold_strictfp(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 3.454350e+02, i32 5) #[[ATTR16]]			; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 3.454350e+02, i32 5) #[[ATTR16]]
	; CHECK-NEXT: ret double [[VAL]]			; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 3.454350e+02, i32 5) strictfp			%val = call double @llvm.amdgcn.trig.preop.f64(double 3.454350e+02, i32 5) strictfp
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold_0.0__0() {			define double @trig_preop_constfold_0.0__0() {
	; CHECK-LABEL: @trig_preop_constfold_0.0__0(			; CHECK-LABEL: @trig_preop_constfold_0.0__0(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0.000000e+00, i32 0)			; CHECK-NEXT: ret double 0x3FE45F306DC9C882
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0.0, i32 0)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0.0, i32 0)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold_0.0__1() {			define double @trig_preop_constfold_0.0__1() {
	; CHECK-LABEL: @trig_preop_constfold_0.0__1(			; CHECK-LABEL: @trig_preop_constfold_0.0__1(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0.000000e+00, i32 1)			; CHECK-NEXT: ret double 0x3C94A7F09D5F47D4
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0.0, i32 1)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0.0, i32 1)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold_0.0__neg1() {			define double @trig_preop_constfold_0.0__neg1() {
	; CHECK-LABEL: @trig_preop_constfold_0.0__neg1(			; CHECK-LABEL: @trig_preop_constfold_0.0__neg1(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0.000000e+00, i32 -1)			; CHECK-NEXT: ret double 0.000000e+00
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0.0, i32 -1)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0.0, i32 -1)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold_0.0__9999999() {			define double @trig_preop_constfold_0.0__9999999() {
	; CHECK-LABEL: @trig_preop_constfold_0.0__9999999(			; CHECK-LABEL: @trig_preop_constfold_0.0__9999999(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0.000000e+00, i32 9999999)			; CHECK-NEXT: ret double 0.000000e+00
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0.0, i32 9999999)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0.0, i32 9999999)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold_0.0__neg999999() {			define double @trig_preop_constfold_0.0__neg999999() {
	; CHECK-LABEL: @trig_preop_constfold_0.0__neg999999(			; CHECK-LABEL: @trig_preop_constfold_0.0__neg999999(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0.000000e+00, i32 -999999)			; CHECK-NEXT: ret double 0x7FF0000000000000
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0.0, i32 -999999)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0.0, i32 -999999)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold_0x0020000000000000_0() {			define double @trig_preop_constfold_0x0020000000000000_0() {
	; CHECK-LABEL: @trig_preop_constfold_0x0020000000000000_0(			; CHECK-LABEL: @trig_preop_constfold_0x0020000000000000_0(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0x10000000000000, i32 0)			; CHECK-NEXT: ret double 0x3FE45F306DC9C882
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0x0010000000000000, i32 0)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0x0010000000000000, i32 0)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold_0x001fffffffffffff_0() {			define double @trig_preop_constfold_0x001fffffffffffff_0() {
	; CHECK-LABEL: @trig_preop_constfold_0x001fffffffffffff_0(			; CHECK-LABEL: @trig_preop_constfold_0x001fffffffffffff_0(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0xFFFFFFFFFFFFF, i32 0)			; CHECK-NEXT: ret double 0x3FE45F306DC9C882
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0x000fffffffffffff, i32 0)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0x000fffffffffffff, i32 0)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold_0x8020000000000000_0() {			define double @trig_preop_constfold_0x8020000000000000_0() {
	; CHECK-LABEL: @trig_preop_constfold_0x8020000000000000_0(			; CHECK-LABEL: @trig_preop_constfold_0x8020000000000000_0(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0x8020000000000000, i32 0)			; CHECK-NEXT: ret double 0x3FE45F306DC9C882
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0x8020000000000000, i32 0)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0x8020000000000000, i32 0)
	ret double %val			ret double %val
	}			}

	define double @trig_preop_constfold_0x801fffffffffffff_0() {			define double @trig_preop_constfold_0x801fffffffffffff_0() {
	; CHECK-LABEL: @trig_preop_constfold_0x801fffffffffffff_0(			; CHECK-LABEL: @trig_preop_constfold_0x801fffffffffffff_0(
	; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.trig.preop.f64(double 0x801FFFFFFFFFFFFF, i32 0)			; CHECK-NEXT: ret double 0x3FE45F306DC9C882
	; CHECK-NEXT: ret double [[VAL]]
	;			;
	%val = call double @llvm.amdgcn.trig.preop.f64(double 0x801fffffffffffff, i32 0)			%val = call double @llvm.amdgcn.trig.preop.f64(double 0x801fffffffffffff, i32 0)
	ret double %val			ret double %val
	}			}