This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Basic folds for fmed3 intrinsic
ClosedPublic

Authored by arsenm on Jan 31 2017, 10:36 AM.

Details

Reviewers

artem.tamazov
majnemer
tstellarAMD

Summary

Constant fold, canonicalize constants to RHS,
reduce to minnum/maxnum when inputs are nan/undef.

Event Timeline

arsenm created this revision. Jan 31 2017, 10:36 AM

Herald edited edge metadata. Jan 31 2017, 10:36 AM

Herald added subscribers: tpr, tony-tye, yaxunl and 2 others.

ping

arsenm added a reviewer: artem.tamazov. Feb 22 2017, 11:05 AM

Looks good, but IEEE-754 correctness needs to be verified. Is IEEE compliance required for llvm.amdgcn.fmed3.f32? If it is, we should look at the formal definition of fmed3 and check carefully.

For example, transformations like fmed3(0.0, 1.0, x) -> fmed3(x, 0.0, 1.0) may be non-IEEE-compliant w.r.t. sNANs when the shader is in IEEE mode. That depends on the expected semantics of fmed3, of course. For example, this is how the V_MED3_F semantics are defined for Gfx8:

If (isNan(Src0) || isNan(Src1) || isNan(Src2))
  Result = MIN3(Src0, Src1, Src2)
Else if (MAX3(Src0, Src1, Src2) == Src0)
  Result = MAX(Src1, Src2)
Else if (MAX3(Src0, Src1, Src2) == Src1)
  Result = MAX(Src0, Src2)
Else
  Result = MAX(Src0, Src1)
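
For readers who want to experiment with the pseudocode above, here is a minimal C++ sketch of it for quiet inputs. The name med3Model and the use of std::fmin/std::fmax as stand-ins for MIN/MAX are assumptions of this sketch, not part of the patch or of the ISA document. Standard C++ cannot portably express the sNaN and IEEE-mode distinction raised in this thread, so that part is not modeled.

```cpp
#include <cassert>
#include <cmath>

// Stand-ins for MIN3/MAX3. std::fmin/std::fmax return the non-NaN operand
// when exactly one operand is NaN (assumed here to match MIN/MAX).
inline float min3(float A, float B, float C) {
  return std::fmin(std::fmin(A, B), C);
}
inline float max3(float A, float B, float C) {
  return std::fmax(std::fmax(A, B), C);
}

// Hypothetical reference model of the quoted V_MED3_F pseudocode.
inline float med3Model(float Src0, float Src1, float Src2) {
  // Any NaN input: the result is the minimum of the three sources.
  if (std::isnan(Src0) || std::isnan(Src1) || std::isnan(Src2))
    return min3(Src0, Src1, Src2);

  // Otherwise drop the largest source and take the max of the other two.
  float Max = max3(Src0, Src1, Src2);
  if (Max == Src0)
    return std::fmax(Src1, Src2);
  if (Max == Src1)
    return std::fmax(Src0, Src2);
  return std::fmax(Src0, Src1);
}
```

Under these assumptions, med3Model(0.5f, -1.0f, 4.0f) evaluates to 0.5f for every operand order, and a single NaN operand reduces the result to the minimum of the other two, which is the behavior the minnum fold relies on.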

Clarification:

In D29338#687325, @artem.tamazov wrote:

...Is IEEE compliance required for llvm.amdgcn.fmed3.f32? If it is, we should look at the formal definition of fmed3 and check carefully.
For example, transformations like fmed3(0.0, 1.0, x) -> fmed3(x, 0.0, 1.0) may be non-IEEE-compliant w.r.t. sNANs when the shader is in IEEE mode.
That depends on the expected semantics of fmed3, of course. For example, this is how the V_MED3_F semantics are defined for Gfx8...

...and, in IEEE mode, V_MED3_F32(0.0, 1.0, sNAN) yields qNAN, while V_MED3_F32(sNAN, 0.0, 1.0) produces 1.0.

In D29338#687325, @artem.tamazov wrote:
Looks good, but IEEE-754 correctness needs to be verified. Is IEEE compliance required for llvm.amdgcn.fmed3.f32? If it is, we should look at the formal definition of fmed3 and check carefully.

For example, transformations like fmed3(0.0, 1.0, x) -> fmed3(x, 0.0, 1.0) may be non-IEEE-compliant w.r.t. sNANs when the shader is in IEEE mode. That depends on the expected semantics of fmed3, of course. For example, this is how the V_MED3_F semantics are defined for Gfx8:
If (isNan(Src0) || isNan(Src1) || isNan(Src2))
  Result = MIN3(Src0, Src1, Src2)
Else if (MAX3(Src0, Src1, Src2) == Src0)
  Result = MAX(Src1, Src2)
Else if (MAX3(Src0, Src1, Src2) == Src1)
  Result = MAX(Src0, Src2)
Else
  Result = MAX(Src0, Src1)

It should match the instruction behavior, but we don't necessarily care about it treating signaling NaNs correctly. LLVM in general isn't aware of them and breaks their behavior everywhere. The new constrained FP intrinsics should be aware of proper sNaN behavior, though. When we have a complete set of constrained FP intrinsics and people start using them, we could add a constrained version that would need to handle sNaNs properly. As far as this intrinsic is concerned, as long as it preserves general NaN behavior (ignoring quieting etc.), that should be OK.

In D29338#687608, @arsenm wrote:

As far as this intrinsic is concerned, as long as it preserves general NaN behavior (ignoring quieting etc.), that should be OK.

All right. I would just like to make the case clear. When the shader is in IEEE mode, this intrinsic does not preserve NaN behavior in some cases, e.g. if (x == sNAN), then (fmed3(0,1,x) == qNAN), but (fmed3(x,0,1) == 1). This is OK as long as we do not try to fold OpenCL constructs like the if/else chain below into the single intrinsic call that follows it:

if (fmax(fmax(a, b), c) == a)
  d = fmax(b, c);
else if (fmax(fmax(a, b), c) == b)
  d = fmax(a, c);
else
  d = fmax(a, b);

d = llvm.amdgcn.fmed3(a, b , c);

This revision is now accepted and ready to land. Feb 27 2017, 12:46 PM

r296409

Revision Contents

Path                                                Size
include/llvm/IR/IRBuilder.h                         16 lines
include/llvm/IR/PatternMatch.h                      13 lines
lib/IR/IRBuilder.cpp                                8 lines
lib/Transforms/InstCombine/InstCombineCalls.cpp     71 lines
test/Transforms/InstCombine/amdgcn-intrinsics.ll    182 lines

Diff 86458

include/llvm/IR/IRBuilder.h

public:
/// \brief Create a call to the experimental.gc.relocate intrinsics to		/// \brief Create a call to the experimental.gc.relocate intrinsics to
/// project the relocated value of one pointer from the statepoint.		/// project the relocated value of one pointer from the statepoint.
CallInst *CreateGCRelocate(Instruction *Statepoint,		CallInst *CreateGCRelocate(Instruction *Statepoint,
int BaseOffset,		int BaseOffset,
int DerivedOffset,		int DerivedOffset,
Type *ResultType,		Type *ResultType,
const Twine &Name = "");		const Twine &Name = "");

		/// Create a call to intrinsic \p ID with 2 operands which is mangled on the
		/// first type.
		CallInst *CreateBinaryIntrinsic(Intrinsic::ID ID,
		Value *LHS, Value *RHS,
		const Twine &Name = "");

		/// Create call to the minnum intrinsic.
		CallInst *CreateMinNum(Value *LHS, Value *RHS, const Twine &Name = "") {
		return CreateBinaryIntrinsic(Intrinsic::minnum, LHS, RHS, Name);
		}

		/// Create call to the maxnum intrinsic.
		CallInst *CreateMaxNum(Value *LHS, Value *RHS, const Twine &Name = "") {
		return CreateBinaryIntrinsic(Intrinsic::maxnum, LHS, RHS, Name);
		}

private:		private:
/// \brief Create a call to a masked intrinsic with given Id.		/// \brief Create a call to a masked intrinsic with given Id.
CallInst *CreateMaskedIntrinsic(Intrinsic::ID Id, ArrayRef<Value *> Ops,		CallInst *CreateMaskedIntrinsic(Intrinsic::ID Id, ArrayRef<Value *> Ops,
ArrayRef<Type *> OverloadedTypes,		ArrayRef<Type *> OverloadedTypes,
const Twine &Name = "");		const Twine &Name = "");

Value *getCastedInt8PtrValue(Value *Ptr);		Value *getCastedInt8PtrValue(Value *Ptr);
};		};

include/llvm/IR/PatternMatch.h


	/// \brief - Match an arbitrary zero/null constant. This includes			/// \brief - Match an arbitrary zero/null constant. This includes
	/// zero_initializer for vectors and ConstantPointerNull for pointers. For			/// zero_initializer for vectors and ConstantPointerNull for pointers. For
	/// floating point constants, this will match negative zero and positive zero			/// floating point constants, this will match negative zero and positive zero
	inline match_combine_or<match_zero, match_neg_zero> m_AnyZero() {			inline match_combine_or<match_zero, match_neg_zero> m_AnyZero() {
	return m_CombineOr(m_Zero(), m_NegZero());			return m_CombineOr(m_Zero(), m_NegZero());
	}			}

				struct match_nan {
				template <typename ITy> bool match(ITy *V) {
				if (const auto *C = dyn_cast<ConstantFP>(V)) {
				const APFloat &APF = C->getValueAPF();
				return APF.isNaN();
				}
				return false;
				}
				};

				/// Match an arbitrary NaN constant. This includes quiet and signalling nans.
				inline match_nan m_NaN() { return match_nan(); }

	struct apint_match {			struct apint_match {
	const APInt *&Res;			const APInt *&Res;
	apint_match(const APInt *&R) : Res(R) {}			apint_match(const APInt *&R) : Res(R) {}
	template <typename ITy> bool match(ITy *V) {			template <typename ITy> bool match(ITy *V) {
	if (auto *CI = dyn_cast<ConstantInt>(V)) {			if (auto *CI = dyn_cast<ConstantInt>(V)) {
	Res = &CI->getValue();			Res = &CI->getValue();
	return true;			return true;
	}			}

lib/IR/IRBuilder.cpp

	Value *FnGCRelocate =			Value *FnGCRelocate =
	Intrinsic::getDeclaration(M, Intrinsic::experimental_gc_relocate, Types);			Intrinsic::getDeclaration(M, Intrinsic::experimental_gc_relocate, Types);

	Value *Args[] = {Statepoint,			Value *Args[] = {Statepoint,
	getInt32(BaseOffset),			getInt32(BaseOffset),
	getInt32(DerivedOffset)};			getInt32(DerivedOffset)};
	return createCallHelper(FnGCRelocate, Args, this, Name);			return createCallHelper(FnGCRelocate, Args, this, Name);
	}			}

				CallInst *IRBuilderBase::CreateBinaryIntrinsic(Intrinsic::ID ID,
				Value *LHS, Value *RHS,
				const Twine &Name) {
				Module *M = BB->getParent()->getParent();
				Function *Fn = Intrinsic::getDeclaration(M, ID, { LHS->getType() });
				return createCallHelper(Fn, { LHS, RHS }, this, Name);
				}

lib/Transforms/InstCombine/InstCombineCalls.cpp

static bool simplifyX86MaskedStore(IntrinsicInst &II, InstCombiner &IC) {

IC.Builder->CreateMaskedStore(Vec, PtrCast, 1, BoolMask);		IC.Builder->CreateMaskedStore(Vec, PtrCast, 1, BoolMask);

// 'Replace uses' doesn't work for stores. Erase the original masked store.		// 'Replace uses' doesn't work for stores. Erase the original masked store.
IC.eraseInstFromFunction(II);		IC.eraseInstFromFunction(II);
return true;		return true;
}		}

		// Constant fold llvm.amdgcn.fmed3 intrinsics for standard inputs.
		//
		// A single NaN input is folded to minnum, so we rely on that folding for
		// handling NaNs.
		static APFloat fmed3AMDGCN(const APFloat &Src0, const APFloat &Src1,
		const APFloat &Src2) {
		APFloat Max3 = maxnum(maxnum(Src0, Src1), Src2);

		APFloat::cmpResult Cmp0 = Max3.compare(Src0);
		assert(Cmp0 != APFloat::cmpUnordered && "nans handled separately");
		if (Cmp0 == APFloat::cmpEqual)
		return maxnum(Src1, Src2);

		APFloat::cmpResult Cmp1 = Max3.compare(Src1);
		assert(Cmp1 != APFloat::cmpUnordered && "nans handled separately");
		if (Cmp1 == APFloat::cmpEqual)
		return maxnum(Src0, Src2);

		return maxnum(Src0, Src1);
		}

// Returns true iff the 2 intrinsics have the same operands, limiting the		// Returns true iff the 2 intrinsics have the same operands, limiting the
// comparison to the first NumOperands.		// comparison to the first NumOperands.
static bool haveSameOperands(const IntrinsicInst &I, const IntrinsicInst &E,		static bool haveSameOperands(const IntrinsicInst &I, const IntrinsicInst &E,
unsigned NumOperands) {		unsigned NumOperands) {
assert(I.getNumArgOperands() >= NumOperands && "Not enough operands");		assert(I.getNumArgOperands() >= NumOperands && "Not enough operands");
assert(E.getNumArgOperands() >= NumOperands && "Not enough operands");		assert(E.getNumArgOperands() >= NumOperands && "Not enough operands");
for (unsigned i = 0; i < NumOperands; i++)		for (unsigned i = 0; i < NumOperands; i++)
if (I.getArgOperand(i) != E.getArgOperand(i))		if (I.getArgOperand(i) != E.getArgOperand(i))
...
		bool Result =
((Mask & N_ZERO) && Val.isZero() && Val.isNegative()) ||		((Mask & N_ZERO) && Val.isZero() && Val.isNegative()) ||
((Mask & P_ZERO) && Val.isZero() && !Val.isNegative()) ||		((Mask & P_ZERO) && Val.isZero() && !Val.isNegative()) ||
((Mask & P_SUBNORMAL) && Val.isDenormal() && !Val.isNegative()) ||		((Mask & P_SUBNORMAL) && Val.isDenormal() && !Val.isNegative()) ||
((Mask & P_NORMAL) && Val.isNormal() && !Val.isNegative()) ||		((Mask & P_NORMAL) && Val.isNormal() && !Val.isNegative()) ||
((Mask & P_INFINITY) && Val.isInfinity() && !Val.isNegative());		((Mask & P_INFINITY) && Val.isInfinity() && !Val.isNegative());

return replaceInstUsesWith(*II, ConstantInt::get(II->getType(), Result));		return replaceInstUsesWith(*II, ConstantInt::get(II->getType(), Result));
}		}
		case Intrinsic::amdgcn_fmed3: {
		Value *Src0 = II->getArgOperand(0);
		Value *Src1 = II->getArgOperand(1);
		Value *Src2 = II->getArgOperand(2);

		bool Swap = false;
		// Canonicalize constants to RHS operands
		// fmed3(c0, x, c1) -> fmed3(x, c0, c1)
		if (isa<Constant>(Src0) && !isa<Constant>(Src1)) {
		std::swap(Src0, Src1);
		Swap = true;
		}

		if (isa<Constant>(Src1) && !isa<Constant>(Src2)) {
		std::swap(Src1, Src2);
		Swap = true;
		}

		if (isa<Constant>(Src0) && !isa<Constant>(Src1)) {
		std::swap(Src0, Src1);
		Swap = true;
		}

		if (Swap) {
		II->setArgOperand(0, Src0);
		II->setArgOperand(1, Src1);
		II->setArgOperand(2, Src2);
		return II;
		}

		if (match(Src2, m_NaN()) || isa<UndefValue>(Src2)) {
		CallInst *NewCall = Builder->CreateMinNum(Src0, Src1);
		NewCall->copyFastMathFlags(II);
		NewCall->takeName(II);
		return replaceInstUsesWith(*II, NewCall);
		}

		if (const ConstantFP *C0 = dyn_cast<ConstantFP>(Src0)) {
		if (const ConstantFP *C1 = dyn_cast<ConstantFP>(Src1)) {
		if (const ConstantFP *C2 = dyn_cast<ConstantFP>(Src2)) {
		APFloat Result = fmed3AMDGCN(C0->getValueAPF(), C1->getValueAPF(),
		C2->getValueAPF());
		return replaceInstUsesWith(*II,
		ConstantFP::get(Builder->getContext(), Result));
		}
		}
		}

		break;
		}
case Intrinsic::stackrestore: {		case Intrinsic::stackrestore: {
// If the save is right next to the restore, remove the restore. This can		// If the save is right next to the restore, remove the restore. This can
// happen when variable allocas are DCE'd.		// happen when variable allocas are DCE'd.
if (IntrinsicInst *SS = dyn_cast<IntrinsicInst>(II->getArgOperand(0))) {		if (IntrinsicInst *SS = dyn_cast<IntrinsicInst>(II->getArgOperand(0))) {
if (SS->getIntrinsicID() == Intrinsic::stacksave) {		if (SS->getIntrinsicID() == Intrinsic::stacksave) {
if (&*++SS->getIterator() == II)		if (&*++SS->getIterator() == II)
return eraseInstFromFunction(CI);		return eraseInstFromFunction(CI);
}		}

test/Transforms/InstCombine/amdgcn-intrinsics.ll

	; CHECK-NEXT: %cos = call float @llvm.amdgcn.cos.f32(float %x)			; CHECK-NEXT: %cos = call float @llvm.amdgcn.cos.f32(float %x)
	; CHECK-NEXT: ret float %cos			; CHECK-NEXT: ret float %cos
	define float @cos_fabs_fneg_f32(float %x) {			define float @cos_fabs_fneg_f32(float %x) {
	%x.fabs = call float @llvm.fabs.f32(float %x)			%x.fabs = call float @llvm.fabs.f32(float %x)
	%x.fabs.fneg = fsub float -0.0, %x.fabs			%x.fabs.fneg = fsub float -0.0, %x.fabs
	%cos = call float @llvm.amdgcn.cos.f32(float %x.fabs.fneg)			%cos = call float @llvm.amdgcn.cos.f32(float %x.fabs.fneg)
	ret float %cos			ret float %cos
	}			}

				; --------------------------------------------------------------------
				; llvm.amdgcn.fmed3
				; --------------------------------------------------------------------

				declare float @llvm.amdgcn.fmed3.f32(float, float, float) nounwind readnone

				; CHECK-LABEL: @fmed3_f32(
				; CHECK: %med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float %z)
				define float @fmed3_f32(float %x, float %y, float %z) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float %z)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_canonicalize_x_c0_c1_f32(
				; CHECK: call float @llvm.amdgcn.fmed3.f32(float %x, float 0.000000e+00, float 1.000000e+00)
				define float @fmed3_canonicalize_x_c0_c1_f32(float %x) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float 0.0, float 1.0)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_canonicalize_c0_x_c1_f32(
				; CHECK: call float @llvm.amdgcn.fmed3.f32(float %x, float 0.000000e+00, float 1.000000e+00)
				define float @fmed3_canonicalize_c0_x_c1_f32(float %x) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float 0.0, float %x, float 1.0)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_canonicalize_c0_c1_x_f32(
				; CHECK: call float @llvm.amdgcn.fmed3.f32(float %x, float 0.000000e+00, float 1.000000e+00)
				define float @fmed3_canonicalize_c0_c1_x_f32(float %x) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float 0.0, float 1.0, float %x)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_canonicalize_x_y_c_f32(
				; CHECK: call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float 1.000000e+00)
				define float @fmed3_canonicalize_x_y_c_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float 1.0)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_canonicalize_x_c_y_f32(
				; CHECK: %med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float 1.000000e+00)
				define float @fmed3_canonicalize_x_c_y_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float 1.0, float %y)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_canonicalize_c_x_y_f32(
				; CHECK: call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float 1.000000e+00)
				define float @fmed3_canonicalize_c_x_y_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float 1.0, float %x, float %y)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_undef_x_y_f32(
				; CHECK: call float @llvm.minnum.f32(float %x, float %y)
				define float @fmed3_undef_x_y_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float undef, float %x, float %y)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_fmf_undef_x_y_f32(
				; CHECK: call nnan float @llvm.minnum.f32(float %x, float %y)
				define float @fmed3_fmf_undef_x_y_f32(float %x, float %y) {
				%med3 = call nnan float @llvm.amdgcn.fmed3.f32(float undef, float %x, float %y)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_x_undef_y_f32(
				; CHECK: call float @llvm.minnum.f32(float %x, float %y)
				define float @fmed3_x_undef_y_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float undef, float %y)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_x_y_undef_f32(
				; CHECK: call float @llvm.minnum.f32(float %x, float %y)
				define float @fmed3_x_y_undef_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float undef)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_qnan0_x_y_f32(
				; CHECK: call float @llvm.minnum.f32(float %x, float %y)
				define float @fmed3_qnan0_x_y_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float 0x7FF8000000000000, float %x, float %y)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_x_qnan0_y_f32(
				; CHECK: call float @llvm.minnum.f32(float %x, float %y)
				define float @fmed3_x_qnan0_y_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float 0x7FF8000000000000, float %y)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_x_y_qnan0_f32(
				; CHECK: call float @llvm.minnum.f32(float %x, float %y)
				define float @fmed3_x_y_qnan0_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float 0x7FF8000000000000)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_qnan1_x_y_f32(
				; CHECK: call float @llvm.minnum.f32(float %x, float %y)
				define float @fmed3_qnan1_x_y_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float 0x7FF8000100000000, float %x, float %y)
				ret float %med3
				}

				; This can return any of the qnans.
				; CHECK-LABEL: @fmed3_qnan0_qnan1_qnan2_f32(
				; CHECK: ret float 0x7FF8002000000000
				define float @fmed3_qnan0_qnan1_qnan2_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float 0x7FF8000100000000, float 0x7FF8002000000000, float 0x7FF8030000000000)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_constant_src0_0_f32(
				; CHECK: ret float 5.000000e-01
				define float @fmed3_constant_src0_0_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float 0.5, float -1.0, float 4.0)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_constant_src0_1_f32(
				; CHECK: ret float 5.000000e-01
				define float @fmed3_constant_src0_1_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float 0.5, float 4.0, float -1.0)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_constant_src1_0_f32(
				; CHECK: ret float 5.000000e-01
				define float @fmed3_constant_src1_0_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float -1.0, float 0.5, float 4.0)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_constant_src1_1_f32(
				; CHECK: ret float 5.000000e-01
				define float @fmed3_constant_src1_1_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float 4.0, float 0.5, float -1.0)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_constant_src2_0_f32(
				; CHECK: ret float 5.000000e-01
				define float @fmed3_constant_src2_0_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float -1.0, float 4.0, float 0.5)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_constant_src2_1_f32(
				; CHECK: ret float 5.000000e-01
				define float @fmed3_constant_src2_1_f32(float %x, float %y) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float 4.0, float -1.0, float 0.5)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_x_qnan0_qnan1_f32(
				; CHECK: ret float %x
				define float @fmed3_x_qnan0_qnan1_f32(float %x) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float %x, float 0x7FF8001000000000, float 0x7FF8002000000000)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_qnan0_x_qnan1_f32(
				; CHECK: ret float %x
				define float @fmed3_qnan0_x_qnan1_f32(float %x) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float 0x7FF8001000000000, float %x, float 0x7FF8002000000000)
				ret float %med3
				}

				; CHECK-LABEL: @fmed3_qnan0_qnan1_x_f32(
				; CHECK: ret float %x
				define float @fmed3_qnan0_qnan1_x_f32(float %x) {
				%med3 = call float @llvm.amdgcn.fmed3.f32(float 0x7FF8001000000000, float 0x7FF8002000000000, float %x)
				ret float %med3
				}
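
The canonicalization checked by the fmed3_canonicalize_* tests can be sketched outside LLVM as follows. Operand and canonicalizeConstantsToRHS are hypothetical names used only for illustration. The three conditional swaps mirror the ones in the InstCombine change and act like a three-element bubble sort that moves constants to the right while keeping the relative order of the other operands.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for an llvm::Value operand of the intrinsic.
struct Operand {
  std::string Name;
  bool IsConstant;
};

// Moves constant operands to the right-hand positions.
// Returns true if any operand changed position.
inline bool canonicalizeConstantsToRHS(std::array<Operand, 3> &Ops) {
  bool Swapped = false;
  auto SwapIfNeeded = [&](int I, int J) {
    // Swap only when a constant sits directly left of a non-constant.
    if (Ops[I].IsConstant && !Ops[J].IsConstant) {
      std::swap(Ops[I], Ops[J]);
      Swapped = true;
    }
  };
  // Same compare-and-swap sequence as the patch: (0,1), (1,2), (0,1).
  SwapIfNeeded(0, 1);
  SwapIfNeeded(1, 2);
  SwapIfNeeded(0, 1);
  return Swapped;
}
```

For example, the operand order (c0, c1, x) becomes (x, c0, c1), and an already canonical (x, y, c) is left untouched, so the function reports no change and the combine can move on to the NaN/undef and constant folds.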