This is an archive of the discontinued LLVM Phabricator instance.

[clang/X86] Do not perform FMA by default
Needs Revision · Public

Authored by long5hot on Aug 23 2023, 9:07 AM.

Details

Reviewers

craig.topper
arsenm
andrew.w.kaylor

Summary

Clang constant-folds FMA operations under -ffp-contract=on, but there is a catch: some Intel targets do not support FMA instructions.

In that case, bail out of the constant folding.

Event Timeline

long5hot created this revision. · Aug 23 2023, 9:07 AM

Herald added a project: Restricted Project. · Aug 23 2023, 9:07 AM

Herald added a subscriber: pengfei.

long5hot requested review of this revision. · Aug 23 2023, 9:07 AM

Harbormaster completed remote builds in B254370: Diff 552747. · Aug 23 2023, 10:40 AM

Split the test case fexcess-precision.c into one file for the default subtarget and one for AVX512-FP16, because the update script update_cc_test_checks.py was removing some checks when RUN lines shared the same -check-prefixes.

long5hot updated this revision to Diff 553036. · Aug 24 2023, 1:36 AM

long5hot mentioned this in D156344: Disable call to fma for soft-float. · Aug 24 2023, 1:54 AM

Harbormaster completed remote builds in B254571: Diff 553036. · Aug 24 2023, 2:19 AM

This is a frontend solution to a backend problem. fmuladd does not guarantee fusion one way or the other, and its emission is all you have changed. Clang should not consider the target when deciding what to emit. -ffp-contract=on emits fmuladd, and -ffp-contract=fast emits fmul contract and fadd contract.

The backend is responsible for deciding whether it’s profitable to fuse, and that decision is further refined based on the FP type.

The description also mentions constant folding, which this patch doesn’t change. fmuladd will constant fold as FMA, and this is not an issue. That is permitted by the semantics; the target’s preferred codegen doesn’t change that.
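
To make the point about permitted results concrete, here is a minimal C++ sketch. It is not code from the patch: the helper names are hypothetical, and it assumes IEEE-754 double arithmetic in the default round-to-nearest mode. The unfused path rounds the product before the add, the fused path rounds once, and under the semantics described above either value is an acceptable result for a contractable a * b + c.

```cpp
#include <cassert>
#include <cmath>

// Unfused evaluation: the product is rounded to double before the add.
// The volatile store keeps the compiler from contracting the two operations.
double unfused_mul_add(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused evaluation: std::fma rounds the exact value of a * b + c once.
double fused_mul_add(double a, double b, double c) {
  return std::fma(a, b, c);
}

// Hypothetical checker: a contractable a * b + c may yield either value.
// NaN results are not handled in this sketch.
bool is_permitted_result(double a, double b, double c, double observed) {
  return observed == unfused_mul_add(a, b, c) ||
         observed == fused_mul_add(a, b, c);
}

// Inputs chosen so that the exact product, 1 - 2^-54, is not representable.
void check_example() {
  const double a = 1.0 + std::ldexp(1.0, -27);
  const double b = 1.0 - std::ldexp(1.0, -27);
  const double c = -1.0;
  // The rounded product is 1.0, so the separate add yields zero.
  assert(unfused_mul_add(a, b, c) == 0.0);
  // The single rounding keeps the residual -2^-54.
  assert(fused_mul_add(a, b, c) == -std::ldexp(1.0, -54));
  // Both outcomes are acceptable for a contractable expression.
  assert(is_permitted_result(a, b, c, 0.0));
  assert(is_permitted_result(a, b, c, -std::ldexp(1.0, -54)));
}
```

Calling check_example() from a test driver exercises both paths. The two results differ for these inputs, which is the kind of difference between folded and run-time values that the discussion is about.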

This revision now requires changes to proceed. · Aug 26 2023, 4:30 AM

Herald added a subscriber: wdng. · Aug 26 2023, 4:30 AM

In D158632#4619452, @arsenm wrote:

This is a frontend solution to a backend problem. fmuladd does not guarantee fusion one way or the other, and its emission is all you have changed. Clang should not consider the target when deciding what to emit. -ffp-contract=on emits fmuladd, and -ffp-contract=fast emits fmul contract and fadd contract.

The backend is responsible for deciding whether it’s profitable to fuse, and that decision is further refined based on the FP type.

What I meant to say is that the Clang frontend emits the fmuladd intrinsic when -ffp-contract=on is in effect. During optimization, that operation may then be evaluated as a fused multiply-add by constant folding.
Constant folding does not check whether the target supports FMA instructions (e.g. nehalem).
If the target doesn't support FMA instructions and the backend emits a separate mul and add, the folded result and the run-time result can differ, which shows up as a loss of precision.

PR: #63190

This was also mentioned by @andrew.w.kaylor in #54927

long5hot added a reviewer: andrew.w.kaylor. · Aug 27 2023, 3:34 AM

ping!

In D158632#4640957, @long5hot wrote:

ping!

Not a bug. This is operationally defined. The semantics of the operation should not depend on target preference. This patch also doesn't account for FMA availability differing by type.

In D158632#4620032, @long5hot wrote:

In D158632#4619452, @arsenm wrote:

This is a frontend solution to a backend problem. fmuladd does not guarantee fusion one way or the other, and its emission is all you have changed. Clang should not consider the target when deciding what to emit. -ffp-contract=on emits fmuladd, and -ffp-contract=fast emits fmul contract and fadd contract.

The backend is responsible for deciding whether it’s profitable to fuse, and that decision is further refined based on the FP type.

What I meant to say is that the Clang frontend emits the fmuladd intrinsic when -ffp-contract=on is in effect. During optimization, that operation may then be evaluated as a fused multiply-add by constant folding.
Constant folding does not check whether the target supports FMA instructions (e.g. nehalem).
If the target doesn't support FMA instructions and the backend emits a separate mul and add, the folded result and the run-time result can differ, which shows up as a loss of precision.

PR: #63190

This was also mentioned by @andrew.w.kaylor in #54927

As I just mentioned in https://github.com/llvm/llvm-project/issues/55230, @arsenm has convinced me that I was taking too hard a stance on this. The use of -ffp-contract=on explicitly permits the value change introduced by fused operations, so the fact that constant folding produces a different result than run-time evaluation would have is consistent with the rules under which the intrinsic was introduced. I'm not entirely convinced that the current behavior is the best we can do, but it is correct given the rules we are following. Users who want tighter numeric consistency should either use -ffp-contract=off or make an explicit call to the fma function (which will be evaluated as if fused regardless of target hardware support).

I'm hoping to put together a detailed document on numeric consistency goals and rules for the LLVM optimizer, but I'm not sure when I'll have time to do so. I started a discussion at https://discourse.llvm.org/t/fp-constant-folding-of-floating-point-operations/73138, where I raised the FMA constant folding issue and @arsenm explained his reasoning for why either method of evaluation is acceptable.

Revision Contents

Path                                              Size

clang/include/clang/Basic/TargetInfo.h            3 lines
clang/lib/Basic/Targets/X86.h                     6 lines
clang/lib/CodeGen/CGExprScalar.cpp                7 lines
clang/test/CodeGen/X86/fexcess-precision-avx.c    313 lines
clang/test/CodeGen/X86/fexcess-precision.c        329 lines
clang/test/CodeGen/X86/fma-intrinsics.c           38 lines

Diff 553036

clang/include/clang/Basic/TargetInfo.h

[623 lines not shown; context: public:]
   /// FIXME: _BitInt is a required type in C23, so there's not much utility in
   /// asking whether the target supported it or not; I think this should be
   /// removed once backends have been alerted to the type and have had the
   /// chance to do implementation work if needed.
   virtual bool hasBitIntType() const {
     return false;
   }

+  /// Determine whether target has the FMA instructions support or not
+  virtual bool hasFMA() const { return false; }
+
   // Different targets may support a different maximum width for the _BitInt
   // type, depending on what operations are supported.
   virtual size_t getMaxBitIntWidth() const {
     // Consider -fexperimental-max-bitint-width= first.
     if (MaxBitIntWidth)
       return std::min<size_t>(*MaxBitIntWidth, llvm::IntegerType::MAX_INT_BITS);

     // FIXME: this value should be llvm::IntegerType::MAX_INT_BITS, which is
[1,118 lines not shown]

clang/lib/Basic/Targets/X86.h

[413 lines not shown; context: if (TargetAddrSpace == ptr64)]
       return 64;
     return PointerWidth;
   }

   uint64_t getPointerAlignV(LangAS AddrSpace) const override {
     return getPointerWidthV(AddrSpace);
   }

+  bool hasFMA() const override {
+    if (HasFMA)
+      return true;
+    else
+      return false;
+  }
 };

 // X86-32 generic target
 class LLVM_LIBRARY_VISIBILITY X86_32TargetInfo : public X86TargetInfo {
 public:
   X86_32TargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts)
       : X86TargetInfo(Triple, Opts) {
     DoubleAlign = LongLongAlign = 32;
[566 lines not shown]

clang/lib/CodeGen/CGExprScalar.cpp

[3,764 lines not shown; context: static Value* buildFMulAdd(llvm::Instruction *MulOp, Value *Addend,]
   MulOp->eraseFromParent();

   return FMulAdd;
 }

 // Check whether it would be legal to emit an fmuladd intrinsic call to
 // represent op and if so, build the fmuladd.
 //
-// Checks that (a) the operation is fusable, and (b) -ffp-contract=on.
+// Checks that (a) the operation is fusable, and (b) -ffp-contract=on and
+// does target supports fma instructions.
 // Does NOT check the type of the operation - it's assumed that this function
 // will be called from contexts where it's known that the type is contractable.
 static Value* tryEmitFMulAdd(const BinOpInfo &op,
                          const CodeGenFunction &CGF, CGBuilderTy &Builder,
                          bool isSub=false) {

   assert((op.Opcode == BO_Add || op.Opcode == BO_AddAssign ||
           op.Opcode == BO_Sub || op.Opcode == BO_SubAssign) &&
          "Only fadd/fsub can be the root of an fmuladd.");

   // Check whether this op is marked as fusable.
-  if (!op.FPFeatures.allowFPContractWithinStatement())
+  const bool allowFPContract = op.FPFeatures.allowFPContractWithinStatement();
+  const bool allowFMA = CGF.getContext().getTargetInfo().hasFMA();
+  if (!allowFPContract && !allowFMA)
     return nullptr;

   Value *LHS = op.LHS;
   Value *RHS = op.RHS;

   // Peek through fneg to look for fmul. Make sure fneg has no users, and that
   // it is the only use of its operand.
   bool NegLHS = false;
[1,628 lines not shown]
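
The early-return condition in this hunk is easier to assess as a truth table. The sketch below is a hypothetical model, not part of the patch: it reuses the names allowFPContract and allowFMA from the diff, while the function names are invented for illustration. As written, the guard returns early only when both flags are false. Requiring both contraction and FMA support, which is one possible reading of the updated comment, would need || instead.

```cpp
#include <cassert>

// Model of the guard as it appears in the diff: true means the function
// returns nullptr before trying to build an fmuladd.
constexpr bool bails_as_written(bool allowFPContract, bool allowFMA) {
  return !allowFPContract && !allowFMA;
}

// Alternative guard (an assumption about the intent, not the patch's code):
// bail unless contraction is allowed AND the target reports FMA support.
constexpr bool bails_if_either_missing(bool allowFPContract, bool allowFMA) {
  return !allowFPContract || !allowFMA;
}

void check_guard_models() {
  // Both flags false: both guards bail.
  assert(bails_as_written(false, false));
  assert(bails_if_either_missing(false, false));
  // Contraction off, FMA available: the guard as written does not bail.
  assert(!bails_as_written(false, true));
  assert(bails_if_either_missing(false, true));
  // Contraction on, no FMA: the guard as written does not bail either.
  assert(!bails_as_written(true, false));
  assert(bails_if_either_missing(true, false));
  // Both flags true: neither guard bails.
  assert(!bails_as_written(true, true));
  assert(!bails_if_either_missing(true, true));
}
```

Under this model, the two guards disagree exactly in the mixed cases, which include the no-FMA target that the revision summary is concerned with.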

clang/test/CodeGen/X86/fexcess-precision-avx.c

This file was added.

				// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=fast -target-feature +avx512fp16 \
				// RUN: -emit-llvm -o - %s | FileCheck -check-prefixes=CHECK-NO-EXT %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=standard -target-feature +avx512fp16 \
				// RUN: -emit-llvm -o - %s | FileCheck -check-prefixes=CHECK-NO-EXT %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
				// RUN: -emit-llvm -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-NO-EXT %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=fast -target-feature +avx512fp16 \
				// RUN: -emit-llvm -ffp-eval-method=source -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-NO-EXT %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=standard -target-feature +avx512fp16 \
				// RUN: -emit-llvm -ffp-eval-method=source -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-NO-EXT %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
				// RUN: -emit-llvm -ffp-eval-method=source -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-NO-EXT %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=fast -target-feature +avx512fp16 \
				// RUN: -emit-llvm -ffp-eval-method=double -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-EXT-DBL %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=standard -target-feature +avx512fp16 \
				// RUN: -emit-llvm -ffp-eval-method=double -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-EXT-DBL %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
				// RUN: -emit-llvm -ffp-eval-method=double -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-EXT-DBL %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=fast -target-feature +avx512fp16 \
				// RUN: -emit-llvm -ffp-eval-method=extended -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-EXT-FP80 %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=standard -target-feature +avx512fp16 \
				// RUN: -emit-llvm -ffp-eval-method=extended -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-EXT-FP80 %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
				// RUN: -emit-llvm -ffp-eval-method=extended -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-EXT-FP80 %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
				// RUN: -ffp-contract=on -emit-llvm -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
				// RUN: -fmath-errno -ffp-contract=on -fno-rounding-math \
				// RUN: -ffp-eval-method=source -emit-llvm -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
				// RUN: -fmath-errno -ffp-contract=on -fno-rounding-math \
				// RUN: -ffp-eval-method=double -emit-llvm -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT-DBL %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
				// RUN: -fmath-errno -ffp-contract=on -fno-rounding-math \
				// RUN: -ffp-eval-method=extended -emit-llvm -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT-EXT %s

				// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
				// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
				// RUN: -fapprox-func -fmath-errno -fno-signed-zeros -mreassociate \
				// RUN: -freciprocal-math -ffp-contract=on -fno-rounding-math \
				// RUN: -funsafe-math-optimizations -emit-llvm -o - %s \
				// RUN: | FileCheck -check-prefixes=CHECK-UNSAFE %s

				// CHECK-EXT-LABEL: define dso_local half @f
				// CHECK-EXT-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-EXT-NEXT: entry:
				// CHECK-EXT-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
				// CHECK-EXT-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
				// CHECK-EXT-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
				// CHECK-EXT-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
				// CHECK-EXT-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
				// CHECK-EXT-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
				// CHECK-EXT-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
				// CHECK-EXT-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
				// CHECK-EXT-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
				// CHECK-EXT-NEXT: [[EXT:%.*]] = fpext half [[TMP0]] to float
				// CHECK-EXT-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
				// CHECK-EXT-NEXT: [[EXT1:%.*]] = fpext half [[TMP1]] to float
				// CHECK-EXT-NEXT: [[MUL:%.*]] = fmul float [[EXT]], [[EXT1]]
				// CHECK-EXT-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
				// CHECK-EXT-NEXT: [[EXT2:%.*]] = fpext half [[TMP2]] to float
				// CHECK-EXT-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
				// CHECK-EXT-NEXT: [[EXT3:%.*]] = fpext half [[TMP3]] to float
				// CHECK-EXT-NEXT: [[MUL4:%.*]] = fmul float [[EXT2]], [[EXT3]]
				// CHECK-EXT-NEXT: [[ADD:%.*]] = fadd float [[MUL]], [[MUL4]]
				// CHECK-EXT-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[ADD]] to half
				// CHECK-EXT-NEXT: ret half [[UNPROMOTION]]
				// CHECK-NO-EXT-LABEL: define dso_local half @f
				// CHECK-NO-EXT-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-NO-EXT-NEXT: entry:
				// CHECK-NO-EXT-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
				// CHECK-NO-EXT-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
				// CHECK-NO-EXT-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
				// CHECK-NO-EXT-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
				// CHECK-NO-EXT-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
				// CHECK-NO-EXT-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
				// CHECK-NO-EXT-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
				// CHECK-NO-EXT-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
				// CHECK-NO-EXT-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
				// CHECK-NO-EXT-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
				// CHECK-NO-EXT-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
				// CHECK-NO-EXT-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
				// CHECK-NO-EXT-NEXT: [[MUL1:%.*]] = fmul half [[TMP2]], [[TMP3]]
				// CHECK-NO-EXT-NEXT: [[TMP4:%.*]] = call half @llvm.fmuladd.f16(half [[TMP0]], half [[TMP1]], half [[MUL1]])
				// CHECK-NO-EXT-NEXT: ret half [[TMP4]]
				//
				// CHECK-EXT-DBL-LABEL: define dso_local half @f
				// CHECK-EXT-DBL-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-EXT-DBL-NEXT: entry:
				// CHECK-EXT-DBL-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
				// CHECK-EXT-DBL-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
				// CHECK-EXT-DBL-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
				// CHECK-EXT-DBL-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
				// CHECK-EXT-DBL-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
				// CHECK-EXT-DBL-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
				// CHECK-EXT-DBL-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
				// CHECK-EXT-DBL-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
				// CHECK-EXT-DBL-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
				// CHECK-EXT-DBL-NEXT: [[CONV:%.*]] = fpext half [[TMP0]] to double
				// CHECK-EXT-DBL-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
				// CHECK-EXT-DBL-NEXT: [[CONV1:%.*]] = fpext half [[TMP1]] to double
				// CHECK-EXT-DBL-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
				// CHECK-EXT-DBL-NEXT: [[CONV2:%.*]] = fpext half [[TMP2]] to double
				// CHECK-EXT-DBL-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
				// CHECK-EXT-DBL-NEXT: [[CONV3:%.*]] = fpext half [[TMP3]] to double
				// CHECK-EXT-DBL-NEXT: [[MUL4:%.*]] = fmul double [[CONV2]], [[CONV3]]
				// CHECK-EXT-DBL-NEXT: [[TMP4:%.*]] = call double @llvm.fmuladd.f64(double [[CONV]], double [[CONV1]], double [[MUL4]])
				// CHECK-EXT-DBL-NEXT: [[CONV5:%.*]] = fptrunc double [[TMP4]] to half
				// CHECK-EXT-DBL-NEXT: ret half [[CONV5]]
				//
				// CHECK-EXT-FP80-LABEL: define dso_local half @f
				// CHECK-EXT-FP80-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-EXT-FP80-NEXT: entry:
				// CHECK-EXT-FP80-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
				// CHECK-EXT-FP80-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
				// CHECK-EXT-FP80-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
				// CHECK-EXT-FP80-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
				// CHECK-EXT-FP80-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
				// CHECK-EXT-FP80-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
				// CHECK-EXT-FP80-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
				// CHECK-EXT-FP80-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
				// CHECK-EXT-FP80-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
				// CHECK-EXT-FP80-NEXT: [[CONV:%.*]] = fpext half [[TMP0]] to x86_fp80
				// CHECK-EXT-FP80-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
				// CHECK-EXT-FP80-NEXT: [[CONV1:%.*]] = fpext half [[TMP1]] to x86_fp80
				// CHECK-EXT-FP80-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
				// CHECK-EXT-FP80-NEXT: [[CONV2:%.*]] = fpext half [[TMP2]] to x86_fp80
				// CHECK-EXT-FP80-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
				// CHECK-EXT-FP80-NEXT: [[CONV3:%.*]] = fpext half [[TMP3]] to x86_fp80
				// CHECK-EXT-FP80-NEXT: [[MUL4:%.*]] = fmul x86_fp80 [[CONV2]], [[CONV3]]
				// CHECK-EXT-FP80-NEXT: [[TMP4:%.*]] = call x86_fp80 @llvm.fmuladd.f80(x86_fp80 [[CONV]], x86_fp80 [[CONV1]], x86_fp80 [[MUL4]])
				// CHECK-EXT-FP80-NEXT: [[CONV5:%.*]] = fptrunc x86_fp80 [[TMP4]] to half
				// CHECK-EXT-FP80-NEXT: ret half [[CONV5]]
				//
				// CHECK-CONTRACT-LABEL: define dso_local half @f
				// CHECK-CONTRACT-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-CONTRACT-NEXT: entry:
				// CHECK-CONTRACT-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
				// CHECK-CONTRACT-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
				// CHECK-CONTRACT-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
				// CHECK-CONTRACT-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
				// CHECK-CONTRACT-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
				// CHECK-CONTRACT-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
				// CHECK-CONTRACT-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
				// CHECK-CONTRACT-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
				// CHECK-CONTRACT-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
				// CHECK-CONTRACT-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
				// CHECK-CONTRACT-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
				// CHECK-CONTRACT-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
				// CHECK-CONTRACT-NEXT: [[MUL1:%.*]] = fmul half [[TMP2]], [[TMP3]]
				// CHECK-CONTRACT-NEXT: [[TMP4:%.*]] = call half @llvm.fmuladd.f16(half [[TMP0]], half [[TMP1]], half [[MUL1]])
				// CHECK-CONTRACT-NEXT: ret half [[TMP4]]
				//
				// CHECK-CONTRACT-DBL-LABEL: define dso_local half @f
				// CHECK-CONTRACT-DBL-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-CONTRACT-DBL-NEXT: entry:
				// CHECK-CONTRACT-DBL-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
				// CHECK-CONTRACT-DBL-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
				// CHECK-CONTRACT-DBL-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
				// CHECK-CONTRACT-DBL-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
				// CHECK-CONTRACT-DBL-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
				// CHECK-CONTRACT-DBL-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
				// CHECK-CONTRACT-DBL-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
				// CHECK-CONTRACT-DBL-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
				// CHECK-CONTRACT-DBL-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
				// CHECK-CONTRACT-DBL-NEXT: [[CONV:%.*]] = fpext half [[TMP0]] to double
				// CHECK-CONTRACT-DBL-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
				// CHECK-CONTRACT-DBL-NEXT: [[CONV1:%.*]] = fpext half [[TMP1]] to double
				// CHECK-CONTRACT-DBL-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
				// CHECK-CONTRACT-DBL-NEXT: [[CONV2:%.*]] = fpext half [[TMP2]] to double
				// CHECK-CONTRACT-DBL-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
				// CHECK-CONTRACT-DBL-NEXT: [[CONV3:%.*]] = fpext half [[TMP3]] to double
				// CHECK-CONTRACT-DBL-NEXT: [[MUL4:%.*]] = fmul double [[CONV2]], [[CONV3]]
				// CHECK-CONTRACT-DBL-NEXT: [[TMP4:%.*]] = call double @llvm.fmuladd.f64(double [[CONV]], double [[CONV1]], double [[MUL4]])
				// CHECK-CONTRACT-DBL-NEXT: [[CONV5:%.*]] = fptrunc double [[TMP4]] to half
				// CHECK-CONTRACT-DBL-NEXT: ret half [[CONV5]]
				//
				// CHECK-CONTRACT-EXT-LABEL: define dso_local half @f
				// CHECK-CONTRACT-EXT-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-CONTRACT-EXT-NEXT: entry:
				// CHECK-CONTRACT-EXT-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
				// CHECK-CONTRACT-EXT-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
				// CHECK-CONTRACT-EXT-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
				// CHECK-CONTRACT-EXT-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
				// CHECK-CONTRACT-EXT-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
				// CHECK-CONTRACT-EXT-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
				// CHECK-CONTRACT-EXT-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
				// CHECK-CONTRACT-EXT-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
				// CHECK-CONTRACT-EXT-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
				// CHECK-CONTRACT-EXT-NEXT: [[CONV:%.*]] = fpext half [[TMP0]] to x86_fp80
				// CHECK-CONTRACT-EXT-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
				// CHECK-CONTRACT-EXT-NEXT: [[CONV1:%.*]] = fpext half [[TMP1]] to x86_fp80
				// CHECK-CONTRACT-EXT-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
				// CHECK-CONTRACT-EXT-NEXT: [[CONV2:%.*]] = fpext half [[TMP2]] to x86_fp80
				// CHECK-CONTRACT-EXT-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
				// CHECK-CONTRACT-EXT-NEXT: [[CONV3:%.*]] = fpext half [[TMP3]] to x86_fp80
				// CHECK-CONTRACT-EXT-NEXT: [[MUL4:%.*]] = fmul x86_fp80 [[CONV2]], [[CONV3]]
				// CHECK-CONTRACT-EXT-NEXT: [[TMP4:%.*]] = call x86_fp80 @llvm.fmuladd.f80(x86_fp80 [[CONV]], x86_fp80 [[CONV1]], x86_fp80 [[MUL4]])
				// CHECK-CONTRACT-EXT-NEXT: [[CONV5:%.*]] = fptrunc x86_fp80 [[TMP4]] to half
				// CHECK-CONTRACT-EXT-NEXT: ret half [[CONV5]]
				//
				// CHECK-UNSAFE-LABEL: define dso_local half @f
				// CHECK-UNSAFE-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-UNSAFE-NEXT: entry:
				// CHECK-UNSAFE-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
				// CHECK-UNSAFE-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
				// CHECK-UNSAFE-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
				// CHECK-UNSAFE-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
				// CHECK-UNSAFE-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
				// CHECK-UNSAFE-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
				// CHECK-UNSAFE-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
				// CHECK-UNSAFE-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
				// CHECK-UNSAFE-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
				// CHECK-UNSAFE-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
				// CHECK-UNSAFE-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
				// CHECK-UNSAFE-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
				// CHECK-UNSAFE-NEXT: [[MUL1:%.*]] = fmul reassoc nsz arcp afn half [[TMP2]], [[TMP3]]
				// CHECK-UNSAFE-NEXT: [[TMP4:%.*]] = call reassoc nsz arcp afn half @llvm.fmuladd.f16(half [[TMP0]], half [[TMP1]], half [[MUL1]])
				// CHECK-UNSAFE-NEXT: ret half [[TMP4]]
				//
				_Float16 f(_Float16 a, _Float16 b, _Float16 c, _Float16 d) {
				return a * b + c * d;
				}

				// CHECK-EXT-LABEL: define dso_local i32 @getFEM
				// CHECK-EXT-SAME: () #[[ATTR0]] {
				// CHECK-EXT-NEXT: entry:
				// CHECK-EXT-NEXT: ret i32 0
				// CHECK-NO-EXT-LABEL: define dso_local i32 @getFEM
				// CHECK-NO-EXT-SAME: () #[[ATTR0]] {
				// CHECK-NO-EXT-NEXT: entry:
				// CHECK-NO-EXT-NEXT: ret i32 0
				//
				// CHECK-EXT-DBL-LABEL: define dso_local i32 @getFEM
				// CHECK-EXT-DBL-SAME: () #[[ATTR0]] {
				// CHECK-EXT-DBL-NEXT: entry:
				// CHECK-EXT-DBL-NEXT: ret i32 1
				//
				// CHECK-EXT-FP80-LABEL: define dso_local i32 @getFEM
				// CHECK-EXT-FP80-SAME: () #[[ATTR0]] {
				// CHECK-EXT-FP80-NEXT: entry:
				// CHECK-EXT-FP80-NEXT: ret i32 2
				//
				// CHECK-CONTRACT-LABEL: define dso_local i32 @getFEM
				// CHECK-CONTRACT-SAME: () #[[ATTR0]] {
				// CHECK-CONTRACT-NEXT: entry:
				// CHECK-CONTRACT-NEXT: ret i32 0
				//
				// CHECK-CONTRACT-DBL-LABEL: define dso_local i32 @getFEM
				// CHECK-CONTRACT-DBL-SAME: () #[[ATTR0]] {
				// CHECK-CONTRACT-DBL-NEXT: entry:
				// CHECK-CONTRACT-DBL-NEXT: ret i32 1
				//
				// CHECK-CONTRACT-EXT-LABEL: define dso_local i32 @getFEM
				// CHECK-CONTRACT-EXT-SAME: () #[[ATTR0]] {
				// CHECK-CONTRACT-EXT-NEXT: entry:
				// CHECK-CONTRACT-EXT-NEXT: ret i32 2
				//
				// CHECK-UNSAFE-LABEL: define dso_local i32 @getFEM
				// CHECK-UNSAFE-SAME: () #[[ATTR0]] {
				// CHECK-UNSAFE-NEXT: entry:
				// CHECK-UNSAFE-NEXT: ret i32 0
				//
				int getFEM() {
				return __FLT_EVAL_METHOD__;
				}

clang/test/CodeGen/X86/fexcess-precision.c

				// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2
	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=fast -emit-llvm -o - %s \			// RUN: -ffloat16-excess-precision=fast -emit-llvm -o - %s \
	// RUN: \| FileCheck -check-prefixes=CHECK-EXT %s			// RUN: \| FileCheck -check-prefixes=CHECK-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=fast -target-feature +avx512fp16 \
	// RUN: -emit-llvm -o - %s \| FileCheck -check-prefixes=CHECK-NO-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=standard -emit-llvm -o - %s \			// RUN: -ffloat16-excess-precision=standard -emit-llvm -o - %s \
	// RUN: \| FileCheck -check-prefixes=CHECK-EXT %s			// RUN: \| FileCheck -check-prefixes=CHECK-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=standard -target-feature +avx512fp16 \
	// RUN: -emit-llvm -o - %s \| FileCheck -check-prefixes=CHECK-NO-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none -emit-llvm -o - %s \			// RUN: -ffloat16-excess-precision=none -emit-llvm -o - %s \
	// RUN: \| FileCheck -check-prefixes=CHECK-NO-EXT %s			// RUN: \| FileCheck -check-prefixes=CHECK-NO-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
	// RUN: -emit-llvm -o - %s \
	// RUN: \| FileCheck -check-prefixes=CHECK-NO-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=fast \			// RUN: -ffloat16-excess-precision=fast \
	// RUN: -emit-llvm -ffp-eval-method=source -o - %s \			// RUN: -emit-llvm -ffp-eval-method=source -o - %s \
	// RUN: \| FileCheck -check-prefixes=CHECK-EXT %s			// RUN: \| FileCheck -check-prefixes=CHECK-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=fast -target-feature +avx512fp16 \
	// RUN: -emit-llvm -ffp-eval-method=source -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-NO-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=standard \			// RUN: -ffloat16-excess-precision=standard \
	// RUN: -emit-llvm -ffp-eval-method=source -o - %s \			// RUN: -emit-llvm -ffp-eval-method=source -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT %s			// RUN: | FileCheck -check-prefixes=CHECK-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=standard -target-feature +avx512fp16 \
	// RUN: -emit-llvm -ffp-eval-method=source -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-NO-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none \			// RUN: -ffloat16-excess-precision=none \
	// RUN: -emit-llvm -ffp-eval-method=source -o - %s \			// RUN: -emit-llvm -ffp-eval-method=source -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-NO-EXT %s			// RUN: | FileCheck -check-prefixes=CHECK-NO-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
	// RUN: -emit-llvm -ffp-eval-method=source -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-NO-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=fast \			// RUN: -ffloat16-excess-precision=fast \
	// RUN: -emit-llvm -ffp-eval-method=double -o - %s \			// RUN: -emit-llvm -ffp-eval-method=double -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT-DBL %s			// RUN: | FileCheck -check-prefixes=CHECK-EXT-DBL %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=fast -target-feature +avx512fp16 \
	// RUN: -emit-llvm -ffp-eval-method=double -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT-DBL %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=standard \			// RUN: -ffloat16-excess-precision=standard \
	// RUN: -emit-llvm -ffp-eval-method=double -o - %s \			// RUN: -emit-llvm -ffp-eval-method=double -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT-DBL %s			// RUN: | FileCheck -check-prefixes=CHECK-EXT-DBL %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=standard -target-feature +avx512fp16 \
	// RUN: -emit-llvm -ffp-eval-method=double -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT-DBL %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none \			// RUN: -ffloat16-excess-precision=none \
	// RUN: -emit-llvm -ffp-eval-method=double -o - %s \			// RUN: -emit-llvm -ffp-eval-method=double -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT-DBL %s			// RUN: | FileCheck -check-prefixes=CHECK-EXT-DBL %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
	// RUN: -emit-llvm -ffp-eval-method=double -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT-DBL %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=fast \			// RUN: -ffloat16-excess-precision=fast \
	// RUN: -emit-llvm -ffp-eval-method=extended -o - %s \			// RUN: -emit-llvm -ffp-eval-method=extended -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT-FP80 %s			// RUN: | FileCheck -check-prefixes=CHECK-EXT-FP80 %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=fast -target-feature +avx512fp16 \
	// RUN: -emit-llvm -ffp-eval-method=extended -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT-FP80 %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=standard \			// RUN: -ffloat16-excess-precision=standard \
	// RUN: -emit-llvm -ffp-eval-method=extended -o - %s \			// RUN: -emit-llvm -ffp-eval-method=extended -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT-FP80 %s			// RUN: | FileCheck -check-prefixes=CHECK-EXT-FP80 %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=standard -target-feature +avx512fp16 \
	// RUN: -emit-llvm -ffp-eval-method=extended -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT-FP80 %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none \			// RUN: -ffloat16-excess-precision=none \
	// RUN: -emit-llvm -ffp-eval-method=extended -o - %s \			// RUN: -emit-llvm -ffp-eval-method=extended -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT-FP80 %s			// RUN: | FileCheck -check-prefixes=CHECK-EXT-FP80 %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
	// RUN: -emit-llvm -ffp-eval-method=extended -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-EXT-FP80 %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none \			// RUN: -ffloat16-excess-precision=none \
	// RUN: -ffp-contract=on -emit-llvm -o - %s \			// RUN: -ffp-contract=on -emit-llvm -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT %s			// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
	// RUN: -ffp-contract=on -emit-llvm -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none \			// RUN: -ffloat16-excess-precision=none \
	// RUN: -fmath-errno -ffp-contract=on -fno-rounding-math \			// RUN: -fmath-errno -ffp-contract=on -fno-rounding-math \
	// RUN: -ffp-eval-method=source -emit-llvm -o - %s \			// RUN: -ffp-eval-method=source -emit-llvm -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT %s			// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
	// RUN: -fmath-errno -ffp-contract=on -fno-rounding-math \
	// RUN: -ffp-eval-method=source -emit-llvm -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none \			// RUN: -ffloat16-excess-precision=none \
	// RUN: -fmath-errno -ffp-contract=on -fno-rounding-math \			// RUN: -fmath-errno -ffp-contract=on -fno-rounding-math \
	// RUN: -ffp-eval-method=double -emit-llvm -o - %s \			// RUN: -ffp-eval-method=double -emit-llvm -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT-DBL %s			// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT-DBL %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
	// RUN: -fmath-errno -ffp-contract=on -fno-rounding-math \
	// RUN: -ffp-eval-method=double -emit-llvm -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT-DBL %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none \			// RUN: -ffloat16-excess-precision=none \
	// RUN: -fmath-errno -ffp-contract=on -fno-rounding-math \			// RUN: -fmath-errno -ffp-contract=on -fno-rounding-math \
	// RUN: -ffp-eval-method=extended -emit-llvm -o - %s \			// RUN: -ffp-eval-method=extended -emit-llvm -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT-EXT %s			// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \
	// RUN: -fmath-errno -ffp-contract=on -fno-rounding-math \
	// RUN: -ffp-eval-method=extended -emit-llvm -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-CONTRACT-EXT %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
	// RUN: -ffloat16-excess-precision=none \			// RUN: -ffloat16-excess-precision=none \
	// RUN: -fapprox-func -fmath-errno -fno-signed-zeros -mreassociate \			// RUN: -fapprox-func -fmath-errno -fno-signed-zeros -mreassociate \
	// RUN: -freciprocal-math -ffp-contract=on -fno-rounding-math \			// RUN: -freciprocal-math -ffp-contract=on -fno-rounding-math \
	// RUN: -funsafe-math-optimizations -emit-llvm -o - %s \			// RUN: -funsafe-math-optimizations -emit-llvm -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-UNSAFE %s			// RUN: | FileCheck -check-prefixes=CHECK-UNSAFE %s

	// RUN: %clang_cc1 -triple x86_64-unknown-unknown \			// CHECK-EXT-LABEL: define dso_local half @f
	// RUN: -ffloat16-excess-precision=none -target-feature +avx512fp16 \			// CHECK-EXT-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
	// RUN: -fapprox-func -fmath-errno -fno-signed-zeros -mreassociate \
	// RUN: -freciprocal-math -ffp-contract=on -fno-rounding-math \
	// RUN: -funsafe-math-optimizations -emit-llvm -o - %s \
	// RUN: | FileCheck -check-prefixes=CHECK-UNSAFE %s

	// CHECK-EXT-LABEL: @f(
	// CHECK-EXT-NEXT: entry:			// CHECK-EXT-NEXT: entry:
	// CHECK-EXT-NEXT: [[A_ADDR:%.*]] = alloca half			// CHECK-EXT-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
	// CHECK-EXT-NEXT: [[B_ADDR:%.*]] = alloca half			// CHECK-EXT-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
	// CHECK-EXT-NEXT: [[C_ADDR:%.*]] = alloca half			// CHECK-EXT-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
	// CHECK-EXT-NEXT: [[D_ADDR:%.*]] = alloca half			// CHECK-EXT-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
	// CHECK-EXT-NEXT: store half [[A:%.*]], ptr [[A_ADDR]]			// CHECK-EXT-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
	// CHECK-EXT-NEXT: store half [[B:%.*]], ptr [[B_ADDR]]			// CHECK-EXT-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
	// CHECK-EXT-NEXT: store half [[C:%.*]], ptr [[C_ADDR]]			// CHECK-EXT-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
	// CHECK-EXT-NEXT: store half [[D:%.*]], ptr [[D_ADDR]]			// CHECK-EXT-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
	// CHECK-EXT-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]]			// CHECK-EXT-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
	// CHECK-EXT-NEXT: [[EXT:%.*]] = fpext half [[TMP0]] to float			// CHECK-EXT-NEXT: [[EXT:%.*]] = fpext half [[TMP0]] to float
	// CHECK-EXT-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]]			// CHECK-EXT-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
	// CHECK-EXT-NEXT: [[EXT1:%.*]] = fpext half [[TMP1]] to float			// CHECK-EXT-NEXT: [[EXT1:%.*]] = fpext half [[TMP1]] to float
	// CHECK-EXT-NEXT: [[MUL:%.*]] = fmul float [[EXT]], [[EXT1]]			// CHECK-EXT-NEXT: [[MUL:%.*]] = fmul float [[EXT]], [[EXT1]]
	// CHECK-EXT-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]]			// CHECK-EXT-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
	// CHECK-EXT-NEXT: [[EXT2:%.*]] = fpext half [[TMP2]] to float			// CHECK-EXT-NEXT: [[EXT2:%.*]] = fpext half [[TMP2]] to float
	// CHECK-EXT-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]]			// CHECK-EXT-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
	// CHECK-EXT-NEXT: [[EXT3:%.*]] = fpext half [[TMP3]] to float			// CHECK-EXT-NEXT: [[EXT3:%.*]] = fpext half [[TMP3]] to float
	// CHECK-EXT-NEXT: [[MUL4:%.*]] = fmul float [[EXT2]], [[EXT3]]			// CHECK-EXT-NEXT: [[MUL4:%.*]] = fmul float [[EXT2]], [[EXT3]]
	// CHECK-EXT-NEXT: [[ADD:%.*]] = fadd float [[MUL]], [[MUL4]]			// CHECK-EXT-NEXT: [[ADD:%.*]] = fadd float [[MUL]], [[MUL4]]
	// CHECK-EXT-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[ADD]] to half			// CHECK-EXT-NEXT: [[UNPROMOTION:%.*]] = fptrunc float [[ADD]] to half
	// CHECK-EXT-NEXT: ret half [[UNPROMOTION]]			// CHECK-EXT-NEXT: ret half [[UNPROMOTION]]
	//			//
	// CHECK-NO-EXT-LABEL: @f(			// CHECK-NO-EXT-LABEL: define dso_local half @f
				// CHECK-NO-EXT-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
	// CHECK-NO-EXT-NEXT: entry:			// CHECK-NO-EXT-NEXT: entry:
	// CHECK-NO-EXT-NEXT: [[A_ADDR:%.*]] = alloca half			// CHECK-NO-EXT-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
	// CHECK-NO-EXT-NEXT: [[B_ADDR:%.*]] = alloca half			// CHECK-NO-EXT-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
	// CHECK-NO-EXT-NEXT: [[C_ADDR:%.*]] = alloca half			// CHECK-NO-EXT-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
	// CHECK-NO-EXT-NEXT: [[D_ADDR:%.*]] = alloca half			// CHECK-NO-EXT-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
	// CHECK-NO-EXT-NEXT: store half [[A:%.*]], ptr [[A_ADDR]]			// CHECK-NO-EXT-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
	// CHECK-NO-EXT-NEXT: store half [[B:%.*]], ptr [[B_ADDR]]			// CHECK-NO-EXT-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
	// CHECK-NO-EXT-NEXT: store half [[C:%.*]], ptr [[C_ADDR]]			// CHECK-NO-EXT-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
	// CHECK-NO-EXT-NEXT: store half [[D:%.*]], ptr [[D_ADDR]]			// CHECK-NO-EXT-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
	// CHECK-NO-EXT-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]]			// CHECK-NO-EXT-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
	// CHECK-NO-EXT-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]]			// CHECK-NO-EXT-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
	// CHECK-NO-EXT-NEXT: [[MUL:%.*]] = fmul half [[TMP0]], [[TMP1]]			// CHECK-NO-EXT-NEXT: [[MUL:%.*]] = fmul half [[TMP0]], [[TMP1]]
	// CHECK-NO-EXT-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]]			// CHECK-NO-EXT-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
	// CHECK-NO-EXT-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]]			// CHECK-NO-EXT-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
	// CHECK-NO-EXT-NEXT: [[MUL1:%.*]] = fmul half [[TMP2]], [[TMP3]]			// CHECK-NO-EXT-NEXT: [[MUL1:%.*]] = fmul half [[TMP2]], [[TMP3]]
	// CHECK-NO-EXT-NEXT: [[ADD:%.*]] = fadd half [[MUL]], [[MUL1]]			// CHECK-NO-EXT-NEXT: [[ADD:%.*]] = fadd half [[MUL]], [[MUL1]]
	// CHECK-NO-EXT-NEXT: ret half [[ADD]]			// CHECK-NO-EXT-NEXT: ret half [[ADD]]
	//			//
	// CHECK-EXT-DBL-LABEL: @f(			// CHECK-EXT-DBL-LABEL: define dso_local half @f
				// CHECK-EXT-DBL-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
	// CHECK-EXT-DBL-NEXT: entry:			// CHECK-EXT-DBL-NEXT: entry:
	// CHECK-EXT-DBL-NEXT: [[A_ADDR:%.*]] = alloca half			// CHECK-EXT-DBL-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
	// CHECK-EXT-DBL-NEXT: [[B_ADDR:%.*]] = alloca half			// CHECK-EXT-DBL-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
	// CHECK-EXT-DBL-NEXT: [[C_ADDR:%.*]] = alloca half			// CHECK-EXT-DBL-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
	// CHECK-EXT-DBL-NEXT: [[D_ADDR:%.*]] = alloca half			// CHECK-EXT-DBL-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
	// CHECK-EXT-DBL-NEXT: store half [[A:%.*]], ptr [[A_ADDR]]			// CHECK-EXT-DBL-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
	// CHECK-EXT-DBL-NEXT: store half [[B:%.*]], ptr [[B_ADDR]]			// CHECK-EXT-DBL-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
	// CHECK-EXT-DBL-NEXT: store half [[C:%.*]], ptr [[C_ADDR]]			// CHECK-EXT-DBL-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
	// CHECK-EXT-DBL-NEXT: store half [[D:%.*]], ptr [[D_ADDR]]			// CHECK-EXT-DBL-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
	// CHECK-EXT-DBL-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]]			// CHECK-EXT-DBL-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
	// CHECK-EXT-DBL-NEXT: [[CONV:%.*]] = fpext half [[TMP0]] to double			// CHECK-EXT-DBL-NEXT: [[CONV:%.*]] = fpext half [[TMP0]] to double
	// CHECK-EXT-DBL-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]]			// CHECK-EXT-DBL-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
	// CHECK-EXT-DBL-NEXT: [[CONV1:%.*]] = fpext half [[TMP1]] to double			// CHECK-EXT-DBL-NEXT: [[CONV1:%.*]] = fpext half [[TMP1]] to double
	// CHECK-EXT-DBL-NEXT: [[MUL:%.*]] = fmul double [[CONV]], [[CONV1]]			// CHECK-EXT-DBL-NEXT: [[MUL:%.*]] = fmul double [[CONV]], [[CONV1]]
	// CHECK-EXT-DBL-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]]			// CHECK-EXT-DBL-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
	// CHECK-EXT-DBL-NEXT: [[CONV2:%.*]] = fpext half [[TMP2]] to double			// CHECK-EXT-DBL-NEXT: [[CONV2:%.*]] = fpext half [[TMP2]] to double
	// CHECK-EXT-DBL-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]]			// CHECK-EXT-DBL-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
	// CHECK-EXT-DBL-NEXT: [[CONV3:%.*]] = fpext half [[TMP3]] to double			// CHECK-EXT-DBL-NEXT: [[CONV3:%.*]] = fpext half [[TMP3]] to double
	// CHECK-EXT-DBL-NEXT: [[MUL4:%.*]] = fmul double [[CONV2]], [[CONV3]]			// CHECK-EXT-DBL-NEXT: [[MUL4:%.*]] = fmul double [[CONV2]], [[CONV3]]
	// CHECK-EXT-DBL-NEXT: [[ADD:%.*]] = fadd double [[MUL]], [[MUL4]]			// CHECK-EXT-DBL-NEXT: [[ADD:%.*]] = fadd double [[MUL]], [[MUL4]]
	// CHECK-EXT-DBL-NEXT: [[CONV5:%.*]] = fptrunc double [[ADD]] to half			// CHECK-EXT-DBL-NEXT: [[CONV5:%.*]] = fptrunc double [[ADD]] to half
	// CHECK-EXT-DBL-NEXT: ret half [[CONV5]]			// CHECK-EXT-DBL-NEXT: ret half [[CONV5]]
	//			//
	// CHECK-EXT-FP80-LABEL: @f(			// CHECK-EXT-FP80-LABEL: define dso_local half @f
				// CHECK-EXT-FP80-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
	// CHECK-EXT-FP80-NEXT: entry:			// CHECK-EXT-FP80-NEXT: entry:
	// CHECK-EXT-FP80-NEXT: [[A_ADDR:%.*]] = alloca half			// CHECK-EXT-FP80-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
	// CHECK-EXT-FP80-NEXT: [[B_ADDR:%.*]] = alloca half			// CHECK-EXT-FP80-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
	// CHECK-EXT-FP80-NEXT: [[C_ADDR:%.*]] = alloca half			// CHECK-EXT-FP80-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
	// CHECK-EXT-FP80-NEXT: [[D_ADDR:%.*]] = alloca half			// CHECK-EXT-FP80-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
	// CHECK-EXT-FP80-NEXT: store half [[A:%.*]], ptr [[A_ADDR]]			// CHECK-EXT-FP80-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
	// CHECK-EXT-FP80-NEXT: store half [[B:%.*]], ptr [[B_ADDR]]			// CHECK-EXT-FP80-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
	// CHECK-EXT-FP80-NEXT: store half [[C:%.*]], ptr [[C_ADDR]]			// CHECK-EXT-FP80-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
	// CHECK-EXT-FP80-NEXT: store half [[D:%.*]], ptr [[D_ADDR]]			// CHECK-EXT-FP80-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
	// CHECK-EXT-FP80-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]]			// CHECK-EXT-FP80-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
	// CHECK-EXT-FP80-NEXT: [[CONV:%.*]] = fpext half [[TMP0]] to x86_fp80			// CHECK-EXT-FP80-NEXT: [[CONV:%.*]] = fpext half [[TMP0]] to x86_fp80
	// CHECK-EXT-FP80-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]]			// CHECK-EXT-FP80-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
	// CHECK-EXT-FP80-NEXT: [[CONV1:%.*]] = fpext half [[TMP1]] to x86_fp80			// CHECK-EXT-FP80-NEXT: [[CONV1:%.*]] = fpext half [[TMP1]] to x86_fp80
	// CHECK-EXT-FP80-NEXT: [[MUL:%.*]] = fmul x86_fp80 [[CONV]], [[CONV1]]			// CHECK-EXT-FP80-NEXT: [[MUL:%.*]] = fmul x86_fp80 [[CONV]], [[CONV1]]
	// CHECK-EXT-FP80-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]]			// CHECK-EXT-FP80-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
	// CHECK-EXT-FP80-NEXT: [[CONV2:%.*]] = fpext half [[TMP2]] to x86_fp80			// CHECK-EXT-FP80-NEXT: [[CONV2:%.*]] = fpext half [[TMP2]] to x86_fp80
	// CHECK-EXT-FP80-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]]			// CHECK-EXT-FP80-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
	// CHECK-EXT-FP80-NEXT: [[CONV3:%.*]] = fpext half [[TMP3]] to x86_fp80			// CHECK-EXT-FP80-NEXT: [[CONV3:%.*]] = fpext half [[TMP3]] to x86_fp80
	// CHECK-EXT-FP80-NEXT: [[MUL4:%.*]] = fmul x86_fp80 [[CONV2]], [[CONV3]]			// CHECK-EXT-FP80-NEXT: [[MUL4:%.*]] = fmul x86_fp80 [[CONV2]], [[CONV3]]
	// CHECK-EXT-FP80-NEXT: [[ADD:%.*]] = fadd x86_fp80 [[MUL]], [[MUL4]]			// CHECK-EXT-FP80-NEXT: [[ADD:%.*]] = fadd x86_fp80 [[MUL]], [[MUL4]]
	// CHECK-EXT-FP80-NEXT: [[CONV5:%.*]] = fptrunc x86_fp80 [[ADD]] to half			// CHECK-EXT-FP80-NEXT: [[CONV5:%.*]] = fptrunc x86_fp80 [[ADD]] to half
	// CHECK-EXT-FP80-NEXT: ret half [[CONV5]]			// CHECK-EXT-FP80-NEXT: ret half [[CONV5]]
	//			//
	// CHECK-CONTRACT-LABEL: @f(			// CHECK-CONTRACT-LABEL: define dso_local half @f
				// CHECK-CONTRACT-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
	// CHECK-CONTRACT-NEXT: entry:			// CHECK-CONTRACT-NEXT: entry:
	// CHECK-CONTRACT-NEXT: [[A_ADDR:%.*]] = alloca half			// CHECK-CONTRACT-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
	// CHECK-CONTRACT-NEXT: [[B_ADDR:%.*]] = alloca half			// CHECK-CONTRACT-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
	// CHECK-CONTRACT-NEXT: [[C_ADDR:%.*]] = alloca half			// CHECK-CONTRACT-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
	// CHECK-CONTRACT-NEXT: [[D_ADDR:%.*]] = alloca half			// CHECK-CONTRACT-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
	// CHECK-CONTRACT-NEXT: store half [[A:%.*]], ptr [[A_ADDR]]			// CHECK-CONTRACT-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
	// CHECK-CONTRACT-NEXT: store half [[B:%.*]], ptr [[B_ADDR]]			// CHECK-CONTRACT-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
	// CHECK-CONTRACT-NEXT: store half [[C:%.*]], ptr [[C_ADDR]]			// CHECK-CONTRACT-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
	// CHECK-CONTRACT-NEXT: store half [[D:%.*]], ptr [[D_ADDR]]			// CHECK-CONTRACT-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
	// CHECK-CONTRACT-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]]			// CHECK-CONTRACT-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
	// CHECK-CONTRACT-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]]			// CHECK-CONTRACT-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
	// CHECK-CONTRACT-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]]			// CHECK-CONTRACT-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
	// CHECK-CONTRACT-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]]			// CHECK-CONTRACT-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
	// CHECK-CONTRACT-NEXT: [[MUL1:%.*]] = fmul half [[TMP2]], [[TMP3]]			// CHECK-CONTRACT-NEXT: [[MUL1:%.*]] = fmul half [[TMP2]], [[TMP3]]
	// CHECK-CONTRACT-NEXT: [[TMP4:%.*]] = call half @llvm.fmuladd.f16(half [[TMP0]], half [[TMP1]], half [[MUL1]])			// CHECK-CONTRACT-NEXT: [[TMP4:%.*]] = call half @llvm.fmuladd.f16(half [[TMP0]], half [[TMP1]], half [[MUL1]])
	// CHECK-CONTRACT-NEXT: ret half [[TMP4]]			// CHECK-CONTRACT-NEXT: ret half [[TMP4]]
	//			//
	// CHECK-CONTRACT-DBL-LABEL: @f(			// CHECK-CONTRACT-DBL-LABEL: define dso_local half @f
				// CHECK-CONTRACT-DBL-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
	// CHECK-CONTRACT-DBL-NEXT: entry:			// CHECK-CONTRACT-DBL-NEXT: entry:
	// CHECK-CONTRACT-DBL-NEXT: [[A_ADDR:%.*]] = alloca half			// CHECK-CONTRACT-DBL-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
	// CHECK-CONTRACT-DBL-NEXT: [[B_ADDR:%.*]] = alloca half			// CHECK-CONTRACT-DBL-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
	// CHECK-CONTRACT-DBL-NEXT: [[C_ADDR:%.*]] = alloca half			// CHECK-CONTRACT-DBL-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
	// CHECK-CONTRACT-DBL-NEXT: [[D_ADDR:%.*]] = alloca half			// CHECK-CONTRACT-DBL-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
	// CHECK-CONTRACT-DBL-NEXT: store half [[A:%.*]], ptr [[A_ADDR]]			// CHECK-CONTRACT-DBL-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
	// CHECK-CONTRACT-DBL-NEXT: store half [[B:%.*]], ptr [[B_ADDR]]			// CHECK-CONTRACT-DBL-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
	// CHECK-CONTRACT-DBL-NEXT: store half [[C:%.*]], ptr [[C_ADDR]]			// CHECK-CONTRACT-DBL-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
	// CHECK-CONTRACT-DBL-NEXT: store half [[D:%.*]], ptr [[D_ADDR]]			// CHECK-CONTRACT-DBL-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
	// CHECK-CONTRACT-DBL-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]]			// CHECK-CONTRACT-DBL-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
	// CHECK-CONTRACT-DBL-NEXT: [[CONV:%.*]] = fpext half [[TMP0]] to double			// CHECK-CONTRACT-DBL-NEXT: [[CONV:%.*]] = fpext half [[TMP0]] to double
	// CHECK-CONTRACT-DBL-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]]			// CHECK-CONTRACT-DBL-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
	// CHECK-CONTRACT-DBL-NEXT: [[CONV1:%.*]] = fpext half [[TMP1]] to double			// CHECK-CONTRACT-DBL-NEXT: [[CONV1:%.*]] = fpext half [[TMP1]] to double
	// CHECK-CONTRACT-DBL-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]]			// CHECK-CONTRACT-DBL-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
	// CHECK-CONTRACT-DBL-NEXT: [[CONV2:%.*]] = fpext half [[TMP2]] to double			// CHECK-CONTRACT-DBL-NEXT: [[CONV2:%.*]] = fpext half [[TMP2]] to double
	// CHECK-CONTRACT-DBL-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]]			// CHECK-CONTRACT-DBL-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
	// CHECK-CONTRACT-DBL-NEXT: [[CONV3:%.*]] = fpext half [[TMP3]] to double			// CHECK-CONTRACT-DBL-NEXT: [[CONV3:%.*]] = fpext half [[TMP3]] to double
	// CHECK-CONTRACT-DBL-NEXT: [[MUL4:%.*]] = fmul double [[CONV2]], [[CONV3]]			// CHECK-CONTRACT-DBL-NEXT: [[MUL4:%.*]] = fmul double [[CONV2]], [[CONV3]]
	// CHECK-CONTRACT-DBL-NEXT: [[TMP4:%.*]] = call double @llvm.fmuladd.f64(double [[CONV]], double [[CONV1]], double [[MUL4]])			// CHECK-CONTRACT-DBL-NEXT: [[TMP4:%.*]] = call double @llvm.fmuladd.f64(double [[CONV]], double [[CONV1]], double [[MUL4]])
	// CHECK-CONTRACT-DBL-NEXT: [[CONV5:%.*]] = fptrunc double [[TMP4]] to half			// CHECK-CONTRACT-DBL-NEXT: [[CONV5:%.*]] = fptrunc double [[TMP4]] to half
	// CHECK-CONTRACT-DBL-NEXT: ret half [[CONV5]]			// CHECK-CONTRACT-DBL-NEXT: ret half [[CONV5]]
	//			//
	// CHECK-CONTRACT-EXT-LABEL: @f(			// CHECK-CONTRACT-EXT-LABEL: define dso_local half @f
				// CHECK-CONTRACT-EXT-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
	// CHECK-CONTRACT-EXT-NEXT: entry:			// CHECK-CONTRACT-EXT-NEXT: entry:
	// CHECK-CONTRACT-EXT-NEXT: [[A_ADDR:%.*]] = alloca half			// CHECK-CONTRACT-EXT-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
	// CHECK-CONTRACT-EXT-NEXT: [[B_ADDR:%.*]] = alloca half			// CHECK-CONTRACT-EXT-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
	// CHECK-CONTRACT-EXT-NEXT: [[C_ADDR:%.*]] = alloca half			// CHECK-CONTRACT-EXT-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
	// CHECK-CONTRACT-EXT-NEXT: [[D_ADDR:%.*]] = alloca half			// CHECK-CONTRACT-EXT-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
	// CHECK-CONTRACT-EXT-NEXT: store half [[A:%.*]], ptr [[A_ADDR]]			// CHECK-CONTRACT-EXT-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
	// CHECK-CONTRACT-EXT-NEXT: store half [[B:%.*]], ptr [[B_ADDR]]			// CHECK-CONTRACT-EXT-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
	// CHECK-CONTRACT-EXT-NEXT: store half [[C:%.*]], ptr [[C_ADDR]]			// CHECK-CONTRACT-EXT-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
	// CHECK-CONTRACT-EXT-NEXT: store half [[D:%.*]], ptr [[D_ADDR]]			// CHECK-CONTRACT-EXT-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
	// CHECK-CONTRACT-EXT-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]]			// CHECK-CONTRACT-EXT-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
	// CHECK-CONTRACT-EXT-NEXT: [[CONV:%.*]] = fpext half [[TMP0]] to x86_fp80			// CHECK-CONTRACT-EXT-NEXT: [[CONV:%.*]] = fpext half [[TMP0]] to x86_fp80
	// CHECK-CONTRACT-EXT-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]]			// CHECK-CONTRACT-EXT-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
	// CHECK-CONTRACT-EXT-NEXT: [[CONV1:%.*]] = fpext half [[TMP1]] to x86_fp80			// CHECK-CONTRACT-EXT-NEXT: [[CONV1:%.*]] = fpext half [[TMP1]] to x86_fp80
	// CHECK-CONTRACT-EXT-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]]			// CHECK-CONTRACT-EXT-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
	// CHECK-CONTRACT-EXT-NEXT: [[CONV2:%.*]] = fpext half [[TMP2]] to x86_fp80			// CHECK-CONTRACT-EXT-NEXT: [[CONV2:%.*]] = fpext half [[TMP2]] to x86_fp80
	// CHECK-CONTRACT-EXT-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]]			// CHECK-CONTRACT-EXT-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
	// CHECK-CONTRACT-EXT-NEXT: [[CONV3:%.*]] = fpext half [[TMP3]] to x86_fp80			// CHECK-CONTRACT-EXT-NEXT: [[CONV3:%.*]] = fpext half [[TMP3]] to x86_fp80
	// CHECK-CONTRACT-EXT-NEXT: [[MUL4:%.*]] = fmul x86_fp80 [[CONV2]], [[CONV3]]			// CHECK-CONTRACT-EXT-NEXT: [[MUL4:%.*]] = fmul x86_fp80 [[CONV2]], [[CONV3]]
	// CHECK-CONTRACT-EXT-NEXT: [[TMP4:%.*]] = call x86_fp80 @llvm.fmuladd.f80(x86_fp80 [[CONV]], x86_fp80 [[CONV1]], x86_fp80 [[MUL4]])			// CHECK-CONTRACT-EXT-NEXT: [[TMP4:%.*]] = call x86_fp80 @llvm.fmuladd.f80(x86_fp80 [[CONV]], x86_fp80 [[CONV1]], x86_fp80 [[MUL4]])
	// CHECK-CONTRACT-EXT-NEXT: [[CONV5:%.*]] = fptrunc x86_fp80 [[TMP4]] to half			// CHECK-CONTRACT-EXT-NEXT: [[CONV5:%.*]] = fptrunc x86_fp80 [[TMP4]] to half
	// CHECK-CONTRACT-EXT-NEXT: ret half [[CONV5]]			// CHECK-CONTRACT-EXT-NEXT: ret half [[CONV5]]
	//			//
	// CHECK-UNSAFE-LABEL: @f(			// CHECK-UNSAFE-LABEL: define dso_local half @f
				// CHECK-UNSAFE-SAME: (half noundef [[A:%.*]], half noundef [[B:%.*]], half noundef [[C:%.*]], half noundef [[D:%.*]]) #[[ATTR0:[0-9]+]] {
	// CHECK-UNSAFE-NEXT: entry:			// CHECK-UNSAFE-NEXT: entry:
	// CHECK-UNSAFE-NEXT: [[A_ADDR:%.*]] = alloca half			// CHECK-UNSAFE-NEXT: [[A_ADDR:%.*]] = alloca half, align 2
	// CHECK-UNSAFE-NEXT: [[B_ADDR:%.*]] = alloca half			// CHECK-UNSAFE-NEXT: [[B_ADDR:%.*]] = alloca half, align 2
	// CHECK-UNSAFE-NEXT: [[C_ADDR:%.*]] = alloca half			// CHECK-UNSAFE-NEXT: [[C_ADDR:%.*]] = alloca half, align 2
	// CHECK-UNSAFE-NEXT: [[D_ADDR:%.*]] = alloca half			// CHECK-UNSAFE-NEXT: [[D_ADDR:%.*]] = alloca half, align 2
	// CHECK-UNSAFE-NEXT: store half [[A:%.*]], ptr [[A_ADDR]]			// CHECK-UNSAFE-NEXT: store half [[A]], ptr [[A_ADDR]], align 2
	// CHECK-UNSAFE-NEXT: store half [[B:%.*]], ptr [[B_ADDR]]			// CHECK-UNSAFE-NEXT: store half [[B]], ptr [[B_ADDR]], align 2
	// CHECK-UNSAFE-NEXT: store half [[C:%.*]], ptr [[C_ADDR]]			// CHECK-UNSAFE-NEXT: store half [[C]], ptr [[C_ADDR]], align 2
	// CHECK-UNSAFE-NEXT: store half [[D:%.*]], ptr [[D_ADDR]]			// CHECK-UNSAFE-NEXT: store half [[D]], ptr [[D_ADDR]], align 2
	// CHECK-UNSAFE-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]]			// CHECK-UNSAFE-NEXT: [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
	// CHECK-UNSAFE-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]]			// CHECK-UNSAFE-NEXT: [[TMP1:%.*]] = load half, ptr [[B_ADDR]], align 2
	// CHECK-UNSAFE-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]]			// CHECK-UNSAFE-NEXT: [[TMP2:%.*]] = load half, ptr [[C_ADDR]], align 2
	// CHECK-UNSAFE-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]]			// CHECK-UNSAFE-NEXT: [[TMP3:%.*]] = load half, ptr [[D_ADDR]], align 2
	// CHECK-UNSAFE-NEXT: [[MUL1:%.*]] = fmul reassoc nsz arcp afn half [[TMP2]], [[TMP3]]			// CHECK-UNSAFE-NEXT: [[MUL1:%.*]] = fmul reassoc nsz arcp afn half [[TMP2]], [[TMP3]]
	// CHECK-UNSAFE-NEXT: [[TMP4:%.*]] = call reassoc nsz arcp afn half @llvm.fmuladd.f16(half [[TMP0]], half [[TMP1]], half [[MUL1]])			// CHECK-UNSAFE-NEXT: [[TMP4:%.*]] = call reassoc nsz arcp afn half @llvm.fmuladd.f16(half [[TMP0]], half [[TMP1]], half [[MUL1]])
	// CHECK-UNSAFE-NEXT: ret half [[TMP4]]			// CHECK-UNSAFE-NEXT: ret half [[TMP4]]
	//			//
	_Float16 f(_Float16 a, _Float16 b, _Float16 c, _Float16 d) {			_Float16 f(_Float16 a, _Float16 b, _Float16 c, _Float16 d) {
	return a * b + c * d;			return a * b + c * d;
	}			}

	// CHECK-EXT-LABEL: @getFEM(			// CHECK-EXT-LABEL: define dso_local i32 @getFEM
				// CHECK-EXT-SAME: () #[[ATTR0]] {
	// CHECK-EXT-NEXT: entry:			// CHECK-EXT-NEXT: entry:
	// CHECK-EXT-NEXT: ret i32 0			// CHECK-EXT-NEXT: ret i32 0
	//			//
	// CHECK-NO-EXT-LABEL: @getFEM(			// CHECK-NO-EXT-LABEL: define dso_local i32 @getFEM
				// CHECK-NO-EXT-SAME: () #[[ATTR0]] {
	// CHECK-NO-EXT-NEXT: entry:			// CHECK-NO-EXT-NEXT: entry:
	// CHECK-NO-EXT-NEXT: ret i32 0			// CHECK-NO-EXT-NEXT: ret i32 0
	//			//
	// CHECK-EXT-DBL-LABEL: @getFEM(			// CHECK-EXT-DBL-LABEL: define dso_local i32 @getFEM
				// CHECK-EXT-DBL-SAME: () #[[ATTR0]] {
	// CHECK-EXT-DBL-NEXT: entry:			// CHECK-EXT-DBL-NEXT: entry:
	// CHECK-EXT-DBL-NEXT: ret i32 1			// CHECK-EXT-DBL-NEXT: ret i32 1
	//			//
	// CHECK-EXT-FP80-LABEL: @getFEM(			// CHECK-EXT-FP80-LABEL: define dso_local i32 @getFEM
				// CHECK-EXT-FP80-SAME: () #[[ATTR0]] {
	// CHECK-EXT-FP80-NEXT: entry:			// CHECK-EXT-FP80-NEXT: entry:
	// CHECK-EXT-FP80-NEXT: ret i32 2			// CHECK-EXT-FP80-NEXT: ret i32 2
	//			//
	// CHECK-CONTRACT-LABEL: @getFEM(			// CHECK-CONTRACT-LABEL: define dso_local i32 @getFEM
				// CHECK-CONTRACT-SAME: () #[[ATTR0]] {
	// CHECK-CONTRACT-NEXT: entry:			// CHECK-CONTRACT-NEXT: entry:
	// CHECK-CONTRACT-NEXT: ret i32 0			// CHECK-CONTRACT-NEXT: ret i32 0
	//			//
	// CHECK-CONTRACT-DBL-LABEL: @getFEM(			// CHECK-CONTRACT-DBL-LABEL: define dso_local i32 @getFEM
				// CHECK-CONTRACT-DBL-SAME: () #[[ATTR0]] {
	// CHECK-CONTRACT-DBL-NEXT: entry:			// CHECK-CONTRACT-DBL-NEXT: entry:
	// CHECK-CONTRACT-DBL-NEXT: ret i32 1			// CHECK-CONTRACT-DBL-NEXT: ret i32 1
	//			//
	// CHECK-CONTRACT-EXT-LABEL: @getFEM(			// CHECK-CONTRACT-EXT-LABEL: define dso_local i32 @getFEM
				// CHECK-CONTRACT-EXT-SAME: () #[[ATTR0]] {
	// CHECK-CONTRACT-EXT-NEXT: entry:			// CHECK-CONTRACT-EXT-NEXT: entry:
	// CHECK-CONTRACT-EXT-NEXT: ret i32 2			// CHECK-CONTRACT-EXT-NEXT: ret i32 2
	//			//
	// CHECK-UNSAFE-LABEL: @getFEM(			// CHECK-UNSAFE-LABEL: define dso_local i32 @getFEM
				// CHECK-UNSAFE-SAME: () #[[ATTR0]] {
	// CHECK-UNSAFE-NEXT: entry:			// CHECK-UNSAFE-NEXT: entry:
	// CHECK-UNSAFE-NEXT: ret i32 0			// CHECK-UNSAFE-NEXT: ret i32 0
	//			//
	int getFEM() {			int getFEM() {
	return __FLT_EVAL_METHOD__;			return __FLT_EVAL_METHOD__;
	}			}

clang/test/CodeGen/X86/fma-intrinsics.c

This file was added.

				// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2
				// RUN: %clang_cc1 -triple x86_64-linux-gnu -target-cpu nehalem -emit-llvm -o - %s | FileCheck -check-prefixes=CHECK-NO-FMA %s
				// RUN: %clang_cc1 -triple x86_64-linux-gnu -target-cpu skylake -emit-llvm -o - %s | FileCheck -check-prefixes=CHECK-FMA %s

				// CHECK-NO-FMA-LABEL: define dso_local float @testFma
				// CHECK-NO-FMA-SAME: (float noundef [[ADD:%.*]], float noundef [[MUL:%.*]], float noundef [[NUM:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-NO-FMA-NEXT: entry:
				// CHECK-NO-FMA-NEXT: [[ADD_ADDR:%.*]] = alloca float, align 4
				// CHECK-NO-FMA-NEXT: [[MUL_ADDR:%.*]] = alloca float, align 4
				// CHECK-NO-FMA-NEXT: [[NUM_ADDR:%.*]] = alloca float, align 4
				// CHECK-NO-FMA-NEXT: store float [[ADD]], ptr [[ADD_ADDR]], align 4
				// CHECK-NO-FMA-NEXT: store float [[MUL]], ptr [[MUL_ADDR]], align 4
				// CHECK-NO-FMA-NEXT: store float [[NUM]], ptr [[NUM_ADDR]], align 4
				// CHECK-NO-FMA-NEXT: [[TMP0:%.*]] = load float, ptr [[ADD_ADDR]], align 4
				// CHECK-NO-FMA-NEXT: [[TMP1:%.*]] = load float, ptr [[NUM_ADDR]], align 4
				// CHECK-NO-FMA-NEXT: [[TMP2:%.*]] = load float, ptr [[MUL_ADDR]], align 4
				// CHECK-NO-FMA-NEXT: [[MUL1:%.*]] = fmul float [[TMP1]], [[TMP2]]
				// CHECK-NO-FMA-NEXT: [[ADD2:%.*]] = fadd float [[TMP0]], [[MUL1]]
				// CHECK-NO-FMA-NEXT: ret float [[ADD2]]
				//
				// CHECK-FMA-LABEL: define dso_local float @testFma
				// CHECK-FMA-SAME: (float noundef [[ADD:%.*]], float noundef [[MUL:%.*]], float noundef [[NUM:%.*]]) #[[ATTR0:[0-9]+]] {
				// CHECK-FMA-NEXT: entry:
				// CHECK-FMA-NEXT: [[ADD_ADDR:%.*]] = alloca float, align 4
				// CHECK-FMA-NEXT: [[MUL_ADDR:%.*]] = alloca float, align 4
				// CHECK-FMA-NEXT: [[NUM_ADDR:%.*]] = alloca float, align 4
				// CHECK-FMA-NEXT: store float [[ADD]], ptr [[ADD_ADDR]], align 4
				// CHECK-FMA-NEXT: store float [[MUL]], ptr [[MUL_ADDR]], align 4
				// CHECK-FMA-NEXT: store float [[NUM]], ptr [[NUM_ADDR]], align 4
				// CHECK-FMA-NEXT: [[TMP0:%.*]] = load float, ptr [[ADD_ADDR]], align 4
				// CHECK-FMA-NEXT: [[TMP1:%.*]] = load float, ptr [[NUM_ADDR]], align 4
				// CHECK-FMA-NEXT: [[TMP2:%.*]] = load float, ptr [[MUL_ADDR]], align 4
				// CHECK-FMA-NEXT: [[TMP3:%.*]] = call float @llvm.fmuladd.f32(float [[TMP1]], float [[TMP2]], float [[TMP0]])
				// CHECK-FMA-NEXT: ret float [[TMP3]]
				//
				float testFma(float add, float mul, float num) {
				return add + num*mul;
				}