This is an archive of the discontinued LLVM Phabricator instance.


Differential D129464

[Clang][CodeGen] Set FP options of builder at entry to compound statement
AbandonedPublic

Authored by sepavloff on Jul 10 2022, 11:49 PM.


Details

Reviewers

kpn
rjmccall
aaron.ballman

Summary

Previously, compiling a few tests produced incorrect code: FP pragmas were
ignored, and the resulting code contained constrained intrinsics whose
rounding mode and/or exception behavior were determined by command-line
options only. The compiler creates a correct AST for these tests, but the FP
options were ignored in the code generator because the builder object was not
set up properly. To fix code generation, the builder object is now updated
according to the FP options stored in CompoundStmt.
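
The idea of updating the builder on entry to a compound statement and restoring it on exit can be illustrated with a small RAII sketch. This is a toy model, not Clang code: `ToyBuilder`, `ExceptionMode`, `ScopedFPOptions`, and `emitBlock` are hypothetical names invented for illustration, and the real patch relies on `CGFPOptionsRAII` instead.

```cpp
#include <cassert>

// Hypothetical stand-in for the exception-behavior part of the FP state.
enum class ExceptionMode { Ignore, MayTrap, Strict };

// Hypothetical stand-in for the IR builder; it only tracks one FP setting.
struct ToyBuilder {
  ExceptionMode Except = ExceptionMode::Ignore;
};

// Installs new options on construction and restores the previous ones
// when the guard goes out of scope.
class ScopedFPOptions {
  ToyBuilder &Builder;
  ExceptionMode Saved;

public:
  ScopedFPOptions(ToyBuilder &B, ExceptionMode New)
      : Builder(B), Saved(B.Except) {
    Builder.Except = New;
  }
  ~ScopedFPOptions() { Builder.Except = Saved; }
  ScopedFPOptions(const ScopedFPOptions &) = delete;
  ScopedFPOptions &operator=(const ScopedFPOptions &) = delete;
};

// Simulates emitting a compound statement whose AST node carries a
// pragma-derived mode. Returns the mode that is active inside the block.
ExceptionMode emitBlock(ToyBuilder &B, ExceptionMode BlockMode) {
  ScopedFPOptions Guard(B, BlockMode);
  return B.Except;
}
```

With a builder initialized from a command-line default such as `MayTrap`, a block that carries `Strict` sees `Strict` while it is being emitted, and the builder returns to `MayTrap` afterwards. This mirrors the `fpexcept.maytrap` to `fpexcept.strict` change in the updated tests.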

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

sepavloff created this revision.Jul 10 2022, 11:49 PM

Herald added a project: Restricted Project. · View Herald TranscriptJul 10 2022, 11:49 PM

Herald added subscribers: jsji, pengfei. · View Herald Transcript

sepavloff requested review of this revision.Jul 10 2022, 11:49 PM

Herald added a project: Restricted Project. · View Herald TranscriptJul 10 2022, 11:49 PM

Harbormaster completed remote builds in B174610: Diff 443564.Jul 11 2022, 12:37 AM

Thank you for looking into this! I happened to run into this same issue with #pragma float_control not behaving the way I'd expect. (FWIW, we also ran into an interesting issue where the floating point options were pushed but never popped in the TU, where delayed template instantiation behaved differently than typical template instantiation.)

clang/test/CodeGen/pragma-fenv_access.cpp
Line 36: There are some extra test cases I'd like to see coverage for because there are some interesting edge cases to consider.

template <typename Ty>
float func1(Ty) {
  float f1 = 1.0f, f2 = 3.0f;
  return f1 + f2 * 2.0f;
}

#pragma float_control(precise, on, push)
template float func1<int>(int);
#pragma float_control(pop)

#pragma float_control(precise, on, push)
template <typename Ty>
float func2(Ty) {
  float f1 = 1.0f, f2 = 3.0f;
  return f1 + f2 * 2.0f;
}
#pragma float_control(pop)

template float func2<int>(int);

void bar() {
  func1(1.1);
  func2(1.1);
}

This gets especially interesting when you think about delayed template instantiation as happens by default on Windows targets. Consider this code with the driver level `-ffast-math` flag enabled (not the cc1 option, which is different). I think that `func1<int>` SHOULD be precise, because the explicit instantiation is, while `func1<double>` SHOULD NOT be precise, because the definition is not. `func2<int>` SHOULD NOT be precise, because the explicit instantiation is not, while `func2<double>` SHOULD be precise, because the definition is.

Partial specializations are a similar situation where the primary template and its related code may have different options. WDYT?

sepavloff added a child revision: D125625: Implementation of '#pragma STDC FENV_ROUND'.Jul 11 2022, 5:20 AM

sepavloff added inline comments.Jul 15 2022, 1:04 AM

clang/test/CodeGen/pragma-fenv_access.cpp
Line 36: Standard FP pragmas are defined only in the C standard, so their interaction with C++-specific features is actually implementation-defined. The cases presented in your example are reasonable solutions, with one exception: IMO `func2<int>` should be precise, because its template is precise. It is equivalent to:

template <typename Ty>
float func2(Ty) {
#pragma float_control(precise, on)
  float f1 = 1.0f, f2 = 3.0f;
  return f1 + f2 * 2.0f;
}

so instantiating it would produce a function with precise operations.

Implementing a correct mechanism for this interaction requires substantial effort and should be done in a separate patch, I think. In particular, we need to invent a way to associate a point of instantiation with the FPOptions at that point, so that delayed instantiation can be done with the correct set of options.

In this patch, the change in SemaTemplateInstantiateDecl.cpp prevents a compiler crash. Without it, codegen tries to create a call to a constrained intrinsic in a function that does not have the StrictFP attribute, because the FEnvAccess flag is set at the end of the translation unit in `pragma-fenv_access.cpp`.

This property adheres to a function definition, so it seems to me that an explicit *instantiation* ought to preserve it from the instantiated template definition, but an explicit *specialization* ought to be independent.

i.e.

#pragma float_control(precise, on, push)
template <typename Ty>
float func2(Ty) {
  float f1 = 1.0f, f2 = 3.0f;
  return f1 + f2 * 2.0f;
}
#pragma float_control(pop)

template float func2<int>(int); // precise
template <> float func2<long>(long) { ... } // not precise

sepavloff added a child revision: D131143: [Clang] Interaction of FP pragmas and function template instantiation.Aug 3 2022, 11:31 PM

sepavloff edited the summary of this revision.

Interaction of pragmas and templates is implemented in https://reviews.llvm.org/D131143.

sepavloff mentioned this in D131143: [Clang] Interaction of FP pragmas and function template instantiation.Aug 5 2022, 9:37 AM

Since the specific interaction of FP pragmas and template instantiation was not supported, this patch is still relevant without additional changes.

Rebase and ping

Harbormaster completed remote builds in B182837: Diff 454822.Aug 23 2022, 7:33 AM

After the implementation of D142001 and D143241, this patch is not needed anymore.

sepavloff abandoned this revision.Jul 15 2023, 9:31 PM

Revision Contents

Path                                                              Size
clang/include/clang/AST/Stmt.h                                    9 lines
clang/lib/CodeGen/CGStmt.cpp                                      2 lines
clang/lib/Sema/SemaTemplateInstantiateDecl.cpp                    3 lines
clang/test/CodeGen/X86/avx512dq-builtins-constrained.c            13 lines
clang/test/CodeGen/aarch64-v8.2a-neon-intrinsics-constrained.c    45 lines
clang/test/CodeGen/complex-strictfp.c                             6 lines
clang/test/CodeGen/pragma-fenv_access.cpp                         42 lines

Diff 454822

clang/include/clang/AST/Stmt.h

[... 1,452 lines not shown; enclosing context: public: ...]
   bool hasStoredFPFeatures() const { return CompoundStmtBits.HasFPFeatures; }

   /// Get FPOptionsOverride from trailing storage.
   FPOptionsOverride getStoredFPFeatures() const {
     assert(hasStoredFPFeatures());
     return *getTrailingObjects<FPOptionsOverride>();
   }

+  /// Get FPOptions inside this statement. They may differ from outer options
+  /// due to pragmas.
+  /// \param CurFPOptions FPOptions outside this statement.
+  FPOptions getNewFPOptions(FPOptions CurFPOptions) const {
+    return hasStoredFPFeatures()
+               ? getStoredFPFeatures().applyOverrides(CurFPOptions)
+               : CurFPOptions;
+  }
+
   using body_iterator = Stmt **;
   using body_range = llvm::iterator_range<body_iterator>;

   body_range body() { return body_range(body_begin(), body_end()); }
   body_iterator body_begin() { return getTrailingObjects<Stmt *>(); }
   body_iterator body_end() { return body_begin() + size(); }
   Stmt *body_front() { return !body_empty() ? body_begin()[0] : nullptr; }

[... 2,279 lines not shown ...]
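
The `getNewFPOptions` helper above combines the outer options with the overrides stored in the statement. As a rough mental model only, an override can be thought of as a set of values plus a mask saying which options were explicitly changed by a pragma. The sketch below assumes that model; `ToyFPOptions`, `ToyFPOptionsOverride`, and the two bit constants are hypothetical and do not reflect Clang's actual data layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical one-bit-per-option encoding, for illustration only.
constexpr std::uint32_t AllowReassoc = 1u << 0;
constexpr std::uint32_t StrictExcept = 1u << 1;

// Toy stand-in for the full set of FP options in effect.
struct ToyFPOptions {
  std::uint32_t Bits = 0;
};

// Toy stand-in for a pragma-induced override: only the bits selected by
// Mask are taken from Values; all other bits come from the outer options.
struct ToyFPOptionsOverride {
  std::uint32_t Values = 0;
  std::uint32_t Mask = 0;

  ToyFPOptions applyOverrides(ToyFPOptions Base) const {
    ToyFPOptions Result;
    Result.Bits = (Base.Bits & ~Mask) | (Values & Mask);
    return Result;
  }
};
```

Under this model, an empty override (mask of zero) leaves the outer options unchanged, which corresponds to the `: CurFPOptions` branch taken when the statement stores no FP features.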

clang/lib/CodeGen/CGStmt.cpp

[... 485 lines not shown ...]
 CodeGenFunction::EmitCompoundStmtWithoutScope(const CompoundStmt &S,
                                               bool GetLast,
                                               AggValueSlot AggSlot) {

   const Stmt *ExprResult = S.getStmtExprResult();
   assert((!GetLast || (GetLast && ExprResult)) &&
          "If GetLast is true then the CompoundStmt must have a StmtExprResult");

+  CGFPOptionsRAII SavedFPFeatues(*this, S.getNewFPOptions(CurFPFeatures));
+
   Address RetAlloca = Address::invalid();

   for (auto *CurStmt : S.body()) {
     if (GetLast && ExprResult == CurStmt) {
       // We have to special case labels here. They are statements, but when put
       // at the end of a statement expression, they yield the value of their
       // subexpression. Handle this by walking through all labels we encounter,
       // emitting them before we evaluate the subexpr.
[... 2,393 lines not shown ...]

clang/lib/Sema/SemaTemplateInstantiateDecl.cpp

[... 5,045 lines not shown; enclosing context: if (PatternDecl->isDefaulted()) { ...]
   SubstQualifier(*this, PatternDecl, Function, TemplateArgs);

   ActOnStartOfFunctionDef(nullptr, Function);

   // Enter the scope of this instantiation. We don't use
   // PushDeclContext because we don't have a scope.
   Sema::ContextRAII savedContext(*this, Function);

+  FPFeaturesStateRAII SavedFPFeatures(*this);
+  CurFPFeatures = FPOptions(getLangOpts());
+
   if (addInstantiatedParametersToScope(Function, PatternDecl, Scope,
                                        TemplateArgs))
     return;

   StmtResult Body;
   if (PatternDecl->hasSkippedBody()) {
     ActOnSkippedFunctionBody(Function);
     Body = nullptr;
[... 1,342 lines not shown ...]

clang/test/CodeGen/X86/avx512dq-builtins-constrained.c

 // REQUIRES: x86-registered-target
 // RUN: %clang_cc1 -flax-vector-conversions=none -ffreestanding %s -triple=x86_64-apple-darwin -target-feature +avx512dq -emit-llvm -o - -Wall -Werror | FileCheck %s --check-prefix=UNCONSTRAINED --check-prefix=COMMON --check-prefix=COMMONIR
 // RUN: %clang_cc1 -flax-vector-conversions=none -ffreestanding %s -triple=x86_64-apple-darwin -target-feature +avx512dq -ffp-exception-behavior=maytrap -DSTRICT=1 -emit-llvm -o - -Wall -Werror | FileCheck %s --check-prefix=CONSTRAINED --check-prefix=COMMON --check-prefix=COMMONIR
 // RUN: %clang_cc1 -flax-vector-conversions=none -ffreestanding %s -triple=x86_64-apple-darwin -target-feature +avx512dq -S -o - -Wall -Werror | FileCheck %s --check-prefix=CHECK-ASM --check-prefix=COMMON
 // RUN: %clang_cc1 -flax-vector-conversions=none -ffreestanding %s -triple=x86_64-apple-darwin -target-feature +avx512dq -ffp-exception-behavior=maytrap -DSTRICT=1 -S -o - -Wall -Werror | FileCheck %s --check-prefix=CHECK-ASM --check-prefix=COMMON

-// FIXME: Every instance of "fpexcept.maytrap" is wrong.
 #ifdef STRICT
 // Test that the constrained intrinsics are picking up the exception
 // metadata from the AST instead of the global default from the command line.

 #pragma float_control(except, on)
 #endif


 #include <immintrin.h>

 __m512d test_mm512_cvtepi64_pd(__m512i __A) {
   // COMMON-LABEL: test_mm512_cvtepi64_pd
   // UNCONSTRAINED: sitofp <8 x i64> %{{.*}} to <8 x double>
-  // CONSTRAINED: call <8 x double> @llvm.experimental.constrained.sitofp.v8f64.v8i64(<8 x i64> %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+  // CONSTRAINED: call <8 x double> @llvm.experimental.constrained.sitofp.v8f64.v8i64(<8 x i64> %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
   // CHECK-ASM: vcvtqq2pd
   return _mm512_cvtepi64_pd(__A);
 }

 __m512d test_mm512_mask_cvtepi64_pd(__m512d __W, __mmask8 __U, __m512i __A) {
   // COMMON-LABEL: test_mm512_mask_cvtepi64_pd
   // UNCONSTRAINED: sitofp <8 x i64> %{{.*}} to <8 x double>
-  // CONSTRAINED: call <8 x double> @llvm.experimental.constrained.sitofp.v8f64.v8i64(<8 x i64> %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+  // CONSTRAINED: call <8 x double> @llvm.experimental.constrained.sitofp.v8f64.v8i64(<8 x i64> %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
   // COMMONIR: select <8 x i1> %{{.*}}, <8 x double> %{{.*}}, <8 x double> %{{.*}}
   // CHECK-ASM: vcvtqq2pd
   return _mm512_mask_cvtepi64_pd(__W, __U, __A);
 }

 __m512d test_mm512_maskz_cvtepi64_pd(__mmask8 __U, __m512i __A) {
   // COMMON-LABEL: test_mm512_maskz_cvtepi64_pd
   // UNCONSTRAINED: sitofp <8 x i64> %{{.*}} to <8 x double>
-  // CONSTRAINED: call <8 x double> @llvm.experimental.constrained.sitofp.v8f64.v8i64(<8 x i64> %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+  // CONSTRAINED: call <8 x double> @llvm.experimental.constrained.sitofp.v8f64.v8i64(<8 x i64> %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
   // COMMONIR: select <8 x i1> %{{.*}}, <8 x double> %{{.*}}, <8 x double> %{{.*}}
   // CHECK-ASM: vcvtqq2pd
   return _mm512_maskz_cvtepi64_pd(__U, __A);
 }

 __m512d test_mm512_cvt_roundepi64_pd(__m512i __A) {
   // COMMON-LABEL: test_mm512_cvt_roundepi64_pd
   // COMMONIR: @llvm.x86.avx512.sitofp.round.v8f64.v8i64
[... 96 lines not shown; enclosing context: __m256 test_mm512_maskz_cvt_roundepi64_ps(__mmask8 __U, __m512i __A) { ...]
   // COMMONIR: select <8 x i1> %{{.*}}, <8 x float> %{{.*}}, <8 x float> %{{.*}}
   // CHECK-ASM: vcvtqq2ps
   return _mm512_maskz_cvt_roundepi64_ps(__U, __A, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
 }

 __m512d test_mm512_cvtepu64_pd(__m512i __A) {
   // COMMON-LABEL: test_mm512_cvtepu64_pd
   // UNCONSTRAINED: uitofp <8 x i64> %{{.*}} to <8 x double>
-  // CONSTRAINED: call <8 x double> @llvm.experimental.constrained.uitofp.v8f64.v8i64(<8 x i64> %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+  // CONSTRAINED: call <8 x double> @llvm.experimental.constrained.uitofp.v8f64.v8i64(<8 x i64> %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
   // CHECK-ASM: vcvtuqq2pd
   return _mm512_cvtepu64_pd(__A);
 }

 __m512d test_mm512_mask_cvtepu64_pd(__m512d __W, __mmask8 __U, __m512i __A) {
   // COMMON-LABEL: test_mm512_mask_cvtepu64_pd
   // UNCONSTRAINED: uitofp <8 x i64> %{{.*}} to <8 x double>
-  // CONSTRAINED: call <8 x double> @llvm.experimental.constrained.uitofp.v8f64.v8i64(<8 x i64> %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+  // CONSTRAINED: call <8 x double> @llvm.experimental.constrained.uitofp.v8f64.v8i64(<8 x i64> %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
   // COMMONIR: select <8 x i1> %{{.*}}, <8 x double> %{{.*}}, <8 x double> %{{.*}}
   // CHECK-ASM: vcvtuqq2pd
   return _mm512_mask_cvtepu64_pd(__W, __U, __A);
 }

 __m512d test_mm512_maskz_cvtepu64_pd(__mmask8 __U, __m512i __A) {
   // COMMON-LABEL: test_mm512_maskz_cvtepu64_pd
   // UNCONSTRAINED: uitofp <8 x i64> %{{.*}} to <8 x double>
-  // CONSTRAINED: call <8 x double> @llvm.experimental.constrained.uitofp.v8f64.v8i64(<8 x i64> %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+  // CONSTRAINED: call <8 x double> @llvm.experimental.constrained.uitofp.v8f64.v8i64(<8 x i64> %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
   // COMMONIR: select <8 x i1> %{{.*}}, <8 x double> %{{.*}}, <8 x double> %{{.*}}
   // CHECK-ASM: vcvtuqq2pd
   return _mm512_maskz_cvtepu64_pd(__U, __A);
 }

 __m512d test_mm512_cvt_roundepu64_pd(__m512i __A) {
   // COMMON-LABEL: test_mm512_cvt_roundepu64_pd
   // COMMONIR: @llvm.x86.avx512.uitofp.round.v8f64.v8i64
[... 93 lines not shown ...]

clang/test/CodeGen/aarch64-v8.2a-neon-intrinsics-constrained.c

[... 17 lines not shown ...]
 // RUN: -fallow-half-arguments-and-returns -flax-vector-conversions=none -S -disable-O0-optnone -emit-llvm -o - %s \
 // RUN: | opt -S -mem2reg | llc -o=- - \
 // RUN: | FileCheck --check-prefix=COMMON --check-prefix=CHECK-ASM %s

 // REQUIRES: aarch64-registered-target

 // Test that the constrained intrinsics are picking up the exception
 // metadata from the AST instead of the global default from the command line.
-// FIXME: All cases of "fpexcept.maytrap" in this test are wrong.

 #if EXCEPT
 #pragma float_control(except, on)
 #endif

 #include <arm_neon.h>

 // COMMON-LABEL: test_vsqrt_f16
 // UNCONSTRAINED: [[SQR:%.*]] = call <4 x half> @llvm.sqrt.v4f16(<4 x half> %a)
-// CONSTRAINED: [[SQR:%.*]] = call <4 x half> @llvm.experimental.constrained.sqrt.v4f16(<4 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+// CONSTRAINED: [[SQR:%.*]] = call <4 x half> @llvm.experimental.constrained.sqrt.v4f16(<4 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
 // CHECK-ASM: fsqrt v{{[0-9]+}}.4h, v{{[0-9]+}}.4h
 // COMMONIR: ret <4 x half> [[SQR]]
 float16x4_t test_vsqrt_f16(float16x4_t a) {
   return vsqrt_f16(a);
 }

 // COMMON-LABEL: test_vsqrtq_f16
 // UNCONSTRAINED: [[SQR:%.*]] = call <8 x half> @llvm.sqrt.v8f16(<8 x half> %a)
-// CONSTRAINED: [[SQR:%.*]] = call <8 x half> @llvm.experimental.constrained.sqrt.v8f16(<8 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+// CONSTRAINED: [[SQR:%.*]] = call <8 x half> @llvm.experimental.constrained.sqrt.v8f16(<8 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
 // CHECK-ASM: fsqrt v{{[0-9]+}}.8h, v{{[0-9]+}}.8h
 // COMMONIR: ret <8 x half> [[SQR]]
 float16x8_t test_vsqrtq_f16(float16x8_t a) {
   return vsqrtq_f16(a);
 }

 // COMMON-LABEL: test_vfma_f16
 // UNCONSTRAINED: [[ADD:%.*]] = call <4 x half> @llvm.fma.v4f16(<4 x half> %b, <4 x half> %c, <4 x half> %a)
-// CONSTRAINED: [[ADD:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> %b, <4 x half> %c, <4 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+// CONSTRAINED: [[ADD:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> %b, <4 x half> %c, <4 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
 // CHECK-ASM: fmla v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.4h
 // COMMONIR: ret <4 x half> [[ADD]]
 float16x4_t test_vfma_f16(float16x4_t a, float16x4_t b, float16x4_t c) {
   return vfma_f16(a, b, c);
 }

 // COMMON-LABEL: test_vfmaq_f16
 // UNCONSTRAINED: [[ADD:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> %b, <8 x half> %c, <8 x half> %a)
-// CONSTRAINED: [[ADD:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> %b, <8 x half> %c, <8 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+// CONSTRAINED: [[ADD:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> %b, <8 x half> %c, <8 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
 // CHECK-ASM: fmla v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.8h
 // COMMONIR: ret <8 x half> [[ADD]]
 float16x8_t test_vfmaq_f16(float16x8_t a, float16x8_t b, float16x8_t c) {
   return vfmaq_f16(a, b, c);
 }

 // COMMON-LABEL: test_vfms_f16
 // COMMONIR: [[SUB:%.*]] = fneg <4 x half> %b
 // UNCONSTRAINED: [[ADD:%.*]] = call <4 x half> @llvm.fma.v4f16(<4 x half> [[SUB]], <4 x half> %c, <4 x half> %a)
-// CONSTRAINED: [[ADD:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> [[SUB]], <4 x half> %c, <4 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+// CONSTRAINED: [[ADD:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> [[SUB]], <4 x half> %c, <4 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
 // CHECK-ASM: fmls v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.4h
 // COMMONIR: ret <4 x half> [[ADD]]
 float16x4_t test_vfms_f16(float16x4_t a, float16x4_t b, float16x4_t c) {
   return vfms_f16(a, b, c);
 }

 // COMMON-LABEL: test_vfmsq_f16
 // COMMONIR: [[SUB:%.*]] = fneg <8 x half> %b
 // UNCONSTRAINED: [[ADD:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> [[SUB]], <8 x half> %c, <8 x half> %a)
-// CONSTRAINED: [[ADD:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> [[SUB]], <8 x half> %c, <8 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+// CONSTRAINED: [[ADD:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> [[SUB]], <8 x half> %c, <8 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
 // CHECK-ASM: fmls v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.8h
 // COMMONIR: ret <8 x half> [[ADD]]
 float16x8_t test_vfmsq_f16(float16x8_t a, float16x8_t b, float16x8_t c) {
   return vfmsq_f16(a, b, c);
 }

 // COMMON-LABEL: test_vfma_lane_f16
 // COMMONIR: [[TMP0:%.*]] = bitcast <4 x half> %a to <8 x i8>
 // COMMONIR: [[TMP1:%.*]] = bitcast <4 x half> %b to <8 x i8>
 // COMMONIR: [[TMP2:%.*]] = bitcast <4 x half> %c to <8 x i8>
 // COMMONIR: [[TMP3:%.*]] = bitcast <8 x i8> [[TMP2]] to <4 x half>
 // COMMONIR: [[LANE:%.*]] = shufflevector <4 x half> [[TMP3]], <4 x half> [[TMP3]], <4 x i32> <i32 3, i32 3, i32 3, i32 3>
 // COMMONIR: [[TMP4:%.*]] = bitcast <8 x i8> [[TMP1]] to <4 x half>
 // COMMONIR: [[TMP5:%.*]] = bitcast <8 x i8> [[TMP0]] to <4 x half>
 // UNCONSTRAINED: [[FMLA:%.*]] = call <4 x half> @llvm.fma.v4f16(<4 x half> [[TMP4]], <4 x half> [[LANE]], <4 x half> [[TMP5]])
-// CONSTRAINED: [[FMLA:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> [[TMP4]], <4 x half> [[LANE]], <4 x half> [[TMP5]], metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+// CONSTRAINED: [[FMLA:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> [[TMP4]], <4 x half> [[LANE]], <4 x half> [[TMP5]], metadata !"round.tonearest", metadata !"fpexcept.strict")
 // CHECK-ASM: fmla v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.h[{{[0-9]+}}]
 // COMMONIR: ret <4 x half> [[FMLA]]
 float16x4_t test_vfma_lane_f16(float16x4_t a, float16x4_t b, float16x4_t c) {
   return vfma_lane_f16(a, b, c, 3);
 }

 // COMMON-LABEL: test_vfmaq_lane_f16
 // COMMONIR: [[TMP0:%.*]] = bitcast <8 x half> %a to <16 x i8>
 // COMMONIR: [[TMP1:%.*]] = bitcast <8 x half> %b to <16 x i8>
 // COMMONIR: [[TMP2:%.*]] = bitcast <4 x half> %c to <8 x i8>
 // COMMONIR: [[TMP3:%.*]] = bitcast <8 x i8> [[TMP2]] to <4 x half>
 // COMMONIR: [[LANE:%.*]] = shufflevector <4 x half> [[TMP3]], <4 x half> [[TMP3]], <8 x i32> <i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3>
 // COMMONIR: [[TMP4:%.*]] = bitcast <16 x i8> [[TMP1]] to <8 x half>
 // COMMONIR: [[TMP5:%.*]] = bitcast <16 x i8> [[TMP0]] to <8 x half>
 // UNCONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> [[TMP4]], <8 x half> [[LANE]], <8 x half> [[TMP5]])
-// CONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> [[TMP4]], <8 x half> [[LANE]], <8 x half> [[TMP5]], metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+// CONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> [[TMP4]], <8 x half> [[LANE]], <8 x half> [[TMP5]], metadata !"round.tonearest", metadata !"fpexcept.strict")
 // CHECK-ASM: fmla v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.h[{{[0-9]+}}]
 // COMMONIR: ret <8 x half> [[FMLA]]
 float16x8_t test_vfmaq_lane_f16(float16x8_t a, float16x8_t b, float16x4_t c) {
   return vfmaq_lane_f16(a, b, c, 3);
 }

 // COMMON-LABEL: test_vfma_laneq_f16
 // COMMONIR: [[TMP0:%.*]] = bitcast <4 x half> %a to <8 x i8>
 // COMMONIR: [[TMP1:%.*]] = bitcast <4 x half> %b to <8 x i8>
 // COMMONIR: [[TMP2:%.*]] = bitcast <8 x half> %c to <16 x i8>
 // COMMONIR: [[TMP3:%.*]] = bitcast <8 x i8> [[TMP0]] to <4 x half>
 // COMMONIR: [[TMP4:%.*]] = bitcast <8 x i8> [[TMP1]] to <4 x half>
 // COMMONIR: [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to <8 x half>
 // COMMONIR: [[LANE:%.*]] = shufflevector <8 x half> [[TMP5]], <8 x half> [[TMP5]], <4 x i32> <i32 7, i32 7, i32 7, i32 7>
 // UNCONSTRAINED: [[FMLA:%.*]] = call <4 x half> @llvm.fma.v4f16(<4 x half> [[LANE]], <4 x half> [[TMP4]], <4 x half> [[TMP3]])
-// CONSTRAINED: [[FMLA:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> [[LANE]], <4 x half> [[TMP4]], <4 x half> [[TMP3]], metadata !"round.tonearest", metadata !"fpexcept.maytrap")
+// CONSTRAINED: [[FMLA:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> [[LANE]], <4 x half> [[TMP4]], <4 x half> [[TMP3]], metadata !"round.tonearest", metadata !"fpexcept.strict")
 // CHECK-ASM: fmla v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret <4 x half> [[FMLA]]			// COMMONIR: ret <4 x half> [[FMLA]]
	float16x4_t test_vfma_laneq_f16(float16x4_t a, float16x4_t b, float16x8_t c) {			float16x4_t test_vfma_laneq_f16(float16x4_t a, float16x4_t b, float16x8_t c) {
	return vfma_laneq_f16(a, b, c, 7);			return vfma_laneq_f16(a, b, c, 7);
	}			}

	// COMMON-LABEL: test_vfmaq_laneq_f16			// COMMON-LABEL: test_vfmaq_laneq_f16
	// COMMONIR: [[TMP0:%.*]] = bitcast <8 x half> %a to <16 x i8>			// COMMONIR: [[TMP0:%.*]] = bitcast <8 x half> %a to <16 x i8>
	// COMMONIR: [[TMP1:%.*]] = bitcast <8 x half> %b to <16 x i8>			// COMMONIR: [[TMP1:%.*]] = bitcast <8 x half> %b to <16 x i8>
	// COMMONIR: [[TMP2:%.*]] = bitcast <8 x half> %c to <16 x i8>			// COMMONIR: [[TMP2:%.*]] = bitcast <8 x half> %c to <16 x i8>
	// COMMONIR: [[TMP3:%.*]] = bitcast <16 x i8> [[TMP0]] to <8 x half>			// COMMONIR: [[TMP3:%.*]] = bitcast <16 x i8> [[TMP0]] to <8 x half>
	// COMMONIR: [[TMP4:%.*]] = bitcast <16 x i8> [[TMP1]] to <8 x half>			// COMMONIR: [[TMP4:%.*]] = bitcast <16 x i8> [[TMP1]] to <8 x half>
	// COMMONIR: [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to <8 x half>			// COMMONIR: [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to <8 x half>
	// COMMONIR: [[LANE:%.*]] = shufflevector <8 x half> [[TMP5]], <8 x half> [[TMP5]], <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>			// COMMONIR: [[LANE:%.*]] = shufflevector <8 x half> [[TMP5]], <8 x half> [[TMP5]], <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>
	// UNCONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> [[LANE]], <8 x half> [[TMP4]], <8 x half> [[TMP3]])			// UNCONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> [[LANE]], <8 x half> [[TMP4]], <8 x half> [[TMP3]])
	// CONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> [[LANE]], <8 x half> [[TMP4]], <8 x half> [[TMP3]], metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> [[LANE]], <8 x half> [[TMP4]], <8 x half> [[TMP3]], metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmla v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmla v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret <8 x half> [[FMLA]]			// COMMONIR: ret <8 x half> [[FMLA]]
	float16x8_t test_vfmaq_laneq_f16(float16x8_t a, float16x8_t b, float16x8_t c) {			float16x8_t test_vfmaq_laneq_f16(float16x8_t a, float16x8_t b, float16x8_t c) {
	return vfmaq_laneq_f16(a, b, c, 7);			return vfmaq_laneq_f16(a, b, c, 7);
	}			}

	// COMMON-LABEL: test_vfma_n_f16			// COMMON-LABEL: test_vfma_n_f16
	// COMMONIR: [[TMP0:%.*]] = insertelement <4 x half> undef, half %c, i32 0			// COMMONIR: [[TMP0:%.*]] = insertelement <4 x half> undef, half %c, i32 0
	// COMMONIR: [[TMP1:%.*]] = insertelement <4 x half> [[TMP0]], half %c, i32 1			// COMMONIR: [[TMP1:%.*]] = insertelement <4 x half> [[TMP0]], half %c, i32 1
	// COMMONIR: [[TMP2:%.*]] = insertelement <4 x half> [[TMP1]], half %c, i32 2			// COMMONIR: [[TMP2:%.*]] = insertelement <4 x half> [[TMP1]], half %c, i32 2
	// COMMONIR: [[TMP3:%.*]] = insertelement <4 x half> [[TMP2]], half %c, i32 3			// COMMONIR: [[TMP3:%.*]] = insertelement <4 x half> [[TMP2]], half %c, i32 3
	// UNCONSTRAINED: [[FMA:%.*]] = call <4 x half> @llvm.fma.v4f16(<4 x half> %b, <4 x half> [[TMP3]], <4 x half> %a)			// UNCONSTRAINED: [[FMA:%.*]] = call <4 x half> @llvm.fma.v4f16(<4 x half> %b, <4 x half> [[TMP3]], <4 x half> %a)
	// CONSTRAINED: [[FMA:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> %b, <4 x half> [[TMP3]], <4 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMA:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> %b, <4 x half> [[TMP3]], <4 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmla v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmla v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret <4 x half> [[FMA]]			// COMMONIR: ret <4 x half> [[FMA]]
	float16x4_t test_vfma_n_f16(float16x4_t a, float16x4_t b, float16_t c) {			float16x4_t test_vfma_n_f16(float16x4_t a, float16x4_t b, float16_t c) {
	return vfma_n_f16(a, b, c);			return vfma_n_f16(a, b, c);
	}			}

	// COMMON-LABEL: test_vfmaq_n_f16			// COMMON-LABEL: test_vfmaq_n_f16
	// COMMONIR: [[TMP0:%.*]] = insertelement <8 x half> undef, half %c, i32 0			// COMMONIR: [[TMP0:%.*]] = insertelement <8 x half> undef, half %c, i32 0
	// COMMONIR: [[TMP1:%.*]] = insertelement <8 x half> [[TMP0]], half %c, i32 1			// COMMONIR: [[TMP1:%.*]] = insertelement <8 x half> [[TMP0]], half %c, i32 1
	// COMMONIR: [[TMP2:%.*]] = insertelement <8 x half> [[TMP1]], half %c, i32 2			// COMMONIR: [[TMP2:%.*]] = insertelement <8 x half> [[TMP1]], half %c, i32 2
	// COMMONIR: [[TMP3:%.*]] = insertelement <8 x half> [[TMP2]], half %c, i32 3			// COMMONIR: [[TMP3:%.*]] = insertelement <8 x half> [[TMP2]], half %c, i32 3
	// COMMONIR: [[TMP4:%.*]] = insertelement <8 x half> [[TMP3]], half %c, i32 4			// COMMONIR: [[TMP4:%.*]] = insertelement <8 x half> [[TMP3]], half %c, i32 4
	// COMMONIR: [[TMP5:%.*]] = insertelement <8 x half> [[TMP4]], half %c, i32 5			// COMMONIR: [[TMP5:%.*]] = insertelement <8 x half> [[TMP4]], half %c, i32 5
	// COMMONIR: [[TMP6:%.*]] = insertelement <8 x half> [[TMP5]], half %c, i32 6			// COMMONIR: [[TMP6:%.*]] = insertelement <8 x half> [[TMP5]], half %c, i32 6
	// COMMONIR: [[TMP7:%.*]] = insertelement <8 x half> [[TMP6]], half %c, i32 7			// COMMONIR: [[TMP7:%.*]] = insertelement <8 x half> [[TMP6]], half %c, i32 7
	// UNCONSTRAINED: [[FMA:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> %b, <8 x half> [[TMP7]], <8 x half> %a)			// UNCONSTRAINED: [[FMA:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> %b, <8 x half> [[TMP7]], <8 x half> %a)
	// CONSTRAINED: [[FMA:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> %b, <8 x half> [[TMP7]], <8 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMA:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> %b, <8 x half> [[TMP7]], <8 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmla v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmla v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret <8 x half> [[FMA]]			// COMMONIR: ret <8 x half> [[FMA]]
	float16x8_t test_vfmaq_n_f16(float16x8_t a, float16x8_t b, float16_t c) {			float16x8_t test_vfmaq_n_f16(float16x8_t a, float16x8_t b, float16_t c) {
	return vfmaq_n_f16(a, b, c);			return vfmaq_n_f16(a, b, c);
	}			}

	// COMMON-LABEL: test_vfmah_lane_f16			// COMMON-LABEL: test_vfmah_lane_f16
	// COMMONIR: [[EXTR:%.*]] = extractelement <4 x half> %c, i32 3			// COMMONIR: [[EXTR:%.*]] = extractelement <4 x half> %c, i32 3
	// UNCONSTRAINED: [[FMA:%.*]] = call half @llvm.fma.f16(half %b, half [[EXTR]], half %a)			// UNCONSTRAINED: [[FMA:%.*]] = call half @llvm.fma.f16(half %b, half [[EXTR]], half %a)
	// CONSTRAINED: [[FMA:%.*]] = call half @llvm.experimental.constrained.fma.f16(half %b, half [[EXTR]], half %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMA:%.*]] = call half @llvm.experimental.constrained.fma.f16(half %b, half [[EXTR]], half %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmla h{{[0-9]+}}, h{{[0-9]+}}, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmla h{{[0-9]+}}, h{{[0-9]+}}, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret half [[FMA]]			// COMMONIR: ret half [[FMA]]
	float16_t test_vfmah_lane_f16(float16_t a, float16_t b, float16x4_t c) {			float16_t test_vfmah_lane_f16(float16_t a, float16_t b, float16x4_t c) {
	return vfmah_lane_f16(a, b, c, 3);			return vfmah_lane_f16(a, b, c, 3);
	}			}

	// COMMON-LABEL: test_vfmah_laneq_f16			// COMMON-LABEL: test_vfmah_laneq_f16
	// COMMONIR: [[EXTR:%.*]] = extractelement <8 x half> %c, i32 7			// COMMONIR: [[EXTR:%.*]] = extractelement <8 x half> %c, i32 7
	// UNCONSTRAINED: [[FMA:%.*]] = call half @llvm.fma.f16(half %b, half [[EXTR]], half %a)			// UNCONSTRAINED: [[FMA:%.*]] = call half @llvm.fma.f16(half %b, half [[EXTR]], half %a)
	// CONSTRAINED: [[FMA:%.*]] = call half @llvm.experimental.constrained.fma.f16(half %b, half [[EXTR]], half %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMA:%.*]] = call half @llvm.experimental.constrained.fma.f16(half %b, half [[EXTR]], half %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmla h{{[0-9]+}}, h{{[0-9]+}}, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmla h{{[0-9]+}}, h{{[0-9]+}}, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret half [[FMA]]			// COMMONIR: ret half [[FMA]]
	float16_t test_vfmah_laneq_f16(float16_t a, float16_t b, float16x8_t c) {			float16_t test_vfmah_laneq_f16(float16_t a, float16_t b, float16x8_t c) {
	return vfmah_laneq_f16(a, b, c, 7);			return vfmah_laneq_f16(a, b, c, 7);
	}			}

	// COMMON-LABEL: test_vfms_lane_f16			// COMMON-LABEL: test_vfms_lane_f16
	// COMMONIR: [[SUB:%.*]] = fneg <4 x half> %b			// COMMONIR: [[SUB:%.*]] = fneg <4 x half> %b
	// COMMONIR: [[TMP0:%.*]] = bitcast <4 x half> %a to <8 x i8>			// COMMONIR: [[TMP0:%.*]] = bitcast <4 x half> %a to <8 x i8>
	// COMMONIR: [[TMP1:%.*]] = bitcast <4 x half> [[SUB]] to <8 x i8>			// COMMONIR: [[TMP1:%.*]] = bitcast <4 x half> [[SUB]] to <8 x i8>
	// COMMONIR: [[TMP2:%.*]] = bitcast <4 x half> %c to <8 x i8>			// COMMONIR: [[TMP2:%.*]] = bitcast <4 x half> %c to <8 x i8>
	// COMMONIR: [[TMP3:%.*]] = bitcast <8 x i8> [[TMP2]] to <4 x half>			// COMMONIR: [[TMP3:%.*]] = bitcast <8 x i8> [[TMP2]] to <4 x half>
	// COMMONIR: [[LANE:%.*]] = shufflevector <4 x half> [[TMP3]], <4 x half> [[TMP3]], <4 x i32> <i32 3, i32 3, i32 3, i32 3>			// COMMONIR: [[LANE:%.*]] = shufflevector <4 x half> [[TMP3]], <4 x half> [[TMP3]], <4 x i32> <i32 3, i32 3, i32 3, i32 3>
	// COMMONIR: [[TMP4:%.*]] = bitcast <8 x i8> [[TMP1]] to <4 x half>			// COMMONIR: [[TMP4:%.*]] = bitcast <8 x i8> [[TMP1]] to <4 x half>
	// COMMONIR: [[TMP5:%.*]] = bitcast <8 x i8> [[TMP0]] to <4 x half>			// COMMONIR: [[TMP5:%.*]] = bitcast <8 x i8> [[TMP0]] to <4 x half>
	// UNCONSTRAINED: [[FMA:%.*]] = call <4 x half> @llvm.fma.v4f16(<4 x half> [[TMP4]], <4 x half> [[LANE]], <4 x half> [[TMP5]])			// UNCONSTRAINED: [[FMA:%.*]] = call <4 x half> @llvm.fma.v4f16(<4 x half> [[TMP4]], <4 x half> [[LANE]], <4 x half> [[TMP5]])
	// CONSTRAINED: [[FMA:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> [[TMP4]], <4 x half> [[LANE]], <4 x half> [[TMP5]], metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMA:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> [[TMP4]], <4 x half> [[LANE]], <4 x half> [[TMP5]], metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmls v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmls v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret <4 x half> [[FMA]]			// COMMONIR: ret <4 x half> [[FMA]]
	float16x4_t test_vfms_lane_f16(float16x4_t a, float16x4_t b, float16x4_t c) {			float16x4_t test_vfms_lane_f16(float16x4_t a, float16x4_t b, float16x4_t c) {
	return vfms_lane_f16(a, b, c, 3);			return vfms_lane_f16(a, b, c, 3);
	}			}

	// COMMON-LABEL: test_vfmsq_lane_f16			// COMMON-LABEL: test_vfmsq_lane_f16
	// COMMONIR: [[SUB:%.*]] = fneg <8 x half> %b			// COMMONIR: [[SUB:%.*]] = fneg <8 x half> %b
	// COMMONIR: [[TMP0:%.*]] = bitcast <8 x half> %a to <16 x i8>			// COMMONIR: [[TMP0:%.*]] = bitcast <8 x half> %a to <16 x i8>
	// COMMONIR: [[TMP1:%.*]] = bitcast <8 x half> [[SUB]] to <16 x i8>			// COMMONIR: [[TMP1:%.*]] = bitcast <8 x half> [[SUB]] to <16 x i8>
	// COMMONIR: [[TMP2:%.*]] = bitcast <4 x half> %c to <8 x i8>			// COMMONIR: [[TMP2:%.*]] = bitcast <4 x half> %c to <8 x i8>
	// COMMONIR: [[TMP3:%.*]] = bitcast <8 x i8> [[TMP2]] to <4 x half>			// COMMONIR: [[TMP3:%.*]] = bitcast <8 x i8> [[TMP2]] to <4 x half>
	// COMMONIR: [[LANE:%.*]] = shufflevector <4 x half> [[TMP3]], <4 x half> [[TMP3]], <8 x i32> <i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3>			// COMMONIR: [[LANE:%.*]] = shufflevector <4 x half> [[TMP3]], <4 x half> [[TMP3]], <8 x i32> <i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3>
	// COMMONIR: [[TMP4:%.*]] = bitcast <16 x i8> [[TMP1]] to <8 x half>			// COMMONIR: [[TMP4:%.*]] = bitcast <16 x i8> [[TMP1]] to <8 x half>
	// COMMONIR: [[TMP5:%.*]] = bitcast <16 x i8> [[TMP0]] to <8 x half>			// COMMONIR: [[TMP5:%.*]] = bitcast <16 x i8> [[TMP0]] to <8 x half>
	// UNCONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> [[TMP4]], <8 x half> [[LANE]], <8 x half> [[TMP5]])			// UNCONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> [[TMP4]], <8 x half> [[LANE]], <8 x half> [[TMP5]])
	// CONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> [[TMP4]], <8 x half> [[LANE]], <8 x half> [[TMP5]], metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> [[TMP4]], <8 x half> [[LANE]], <8 x half> [[TMP5]], metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmls v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmls v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret <8 x half> [[FMLA]]			// COMMONIR: ret <8 x half> [[FMLA]]
	float16x8_t test_vfmsq_lane_f16(float16x8_t a, float16x8_t b, float16x4_t c) {			float16x8_t test_vfmsq_lane_f16(float16x8_t a, float16x8_t b, float16x4_t c) {
	return vfmsq_lane_f16(a, b, c, 3);			return vfmsq_lane_f16(a, b, c, 3);
	}			}

	// COMMON-LABEL: test_vfms_laneq_f16			// COMMON-LABEL: test_vfms_laneq_f16
	// COMMONIR: [[SUB:%.*]] = fneg <4 x half> %b			// COMMONIR: [[SUB:%.*]] = fneg <4 x half> %b
	// CHECK-ASM-NOT: fneg			// CHECK-ASM-NOT: fneg
	// COMMONIR: [[TMP0:%.*]] = bitcast <4 x half> %a to <8 x i8>			// COMMONIR: [[TMP0:%.*]] = bitcast <4 x half> %a to <8 x i8>
	// COMMONIR: [[TMP1:%.*]] = bitcast <4 x half> [[SUB]] to <8 x i8>			// COMMONIR: [[TMP1:%.*]] = bitcast <4 x half> [[SUB]] to <8 x i8>
	// COMMONIR: [[TMP2:%.*]] = bitcast <8 x half> %c to <16 x i8>			// COMMONIR: [[TMP2:%.*]] = bitcast <8 x half> %c to <16 x i8>
	// COMMONIR: [[TMP3:%.*]] = bitcast <8 x i8> [[TMP0]] to <4 x half>			// COMMONIR: [[TMP3:%.*]] = bitcast <8 x i8> [[TMP0]] to <4 x half>
	// COMMONIR: [[TMP4:%.*]] = bitcast <8 x i8> [[TMP1]] to <4 x half>			// COMMONIR: [[TMP4:%.*]] = bitcast <8 x i8> [[TMP1]] to <4 x half>
	// COMMONIR: [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to <8 x half>			// COMMONIR: [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to <8 x half>
	// COMMONIR: [[LANE:%.*]] = shufflevector <8 x half> [[TMP5]], <8 x half> [[TMP5]], <4 x i32> <i32 7, i32 7, i32 7, i32 7>			// COMMONIR: [[LANE:%.*]] = shufflevector <8 x half> [[TMP5]], <8 x half> [[TMP5]], <4 x i32> <i32 7, i32 7, i32 7, i32 7>
	// UNCONSTRAINED: [[FMLA:%.*]] = call <4 x half> @llvm.fma.v4f16(<4 x half> [[LANE]], <4 x half> [[TMP4]], <4 x half> [[TMP3]])			// UNCONSTRAINED: [[FMLA:%.*]] = call <4 x half> @llvm.fma.v4f16(<4 x half> [[LANE]], <4 x half> [[TMP4]], <4 x half> [[TMP3]])
	// CONSTRAINED: [[FMLA:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> [[LANE]], <4 x half> [[TMP4]], <4 x half> [[TMP3]], metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMLA:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> [[LANE]], <4 x half> [[TMP4]], <4 x half> [[TMP3]], metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmls v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmls v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret <4 x half> [[FMLA]]			// COMMONIR: ret <4 x half> [[FMLA]]
	float16x4_t test_vfms_laneq_f16(float16x4_t a, float16x4_t b, float16x8_t c) {			float16x4_t test_vfms_laneq_f16(float16x4_t a, float16x4_t b, float16x8_t c) {
	return vfms_laneq_f16(a, b, c, 7);			return vfms_laneq_f16(a, b, c, 7);
	}			}

	// COMMON-LABEL: test_vfmsq_laneq_f16			// COMMON-LABEL: test_vfmsq_laneq_f16
	// COMMONIR: [[SUB:%.*]] = fneg <8 x half> %b			// COMMONIR: [[SUB:%.*]] = fneg <8 x half> %b
	// CHECK-ASM-NOT: fneg			// CHECK-ASM-NOT: fneg
	// COMMONIR: [[TMP0:%.*]] = bitcast <8 x half> %a to <16 x i8>			// COMMONIR: [[TMP0:%.*]] = bitcast <8 x half> %a to <16 x i8>
	// COMMONIR: [[TMP1:%.*]] = bitcast <8 x half> [[SUB]] to <16 x i8>			// COMMONIR: [[TMP1:%.*]] = bitcast <8 x half> [[SUB]] to <16 x i8>
	// COMMONIR: [[TMP2:%.*]] = bitcast <8 x half> %c to <16 x i8>			// COMMONIR: [[TMP2:%.*]] = bitcast <8 x half> %c to <16 x i8>
	// COMMONIR: [[TMP3:%.*]] = bitcast <16 x i8> [[TMP0]] to <8 x half>			// COMMONIR: [[TMP3:%.*]] = bitcast <16 x i8> [[TMP0]] to <8 x half>
	// COMMONIR: [[TMP4:%.*]] = bitcast <16 x i8> [[TMP1]] to <8 x half>			// COMMONIR: [[TMP4:%.*]] = bitcast <16 x i8> [[TMP1]] to <8 x half>
	// COMMONIR: [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to <8 x half>			// COMMONIR: [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to <8 x half>
	// COMMONIR: [[LANE:%.*]] = shufflevector <8 x half> [[TMP5]], <8 x half> [[TMP5]], <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>			// COMMONIR: [[LANE:%.*]] = shufflevector <8 x half> [[TMP5]], <8 x half> [[TMP5]], <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>
	// UNCONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> [[LANE]], <8 x half> [[TMP4]], <8 x half> [[TMP3]])			// UNCONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> [[LANE]], <8 x half> [[TMP4]], <8 x half> [[TMP3]])
	// CONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> [[LANE]], <8 x half> [[TMP4]], <8 x half> [[TMP3]], metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMLA:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> [[LANE]], <8 x half> [[TMP4]], <8 x half> [[TMP3]], metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmls v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmls v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret <8 x half> [[FMLA]]			// COMMONIR: ret <8 x half> [[FMLA]]
	float16x8_t test_vfmsq_laneq_f16(float16x8_t a, float16x8_t b, float16x8_t c) {			float16x8_t test_vfmsq_laneq_f16(float16x8_t a, float16x8_t b, float16x8_t c) {
	return vfmsq_laneq_f16(a, b, c, 7);			return vfmsq_laneq_f16(a, b, c, 7);
	}			}

	// COMMON-LABEL: test_vfms_n_f16			// COMMON-LABEL: test_vfms_n_f16
	// COMMONIR: [[SUB:%.*]] = fneg <4 x half> %b			// COMMONIR: [[SUB:%.*]] = fneg <4 x half> %b
	// COMMONIR: [[TMP0:%.*]] = insertelement <4 x half> undef, half %c, i32 0			// COMMONIR: [[TMP0:%.*]] = insertelement <4 x half> undef, half %c, i32 0
	// COMMONIR: [[TMP1:%.*]] = insertelement <4 x half> [[TMP0]], half %c, i32 1			// COMMONIR: [[TMP1:%.*]] = insertelement <4 x half> [[TMP0]], half %c, i32 1
	// COMMONIR: [[TMP2:%.*]] = insertelement <4 x half> [[TMP1]], half %c, i32 2			// COMMONIR: [[TMP2:%.*]] = insertelement <4 x half> [[TMP1]], half %c, i32 2
	// COMMONIR: [[TMP3:%.*]] = insertelement <4 x half> [[TMP2]], half %c, i32 3			// COMMONIR: [[TMP3:%.*]] = insertelement <4 x half> [[TMP2]], half %c, i32 3
	// UNCONSTRAINED: [[FMA:%.*]] = call <4 x half> @llvm.fma.v4f16(<4 x half> [[SUB]], <4 x half> [[TMP3]], <4 x half> %a)			// UNCONSTRAINED: [[FMA:%.*]] = call <4 x half> @llvm.fma.v4f16(<4 x half> [[SUB]], <4 x half> [[TMP3]], <4 x half> %a)
	// CONSTRAINED: [[FMA:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> [[SUB]], <4 x half> [[TMP3]], <4 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMA:%.*]] = call <4 x half> @llvm.experimental.constrained.fma.v4f16(<4 x half> [[SUB]], <4 x half> [[TMP3]], <4 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmls v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmls v{{[0-9]+}}.4h, v{{[0-9]+}}.4h, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret <4 x half> [[FMA]]			// COMMONIR: ret <4 x half> [[FMA]]
	float16x4_t test_vfms_n_f16(float16x4_t a, float16x4_t b, float16_t c) {			float16x4_t test_vfms_n_f16(float16x4_t a, float16x4_t b, float16_t c) {
	return vfms_n_f16(a, b, c);			return vfms_n_f16(a, b, c);
	}			}

	// COMMON-LABEL: test_vfmsq_n_f16			// COMMON-LABEL: test_vfmsq_n_f16
	// COMMONIR: [[SUB:%.*]] = fneg <8 x half> %b			// COMMONIR: [[SUB:%.*]] = fneg <8 x half> %b
	// COMMONIR: [[TMP0:%.*]] = insertelement <8 x half> undef, half %c, i32 0			// COMMONIR: [[TMP0:%.*]] = insertelement <8 x half> undef, half %c, i32 0
	// COMMONIR: [[TMP1:%.*]] = insertelement <8 x half> [[TMP0]], half %c, i32 1			// COMMONIR: [[TMP1:%.*]] = insertelement <8 x half> [[TMP0]], half %c, i32 1
	// COMMONIR: [[TMP2:%.*]] = insertelement <8 x half> [[TMP1]], half %c, i32 2			// COMMONIR: [[TMP2:%.*]] = insertelement <8 x half> [[TMP1]], half %c, i32 2
	// COMMONIR: [[TMP3:%.*]] = insertelement <8 x half> [[TMP2]], half %c, i32 3			// COMMONIR: [[TMP3:%.*]] = insertelement <8 x half> [[TMP2]], half %c, i32 3
	// COMMONIR: [[TMP4:%.*]] = insertelement <8 x half> [[TMP3]], half %c, i32 4			// COMMONIR: [[TMP4:%.*]] = insertelement <8 x half> [[TMP3]], half %c, i32 4
	// COMMONIR: [[TMP5:%.*]] = insertelement <8 x half> [[TMP4]], half %c, i32 5			// COMMONIR: [[TMP5:%.*]] = insertelement <8 x half> [[TMP4]], half %c, i32 5
	// COMMONIR: [[TMP6:%.*]] = insertelement <8 x half> [[TMP5]], half %c, i32 6			// COMMONIR: [[TMP6:%.*]] = insertelement <8 x half> [[TMP5]], half %c, i32 6
	// COMMONIR: [[TMP7:%.*]] = insertelement <8 x half> [[TMP6]], half %c, i32 7			// COMMONIR: [[TMP7:%.*]] = insertelement <8 x half> [[TMP6]], half %c, i32 7
	// UNCONSTRAINED: [[FMA:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> [[SUB]], <8 x half> [[TMP7]], <8 x half> %a)			// UNCONSTRAINED: [[FMA:%.*]] = call <8 x half> @llvm.fma.v8f16(<8 x half> [[SUB]], <8 x half> [[TMP7]], <8 x half> %a)
	// CONSTRAINED: [[FMA:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> [[SUB]], <8 x half> [[TMP7]], <8 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMA:%.*]] = call <8 x half> @llvm.experimental.constrained.fma.v8f16(<8 x half> [[SUB]], <8 x half> [[TMP7]], <8 x half> %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmls v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmls v{{[0-9]+}}.8h, v{{[0-9]+}}.8h, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret <8 x half> [[FMA]]			// COMMONIR: ret <8 x half> [[FMA]]
	float16x8_t test_vfmsq_n_f16(float16x8_t a, float16x8_t b, float16_t c) {			float16x8_t test_vfmsq_n_f16(float16x8_t a, float16x8_t b, float16_t c) {
	return vfmsq_n_f16(a, b, c);			return vfmsq_n_f16(a, b, c);
	}			}

	// COMMON-LABEL: test_vfmsh_lane_f16			// COMMON-LABEL: test_vfmsh_lane_f16
	// UNCONSTRAINED: [[TMP0:%.*]] = fpext half %b to float			// UNCONSTRAINED: [[TMP0:%.*]] = fpext half %b to float
	// CONSTRAINED: [[TMP0:%.*]] = call float @llvm.experimental.constrained.fpext.f32.f16(half %b, metadata !"fpexcept.strict")			// CONSTRAINED: [[TMP0:%.*]] = call float @llvm.experimental.constrained.fpext.f32.f16(half %b, metadata !"fpexcept.strict")
	// CHECK-ASM: fcvt s{{[0-9]+}}, h{{[0-9]+}}			// CHECK-ASM: fcvt s{{[0-9]+}}, h{{[0-9]+}}
	// COMMONIR: [[TMP1:%.*]] = fneg float [[TMP0]]			// COMMONIR: [[TMP1:%.*]] = fneg float [[TMP0]]
	// CHECK-ASM: fneg s{{[0-9]+}}, s{{[0-9]+}}			// CHECK-ASM: fneg s{{[0-9]+}}, s{{[0-9]+}}
	// UNCONSTRAINED: [[SUB:%.*]] = fptrunc float [[TMP1]] to half			// UNCONSTRAINED: [[SUB:%.*]] = fptrunc float [[TMP1]] to half
	// CONSTRAINED: [[SUB:%.*]] = call half @llvm.experimental.constrained.fptrunc.f16.f32(float [[TMP1]], metadata !"round.tonearest", metadata !"fpexcept.strict")			// CONSTRAINED: [[SUB:%.*]] = call half @llvm.experimental.constrained.fptrunc.f16.f32(float [[TMP1]], metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fcvt h{{[0-9]+}}, s{{[0-9]+}}			// CHECK-ASM: fcvt h{{[0-9]+}}, s{{[0-9]+}}
	// COMMONIR: [[EXTR:%.*]] = extractelement <4 x half> %c, i32 3			// COMMONIR: [[EXTR:%.*]] = extractelement <4 x half> %c, i32 3
	// UNCONSTRAINED: [[FMA:%.*]] = call half @llvm.fma.f16(half [[SUB]], half [[EXTR]], half %a)			// UNCONSTRAINED: [[FMA:%.*]] = call half @llvm.fma.f16(half [[SUB]], half [[EXTR]], half %a)
	// CONSTRAINED: [[FMA:%.*]] = call half @llvm.experimental.constrained.fma.f16(half [[SUB]], half [[EXTR]], half %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMA:%.*]] = call half @llvm.experimental.constrained.fma.f16(half [[SUB]], half [[EXTR]], half %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmla h{{[0-9]+}}, h{{[0-9]+}}, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmla h{{[0-9]+}}, h{{[0-9]+}}, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret half [[FMA]]			// COMMONIR: ret half [[FMA]]
	float16_t test_vfmsh_lane_f16(float16_t a, float16_t b, float16x4_t c) {			float16_t test_vfmsh_lane_f16(float16_t a, float16_t b, float16x4_t c) {
	return vfmsh_lane_f16(a, b, c, 3);			return vfmsh_lane_f16(a, b, c, 3);
	}			}

	// COMMON-LABEL: test_vfmsh_laneq_f16			// COMMON-LABEL: test_vfmsh_laneq_f16
	// UNCONSTRAINED: [[TMP0:%.*]] = fpext half %b to float			// UNCONSTRAINED: [[TMP0:%.*]] = fpext half %b to float
	// CONSTRAINED: [[TMP0:%.*]] = call float @llvm.experimental.constrained.fpext.f32.f16(half %b, metadata !"fpexcept.strict")			// CONSTRAINED: [[TMP0:%.*]] = call float @llvm.experimental.constrained.fpext.f32.f16(half %b, metadata !"fpexcept.strict")
	// CHECK-ASM: fcvt s{{[0-9]+}}, h{{[0-9]+}}			// CHECK-ASM: fcvt s{{[0-9]+}}, h{{[0-9]+}}
	// COMMONIR: [[TMP1:%.*]] = fneg float [[TMP0]]			// COMMONIR: [[TMP1:%.*]] = fneg float [[TMP0]]
	// CHECK-ASM: fneg s{{[0-9]+}}, s{{[0-9]+}}			// CHECK-ASM: fneg s{{[0-9]+}}, s{{[0-9]+}}
	// UNCONSTRAINED: [[SUB:%.*]] = fptrunc float [[TMP1]] to half			// UNCONSTRAINED: [[SUB:%.*]] = fptrunc float [[TMP1]] to half
	// CONSTRAINED: [[SUB:%.*]] = call half @llvm.experimental.constrained.fptrunc.f16.f32(float [[TMP1]], metadata !"round.tonearest", metadata !"fpexcept.strict")			// CONSTRAINED: [[SUB:%.*]] = call half @llvm.experimental.constrained.fptrunc.f16.f32(float [[TMP1]], metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fcvt h{{[0-9]+}}, s{{[0-9]+}}			// CHECK-ASM: fcvt h{{[0-9]+}}, s{{[0-9]+}}
	// COMMONIR: [[EXTR:%.*]] = extractelement <8 x half> %c, i32 7			// COMMONIR: [[EXTR:%.*]] = extractelement <8 x half> %c, i32 7
	// UNCONSTRAINED: [[FMA:%.*]] = call half @llvm.fma.f16(half [[SUB]], half [[EXTR]], half %a)			// UNCONSTRAINED: [[FMA:%.*]] = call half @llvm.fma.f16(half [[SUB]], half [[EXTR]], half %a)
	// CONSTRAINED: [[FMA:%.*]] = call half @llvm.experimental.constrained.fma.f16(half [[SUB]], half [[EXTR]], half %a, metadata !"round.tonearest", metadata !"fpexcept.maytrap")			// CONSTRAINED: [[FMA:%.*]] = call half @llvm.experimental.constrained.fma.f16(half [[SUB]], half [[EXTR]], half %a, metadata !"round.tonearest", metadata !"fpexcept.strict")
	// CHECK-ASM: fmla h{{[0-9]+}}, h{{[0-9]+}}, v{{[0-9]+}}.h[{{[0-9]+}}]			// CHECK-ASM: fmla h{{[0-9]+}}, h{{[0-9]+}}, v{{[0-9]+}}.h[{{[0-9]+}}]
	// COMMONIR: ret half [[FMA]]			// COMMONIR: ret half [[FMA]]
	float16_t test_vfmsh_laneq_f16(float16_t a, float16_t b, float16x8_t c) {			float16_t test_vfmsh_laneq_f16(float16_t a, float16_t b, float16x8_t c) {
	return vfmsh_laneq_f16(a, b, c, 7);			return vfmsh_laneq_f16(a, b, c, 7);
	}			}

clang/test/CodeGen/complex-strictfp.c

// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py		// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// RUN: %clang_cc1 -no-opaque-pointers -triple x86_64-unknown-unknown -ffp-exception-behavior=maytrap -emit-llvm -o - %s | FileCheck %s		// RUN: %clang_cc1 -no-opaque-pointers -triple x86_64-unknown-unknown -ffp-exception-behavior=maytrap -emit-llvm -o - %s | FileCheck %s


// Test that the constrained intrinsics are picking up the exception		// Test that the constrained intrinsics are picking up the exception
// metadata from the AST instead of the global default from the command line.		// metadata from the AST instead of the global default from the command line.
// Include rounding metadata in the testing.		// Include rounding metadata in the testing.
// FIXME: All cases of "fpexcept.maytrap" in this test are wrong.
// FIXME: All cases of "round.tonearest" in this test are wrong.

#pragma float_control(except, on)		#pragma float_control(except, on)
#pragma STDC FENV_ROUND FE_UPWARD		#pragma STDC FENV_ROUND FE_UPWARD

_Complex double g1, g2;		_Complex double g1, g2;
_Complex float cf;		_Complex float cf;
double D;		double D;

[51 unchanged lines collapsed]	void test3c(void) {
cf /= g1;		cf /= g1;
}		}

// CHECK-LABEL: @test3d(		// CHECK-LABEL: @test3d(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[G1_REAL:%.*]] = load double, double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 0), align 8		// CHECK-NEXT: [[G1_REAL:%.*]] = load double, double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 0), align 8
// CHECK-NEXT: [[G1_IMAG:%.*]] = load double, double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 1), align 8		// CHECK-NEXT: [[G1_IMAG:%.*]] = load double, double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 1), align 8
// CHECK-NEXT: [[TMP0:%.*]] = load double, double* @D, align 8		// CHECK-NEXT: [[TMP0:%.*]] = load double, double* @D, align 8
// CHECK-NEXT: [[ADD_R:%.*]] = call double @llvm.experimental.constrained.fadd.f64(double [[G1_REAL]], double [[TMP0]], metadata !"round.tonearest", metadata !"fpexcept.maytrap") #[[ATTR2]]		// CHECK-NEXT: [[ADD_R:%.*]] = call double @llvm.experimental.constrained.fadd.f64(double [[G1_REAL]], double [[TMP0]], metadata !"round.upward", metadata !"fpexcept.strict") #[[ATTR2]]
// CHECK-NEXT: store double [[ADD_R]], double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 0), align 8		// CHECK-NEXT: store double [[ADD_R]], double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 0), align 8
// CHECK-NEXT: store double [[G1_IMAG]], double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 1), align 8		// CHECK-NEXT: store double [[G1_IMAG]], double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 1), align 8
// CHECK-NEXT: ret void		// CHECK-NEXT: ret void
//		//
void test3d(void) {		void test3d(void) {
g1 = g1 + D;		g1 = g1 + D;
}		}

// CHECK-LABEL: @test3e(		// CHECK-LABEL: @test3e(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = load double, double* @D, align 8		// CHECK-NEXT: [[TMP0:%.*]] = load double, double* @D, align 8
// CHECK-NEXT: [[G1_REAL:%.*]] = load double, double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 0), align 8		// CHECK-NEXT: [[G1_REAL:%.*]] = load double, double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 0), align 8
// CHECK-NEXT: [[G1_IMAG:%.*]] = load double, double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 1), align 8		// CHECK-NEXT: [[G1_IMAG:%.*]] = load double, double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 1), align 8
// CHECK-NEXT: [[ADD_R:%.*]] = call double @llvm.experimental.constrained.fadd.f64(double [[TMP0]], double [[G1_REAL]], metadata !"round.tonearest", metadata !"fpexcept.maytrap") #[[ATTR2]]		// CHECK-NEXT: [[ADD_R:%.*]] = call double @llvm.experimental.constrained.fadd.f64(double [[TMP0]], double [[G1_REAL]], metadata !"round.upward", metadata !"fpexcept.strict") #[[ATTR2]]
// CHECK-NEXT: store double [[ADD_R]], double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 0), align 8		// CHECK-NEXT: store double [[ADD_R]], double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 0), align 8
// CHECK-NEXT: store double [[G1_IMAG]], double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 1), align 8		// CHECK-NEXT: store double [[G1_IMAG]], double* getelementptr inbounds ({ double, double }, { double, double }* @g1, i32 0, i32 1), align 8
// CHECK-NEXT: ret void		// CHECK-NEXT: ret void
//		//
void test3e(void) {		void test3e(void) {
g1 = D + g1;		g1 = D + g1;
}		}


clang/test/CodeGen/pragma-fenv_access.cpp

This file was added.

				// RUN: %clang_cc1 -S -triple x86_64-linux-gnu -emit-llvm %s -o - | FileCheck %s

				template <typename T>
				T templ_01(T a, T b) {
				return a * b;
				}

				#pragma STDC FENV_ACCESS ON

				float func_02(float a, float b) {
				return 1.0f + templ_01<float>(a, b);
				}

				// CHECK-LABEL: define {{.*}} @_Z7func_02ff
				// CHECK: call noundef float @_Z8templ_01IfET_S0_S0_
				// CHECK: call float @llvm.experimental.constrained.fadd.f32({{.*}}, metadata !"round.dynamic", metadata !"fpexcept.strict")

				// CHECK-LABEL: define {{.*}} @_Z8templ_01IfET_S0_S0_
				// CHECK: fmul float


				#pragma STDC FENV_ACCESS OFF
				#pragma STDC FENV_ROUND FE_UPWARD

				template <typename T>
				T templ_03(T a, T b) {
				return a * b;
				}

				#pragma STDC FENV_ACCESS ON

				float func_04(float a, float b) {
				return 1.0f + templ_03<float>(a, b);
				}


aaron.ballman (Unsubmitted, Not Done):

There are some extra test cases I'd like to see coverage for because there are some interesting edge cases to consider.

  template <typename Ty>
  float func1(Ty) {
    float f1 = 1.0f, f2 = 3.0f;
    return f1 + f2 * 2.0f;
  }

  #pragma float_control(precise, on, push)
  template float func1<int>(int);
  #pragma float_control(pop)

  #pragma float_control(precise, on, push)
  template <typename Ty>
  float func2(Ty) {
    float f1 = 1.0f, f2 = 3.0f;
    return f1 + f2 * 2.0f;
  }
  #pragma float_control(pop)

  template float func2<int>(int);

  void bar() {
    func1(1.1);
    func2(1.1);
  }

This gets especially interesting when you think about delayed template instantiation, as happens by default on Windows targets. Consider this code with the driver-level `-ffast-math` flag enabled (not the cc1 option, which is different). I think that `func1<int>` SHOULD be precise, because the explicit instantiation is, while `func1<double>` SHOULD NOT be precise, because the definition is not. `func2<int>` SHOULD NOT be precise, because the explicit instantiation is not, while `func2<double>` SHOULD be precise, because the definition is. Partial specializations are a similar situation where the primary template and its related code may have different options. WDYT?

sepavloff (Author; Unsubmitted, Done):

Standard FP pragmas are defined only in the C standard, so their interaction with C++-specific features is actually implementation-defined. The cases presented in your example are reasonable solutions with one exception: IMO `func2<int>` should be precise, because its template is precise. It is equivalent to:

  template <typename Ty>
  float func2(Ty) {
  #pragma float_control(precise, on)
    float f1 = 1.0f, f2 = 3.0f;
    return f1 + f2 * 2.0f;
  }

so instantiating it would produce a function with precise operations.

Implementing a correct mechanism for this interaction requires substantial effort and should be done in a separate patch, I think. In particular, we need to invent a way to associate a point of instantiation with the FPOptions at that point, so that delayed instantiation can be done with the correct set of options.

In this patch the change in SemaTemplateInstantiateDecl.cpp prevents a compiler crash. Without it, codegen tries to create a call to a constrained intrinsic in a function that does not have the StrictFP attribute, because the FEnvAccess flag is set at the end of the translation unit in `pragma-fenv_access.cpp`.

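
The idea in the reply above of associating a point of instantiation with the FP options in effect at that point can be pictured with a toy model. This is only a sketch: `ToyFPOptions`, `PendingInstantiation`, and `ToySema` are hypothetical names invented for illustration and do not correspond to Clang's actual classes or to anything in this patch. It also takes no position on whether the definition's or the instantiation point's options should win; it only shows the bookkeeping.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a set of FP options; only two flags are modeled.
struct ToyFPOptions {
  bool Precise = false;
  bool FEnvAccess = false;
};

// Hypothetical record of one instantiation that has been requested
// but will only be performed later (for example at end of TU).
struct PendingInstantiation {
  std::string Name;
  ToyFPOptions Options; // snapshot taken when the request was made
};

// Hypothetical, heavily simplified model of the semantic analyzer state.
class ToySema {
public:
  ToyFPOptions Current; // options at the current point in the source

  // Snapshot the options together with the request.
  void requestInstantiation(std::string Name) {
    Pending.push_back({std::move(Name), Current});
  }

  // Delayed instantiation: each entry carries the options captured at
  // request time, regardless of what Current holds by now.
  std::vector<PendingInstantiation> instantiateAtEndOfTU() {
    std::vector<PendingInstantiation> Done;
    Done.swap(Pending);
    return Done;
  }

private:
  std::vector<PendingInstantiation> Pending;
};
```

In this model, requesting `func1<int>` while `Precise` is on and then switching it off (as a `float_control(pop)` would) still yields an entry whose snapshot has `Precise` set, which is the behavior a delayed instantiation would need in order to see the options from the point of instantiation.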
				// CHECK-LABEL: define {{.*}} @_Z7func_04ff
				// CHECK: call noundef float @_Z8templ_03IfET_S0_S0_
				// CHECK: call float @llvm.experimental.constrained.fadd.f32({{.*}}, metadata !"round.upward", metadata !"fpexcept.strict")

				// CHECK-LABEL: define {{.*}} @_Z8templ_03IfET_S0_S0_
				// CHECK: call float @llvm.experimental.constrained.fmul.f32({{.*}}, metadata !"round.upward", metadata !"fpexcept.ignore")